[torqueusers] Errors setting up torque

Carolyn Sawyer csawyer at berkeley.edu
Fri Nov 30 11:55:02 MST 2012


I am trying to set up Torque on a Fedora 15 machine (one node, 48 cores) 
for my research group. I installed it using "yum torque". The version 
installed appears to be 3.0.3. I've tried two methods: following the 
instructions in README.Fedora, and following online instructions; any 
time I try to run a qmgr instruction, I get a "qmgr: cannot connect to 
server  (errno=110) Connection timed out" error.

README.Fedora method: Following the instructions, I created a munge key 
with "/usr/sbin/create-munge-key". My hostname, using "/bin/hostname 
--long", is slacker.berkeley.edu. I edited /etc/torque/server_name to 
have "slacker.berkeley.edu" as its full contents. I edited 
/etc/torque/mom/config to have "$pbsserver slacker.berkeley.edu" as its 
only contents. I ran "/usr/sbin/pbs_server -D -t create", hit "y" to 
continue, received the "pbs_server is up" message, and hit Ctrl-C (per 
instructions). I ran "service pbs_server start" and got the message 
"Starting pbs_server (via systemctl):  [OK]". I then tried "qmgr -c "s s 
scheduling=true""; it sat for a while and then spat out "Connection 
timed out/qmgr: cannot connect to server  (errno=110) Connection timed 
out". Any qmgr command does the same thing, and if I skip ahead to 
"service pbs_sched start" it fails.

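For reference, the README.Fedora steps above amount to roughly the 
following shell sketch. The TORQUE_ETC and SERVER_HOST variables and the 
two function names are only for illustration and are not part of Torque; 
the commands themselves are the ones quoted above.

```shell
#!/bin/sh
# Recap of the README.Fedora steps described above (run as root).
# TORQUE_ETC and SERVER_HOST are illustrative names, not Torque settings.
TORQUE_ETC="${TORQUE_ETC:-/etc/torque}"
SERVER_HOST="${SERVER_HOST:-slacker.berkeley.edu}"

# Write the two config files with the server's long hostname.
write_torque_config() {
    mkdir -p "$TORQUE_ETC/mom"
    # server_name holds only the hostname
    printf '%s\n' "$SERVER_HOST" > "$TORQUE_ETC/server_name"
    # mom/config holds only the $pbsserver line (literal "$pbsserver")
    printf '$pbsserver %s\n' "$SERVER_HOST" > "$TORQUE_ETC/mom/config"
}

# Create the munge key, initialise the server database, start the service.
init_and_start_server() {
    /usr/sbin/create-munge-key
    # Answer "y", wait for "pbs_server is up", then press Ctrl-C
    /usr/sbin/pbs_server -D -t create
    service pbs_server start
    # This is the first command that times out with errno=110
    qmgr -c "s s scheduling=true"
}
```

Nothing runs until the functions are called. In my case the munge key 
was created before the config files were edited, and the timeout shows 
up at the qmgr line.
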
I also tried following online instructions and running torque.setup, 
which is in /usr/share/doc/torque-3.0.3. This demands a username, so I 
tried "./torque.setup root" since I am running as root. It responds with 
"PBS_Server slacker.berkeley.edu: Create mode and server database 
exists, do you wish to continue y/(n)?" so I hit y. It sits for a while 
and then spits out "Connection timed out/qmgr: cannot connect to server  
(errno=110) Connection timed out/ERROR: cannot set TORQUE admins", then 
sits a while longer, gives another timeout error, and drops me back to 
the terminal. At this point ps shows that "pbs_server -t create" is running, 
but it never finishes.

Any thoughts? The fixes I saw mentioned in the list archive for this 
error message all seemed to require qmgr commands, which don't work for me.

Carolyn Sawyer
