[torqueusers] Errors setting up torque

Carolyn Sawyer csawyer at berkeley.edu
Fri Nov 30 11:55:02 MST 2012


Hi,

I am trying to set up Torque on a Fedora 15 machine (one node, 48 cores) 
for my research group. I installed it using "yum torque". The version 
installed appears to be 3.0.3. I've tried two methods: following the 
instructions in README.Fedora, and following online instructions. Either 
way, any time I run a qmgr command, I get a "qmgr: cannot connect to 
server  (errno=110) Connection timed out" error.

README.Fedora method: Following the instructions, I created a munge key 
with "/usr/sbin/create-munge-key". My hostname, using "/bin/hostname 
--long", is slacker.berkeley.edu. I edited /etc/torque/server_name to 
have "slacker.berkeley.edu" as its full contents. I edited 
/etc/torque/mom/config to have "$pbsserver slacker.berkeley.edu" as its 
only contents. I ran "/usr/sbin/pbs_server -D -t create", hit "y" to 
continue, received the "pbs_server is up" message, and hit Ctrl-C (per 
instructions). I ran "service pbs_server start" and got the message 
"Starting pbs_server (via systemctl):  [OK]". I then tried "qmgr -c "s s 
scheduling=true""; it sat for a while and then printed "Connection 
timed out/qmgr: cannot connect to server  (errno=110) Connection timed 
out". Any qmgr command does the same thing, and if I skip ahead to 
"service pbs_sched start" it fails.
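
In case the exact file contents matter, the two config edits described 
above can be sketched as a small shell function. The function name 
write_torque_config and its two arguments are made up for illustration 
(I edited the files by hand); the file names and contents are the ones 
given above.

```shell
# Hypothetical helper: writes the two Torque config files described above.
#   $1 = config directory (on this machine, /etc/torque)
#   $2 = fully qualified hostname (output of "/bin/hostname --long")
write_torque_config() {
    conf_dir=$1
    host=$2

    # The mom config lives in a subdirectory of the config directory.
    mkdir -p "$conf_dir/mom"

    # server_name holds the hostname and nothing else.
    printf '%s\n' "$host" > "$conf_dir/server_name"

    # mom/config holds a single $pbsserver line; the single quotes keep
    # the shell from expanding "$pbsserver" as a variable.
    printf '$pbsserver %s\n' "$host" > "$conf_dir/mom/config"
}
```

Run as root, this would be called as 
write_torque_config /etc/torque "$(/bin/hostname --long)", which should 
leave both files with the contents listed above.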

I also tried following online instructions and running torque.setup, 
which is in /usr/share/doc/torque-3.0.3. This demands a username, so I 
tried "./torque.setup root" since I am running as root. It responds with 
"PBS_Server slacker.berkeley.edu: Create mode and server database 
exists, do you wish to continue y/(n)?" so I hit y. It sits for a while 
and then spits out "Connection timed out/qmgr: cannot connect to server  
(errno=110) Connection timed out/ERROR: cannot set TORQUE admins", then 
sits a while longer, gives another timeout error, and drops me back to 
the terminal. At this point ps shows that "pbs_server -t create" is 
running, but it never finishes.

Any thoughts? The fixes I saw mentioned in the list archive for this 
error message all seemed to require qmgr commands, which don't work for 
me...

Thanks,
Carolyn Sawyer

