[torqueusers] pbs_server failed

Ken Nielson knielson at adaptivecomputing.com
Wed Mar 30 08:47:57 MDT 2011


Tell us more. Which version of TORQUE are you running?

Did you use --with-high-availability when you configured TORQUE?

Do your servers use a shared file system for $TORQUEHOME?


----- Original Message -----
From: "Lydia Heck" <lydia.heck at durham.ac.uk>
To: "Torque Users Mailing List" <torqueusers at supercluster.org>
Sent: Wednesday, March 30, 2011 7:32:17 AM
Subject: [torqueusers] pbs_server failed

The pbs_server failed for no apparent reason. Although configured, "High
availability" did not work, as I had forgotten to add the second server.

However, there is still the question of why it failed.

When the system was finally brought back to life everything seemed to work fine,
with the exception that jobs with multi-cpu requirements are not being scheduled 

If I stop both servers the pbs_mom daemons will die and take all the jobs with 

Any idea what I could do short of restarting all the daemons and losing all the

Best wishes,
