[torqueusers] pbs_server failed

Lydia Heck lydia.heck at durham.ac.uk
Wed Mar 30 07:32:17 MDT 2011


The pbs_server failed for no apparent reason. Although it was configured, "High 
availability" did not work, as I had forgotten to add the second server.

However, there is still the question of why it failed.

When the system was finally brought back to life, everything seemed to work fine,
except that jobs with multi-CPU requirements are not being scheduled 
now.

If I stop both servers, the pbs_mom daemons will die and take all the jobs with 
them.

Any idea what I could do short of restarting all the daemons and losing all the 
jobs?

Best wishes,
Lydia


