[torqueusers] jobs not clearing on crashed node
Andrus, Brian Contractor
bdandrus at nps.edu
Fri Feb 1 08:47:47 MST 2013
Running torque 4.1.4 here (along with moab 7.2.0)
Issue: a node that had several elements of an array job running on it crashes.
It then reboots, gets re-provisioned, and comes back up.
pbsnodes still claims there are several jobs running on it.
If I run (on the node) pbs_mom purge, nothing changes.
If I restart pbs_server (which I hate doing since it resets Time Used on running jobs), nothing changes.
Shouldn't the jobs automatically either get restarted or cleared if a node reboots? I'm pretty sure torque used to do that...
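
For anyone trying to reproduce the symptom, here is a minimal sketch of pulling the job IDs that pbsnodes still attributes to a given node. It assumes the usual pbsnodes text layout (node name at column 0, indented "key = value" lines, and a "jobs = slot/jobid, ..." line). The function name stale_jobs, the node names, and the sample output below are hypothetical, made up for illustration rather than taken from the affected cluster.

```python
def stale_jobs(pbsnodes_text, node):
    """Return the job IDs listed on the 'jobs =' line of `node`.

    Assumes pbsnodes-style text: an unindented node name followed by
    indented 'key = value' attribute lines.
    """
    jobs = []
    current = None
    for line in pbsnodes_text.splitlines():
        if line and not line[0].isspace():
            # An unindented line starts a new node record.
            current = line.strip()
        elif current == node:
            key, sep, value = line.strip().partition(" = ")
            if sep and key == "jobs":
                for entry in value.split(","):
                    entry = entry.strip()
                    if entry:
                        # Entries are assumed to look like 'slot/jobid';
                        # keep only the job ID part.
                        jobs.append(entry.split("/", 1)[-1])
    # Drop duplicates (one job may hold several slots), keeping order.
    return list(dict.fromkeys(jobs))


# Hypothetical sample output, not captured from a real system.
SAMPLE = """\
node01
     state = job-exclusive
     np = 2
     jobs = 0/1234[1].headnode, 1/1234[2].headnode

node02
     state = free
     np = 2
"""

if __name__ == "__main__":
    for job_id in stale_jobs(SAMPLE, "node01"):
        print(job_id)
```

In practice the text would come from running pbsnodes against the rebooted node. The resulting list can then be compared with what qstat reports for the same jobs.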
Naval Postgraduate School