[torqueusers] Torque not deleting job
Garrick Staples
garrick at clusterresources.com
Fri Apr 20 09:13:28 MDT 2007
On Thu, Apr 19, 2007 at 11:41:59AM -0500, Adam Emerich alleged:
>
> I am seeing a case in which torque does not delete an interactive job if
> the node on which the job is running goes down. Here is what I am doing:
>
> qsub -I -l nodes=n01-01-06:ppn=1 -> successfully returns a prompt
> on the machine requested
>
> Then the node (n01-01-06) is rebooted. After the reboot, "top" on n01-01-06
> does not show any jobs being run by my userid. However, "showq" shows the
> following on the torque server:
Is pbs_mom being started with the -r option at boot?
Can you check in server_log to see if an epilogue came and was rejected?
Does 'qsig -s 0 1131' cause the job to exit?
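
For reference, the three checks above could be scripted roughly as follows. This is only a sketch: the helper names are made up for illustration, the log location (/var/spool/torque/server_logs/YYYYMMDD) is a common default rather than something confirmed for this site, and matching on the string "epilog" is a guess at how the server words its log message.

```shell
# Hypothetical helper: succeed if a pbs_mom command line contains -r.
# $1 = the full command line, e.g. as printed by "ps -eo args".
mom_has_r_flag() {
    case " $1 " in
        *" -r "*) return 0 ;;
        *)        return 1 ;;
    esac
}

# Hypothetical helper: print log lines that mention both the job id and
# "epilog" (case-insensitive; the exact wording is an assumption).
# $1 = job id, $2 = path to a server_log file.
epilogue_lines() {
    grep -F -- "$1" "$2" | grep -i 'epilog'
}

# Example usage (the log path is an assumed default, adjust as needed):
#   on the compute node:
#     mom_has_r_flag "$(ps -eo args | grep '[p]bs_mom')" && echo "-r is set"
#   on the server:
#     epilogue_lines 1131 "/var/spool/torque/server_logs/$(date +%Y%m%d)"
#     qsig -s 0 1131
```

If the job was submitted on an earlier day, the relevant server_log file will carry that day's date instead of today's.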