[torqueusers] Torque not deleting job

Garrick Staples garrick at clusterresources.com
Fri Apr 20 09:13:28 MDT 2007


On Thu, Apr 19, 2007 at 11:41:59AM -0500, Adam Emerich alleged:
> 
> I am seeing a case in which torque does not delete an interactive job if
> the node on which the job is running goes down.  Here is what I am doing:
> 
>    qsub -I -l nodes=n01-01-06:ppn=1
>       -> successfully returns a prompt on the machine requested
> 
> Then the node (n01-01-06) is rebooted.  After the reboot, "top" on n01-01-06
> does not show any jobs being run by my userid.  However, "showq" shows the
> following on the torque server:

Is pbs_mom being started with the -r option at boot?

Can you check in server_log to see if an epilogue came and was rejected?

Does 'qsig -s 0 1131' cause the job to exit?
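
The three checks above could be scripted roughly as follows. This is only a
sketch: the log directory is an assumed default that varies by install, the
'epilog' search string is a guess at what the log line contains, and
has_r_flag is a hypothetical helper, not part of TORQUE. Step 1 belongs on
the compute node, steps 2 and 3 on the server.

```shell
#!/bin/sh
# Sketch only. JOBID comes from the thread; LOGDIR is an ASSUMED default path.
JOBID=${JOBID:-1131}
LOGDIR=${LOGDIR:-/var/spool/torque/server_logs}

# Hypothetical helper: succeeds if the command line has a standalone -r argument.
has_r_flag() {
    case " $1 " in
        *" -r "*) return 0 ;;
        *)        return 1 ;;
    esac
}

# 1. On the compute node: how was pbs_mom started?
mom_cmd=$(ps -eo args 2>/dev/null | grep '[p]bs_mom' | head -n 1)
if [ -z "$mom_cmd" ]; then
    echo "pbs_mom is not running on this host"
elif has_r_flag "$mom_cmd"; then
    echo "pbs_mom was started with -r: $mom_cmd"
else
    echo "pbs_mom was started without -r: $mom_cmd"
fi

# 2. On the server: look for epilogue-related log lines that mention the job.
if [ -d "$LOGDIR" ]; then
    grep -h "$JOBID" "$LOGDIR"/* 2>/dev/null | grep -i 'epilog' \
        || echo "no epilogue lines found for job $JOBID"
fi

# 3. On the server: send signal 0 to the job, as asked above.
if command -v qsig >/dev/null 2>&1; then
    qsig -s 0 "$JOBID" || echo "qsig returned non-zero for job $JOBID"
fi
```

Run it with JOBID and LOGDIR set in the environment if they differ from the
values assumed here.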


