[torqueusers] Phantom job ?

Garrick Staples garrick at usc.edu
Thu Aug 30 10:02:40 MDT 2007


On Thu, Aug 30, 2007 at 11:18:57AM +0200, Albert Shih alleged:
> Hi (again)
> 
> When I use 
> 
> 	qmgr
> 	list node one_of_my_node
> 
> I get:
> 
> 	netload=1579477903065,state=free,jobs=921.name_of_my_server,
> 
> What does the jobs field mean?
> 
> I ask because sometimes (like now) when I list all jobs with qstat -a,
> I can't find a job that qmgr's list node shows. Here, job 921
> doesn't exist in the output of qstat -a.
> 
> If it's a problem, how can I fix it?

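This sometimes happens when a sister node is unreachable, usually because it is
bogged down swapping, at the moment the job exit command comes from the MS
(mother superior). The node then keeps the job on its list after the server
has dropped it.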

Running 'set server mom_job_sync = True' in qmgr should make the server keep
better track of that.
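
To spot such leftover entries, the jobs= field of a node can be cross-checked
against the job IDs that qstat reports. The following is only a rough sketch
in Python, not part of TORQUE: the function names are made up for
illustration, the status line format is assumed to match the one quoted
above, multiple jobs are assumed to be separated by whitespace, and the list
of known job IDs is assumed to have been collected from qstat beforehand.

```python
import re


def jobs_in_status(status_line):
    """Return the job IDs found in the jobs= field of a node status line.

    Assumes comma-separated key=value pairs, as in the line quoted above.
    """
    match = re.search(r"(?:^|,)jobs=([^,]*)", status_line)
    if not match:
        return []
    # Assumption: several jobs, if present, are separated by whitespace.
    return match.group(1).split()


def phantom_jobs(status_line, known_job_ids):
    """Return job IDs the node reports that are missing from known_job_ids."""
    # Compare on the sequence number before the first dot only, in case
    # the job listing shortens the server part of the ID.
    known = {job_id.split(".", 1)[0] for job_id in known_job_ids}
    return [
        job_id
        for job_id in jobs_in_status(status_line)
        if job_id.split(".", 1)[0] not in known
    ]


status = "netload=1579477903065,state=free,jobs=921.name_of_my_server,"
# Hypothetical job IDs taken from a qstat listing that lacks job 921.
print(phantom_jobs(status, ["919.name_of_my_server", "920.name_of_my_server"]))
```

Any ID this returns is one the node still lists but the server no longer
knows about, which is the situation described above.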
