[torqueusers] Torque not deleting job

Garrick Staples garrick at clusterresources.com
Mon Apr 23 15:05:02 MDT 2007


On Sun, Apr 22, 2007 at 07:59:39PM +1000, Chris Samuel alleged:
> On Sun, 22 Apr 2007, Chris Engel wrote:
> 
> > Hi, I work with Adam on this cluster and thought I would provide some
> > additional info
> 
> Hello Chris,
> 
> > On 4/21/07, Chris Samuel <csamuel at vpac.org> wrote:
> > > Interesting - anything in the pbs_mom logs on the node about that job ?
> >
> > These nodes are diskless booted, so no state information is retained
> > on the node after a reboot
> 
> That's OK, I'm more curious about what the logs say about that job (or any
> other errors) after the node is rebooted.
> 
> Actually, the fact that no state information is retained on the node means
> the problem should not be related to the pbs_mom; rather, the pbs_server
> isn't getting the message that the job no longer exists.

If the node keeps no state information across a reboot, then pbs_mom has
no record of the job afterwards and pbs_server will never get such a
message.  This is likely the cause of the problem.
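
To make the reasoning concrete, here is a toy model in Python.  It is not
TORQUE code: the class names, the job id and the "report" step are all
hypothetical, and it assumes the server only drops a job once the node
tells it the job is gone.

```python
# Toy model only; NOT TORQUE code. All names here are hypothetical.

class Mom:
    """Stands in for pbs_mom on a compute node."""

    def __init__(self, persistent):
        self.persistent = persistent  # False models a diskless node
        self.known_jobs = set()

    def start_job(self, job_id):
        self.known_jobs.add(job_id)

    def reboot(self):
        # The job's processes die in either case; a diskless node
        # additionally forgets that the job ever existed.
        if not self.persistent:
            self.known_jobs.clear()

    def report_dead_jobs(self):
        # After a reboot the node can only report jobs it still knows about.
        dead = set(self.known_jobs)
        self.known_jobs.clear()
        return dead


class Server:
    """Stands in for pbs_server."""

    def __init__(self):
        self.jobs = {}

    def submit(self, job_id, mom):
        self.jobs[job_id] = "running"
        mom.start_job(job_id)

    def handle_report(self, dead_jobs):
        # Assumption: a job is only removed when the node reports it gone.
        for job_id in dead_jobs:
            self.jobs.pop(job_id, None)


def simulate(persistent):
    """Run one job, reboot the node, and return what the server still holds."""
    mom, server = Mom(persistent), Server()
    server.submit("42.headnode", mom)
    mom.reboot()
    server.handle_report(mom.report_dead_jobs())
    return server.jobs
```

With `simulate(True)` the server ends up with no jobs; with
`simulate(False)` the job stays listed as running, because the node never
sends a report for a job it no longer knows about.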


