[torquedev] pbs_server segfault in req_delete.c

Garrick Staples garrick at usc.edu
Wed Dec 24 00:20:23 MST 2008


On Wed, Dec 24, 2008 at 12:23:33AM -0500, Michel Béland alleged:
> Garrick Staples wrote:
> 
> >While segfaults need to always be fixed, you are using qdel -p incorrectly.
> >It should only be used if a running job will not exit because its allocated
> >nodes are unreachable.
> >
> >qdel -p is a very bad thing to do.  It is intentionally breaking
> >pbs_server's idea of what is going on.
> >
> >Since you are using qdel -p when you have a running pbs_mom that has the
> >job, you are bound to have bad things happen.
> 
> That is probably true, and I will trust you on that, but how do I get rid
> of a job that has been stuck in the E state for days?

The solution to that will be on the node.  Do you know why it is stuck?  What
is it waiting for?

> 
> Maybe we need to allow neatly deleting the job with a simple qdel when a 
> job is in this state.

If a simple qdel worked, you wouldn't need a qdel -p :)

-- 
Garrick Staples, GNU/Linux HPCC SysAdmin
University of Southern California

See the Dishonor Roll at http://www.californiansagainsthate.com/
