[torquedev] pbs_server segfault in req_delete.c
Joshua Bernstein
jbernstein at penguincomputing.com
Mon Dec 29 12:03:36 MST 2008
Garrick Staples wrote:
> On Wed, Dec 24, 2008 at 12:23:33AM -0500, Michel Béland alleged:
>> Garrick Staples wrote:
>>
>>> While segfaults need to always be fixed, you are using qdel -p
>>> incorrectly. It should only be used if a running job will not exit
>>> because its allocated nodes are unreachable.
>>>
>>> qdel -p is a very bad thing to do. It is intentionally breaking
>>> pbs_server's idea of what is going on.
>>>
>>> Since you are using qdel -p when you have a running pbs_mom that has the
>>> job, you are bound to have bad things happen.
>> That is probably true, I will trust you on that, but how to get rid of a
>> job that is stuck in the E state for days?
>
> The solution to that will be on the node. Do you know why it is stuck? What
> is it waiting for?
I know why the job is stuck. I made it get stuck in that phase on
purpose in order to recreate an issue a few customers reported to me.

-Joshua Bernstein
Software Engineer
Penguin Computing