[torqueusers] Job stuck at E state forever
abhig at Princeton.EDU
Tue Mar 31 00:08:22 MDT 2009
Some time ago I had the same problem, and some people suggested that it
is caused by the rcp implementation in PBS, which can be changed.
However, I don't know how to change it to cp or scp, which would
probably solve the issue. If you have any idea how to do that, please
let me know.
Halvor Utby wrote:
> Abhishek Gupta wrote:
>> Hi all,
>> Some jobs reach the E state after they finish running and get
>> stuck there forever. Could someone tell me the reason for this and
>> how to solve the problem?
> Do a "pbsnodes -l" and see if the nodes running the "E jobs" are
> unavailable/down. I would guess they are, and as soon as you have
> started pbs on these nodes, the jobs will disappear from your queue.
> "qdel -p jobnumber" will also purge the job from your queue, but
> should only be used if the node can not be made available again.
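
The check suggested above can be sketched as a small shell script. This is only a rough illustration, not a tested site procedure: the helper names e_state_jobs and down_nodes are made up for this example, and it assumes the default "qstat" layout (job ID in the first column, state in the fifth) and that "pbsnodes -l" prints one "nodename state" pair per line.

```shell
#!/bin/sh
# Sketch only: the helper names below are hypothetical, not part of TORQUE.

# Print the IDs of jobs whose state column is "E", reading default
# `qstat` output on stdin. Assumed columns: id, name, user, time, S, queue.
# Header and separator lines are skipped because they do not start with a digit.
e_state_jobs() {
    awk '$1 ~ /^[0-9]/ && $5 == "E" { print $1 }'
}

# Print node names from `pbsnodes -l` output on stdin.
# Assumed layout: one "nodename state" pair per line.
down_nodes() {
    awk 'NF >= 2 { print $1 }'
}

# Only query the server when the TORQUE client tools are installed.
if command -v qstat >/dev/null 2>&1 && command -v pbsnodes >/dev/null 2>&1; then
    echo "Jobs in E state:"
    qstat | e_state_jobs
    echo "Nodes listed by pbsnodes -l:"
    pbsnodes -l | down_nodes
fi
```

The script only reports; it deliberately does not purge anything. If a listed node really cannot be brought back, "qdel -p jobnumber" from the advice above remains a manual last step.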