[torqueusers] Not Running - PBS Error: Premature end of message
Prakash Velayutham
prakash.velayutham at cchmc.org
Thu Nov 8 11:41:40 MST 2007
Hi Samir,
If all you want to do is remove this job from the system:
1) Log in to the compute node(s) where the job is running and remove
everything (files and folders) whose name contains the job id from
the $PBS_HOME/mom_priv/jobs folder.
2) Also make sure to kill any processes that the job started on those
nodes.
3) On the headnode, do the same under the $PBS_HOME/server_priv/jobs
folder.
4) Restart both the mom (on the relevant nodes) and the server on the
headnode.
After that, the job should be gone and no longer listed.
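Steps 1 and 3 could be sketched roughly as below. This is only an illustration, not a tested recipe: the helper names list_job_files and purge_job_files are made up for this example, the PBS_HOME default is an assumption about a typical install, and it assumes the leftover entries are named with the full job id as a prefix. Killing stray processes (step 2) and restarting the daemons (step 4) are left out because they depend on the site.

```shell
#!/bin/sh
# Assumption: a common TORQUE spool location; override PBS_HOME if yours differs.
PBS_HOME="${PBS_HOME:-/var/spool/torque}"

# list_job_files DIR JOBID
# Print the entries directly inside DIR whose name starts with JOBID.
# Pass the full job id (e.g. 23848.bwp4) so other jobs are not matched.
list_job_files() {
    find "$1" -mindepth 1 -maxdepth 1 -name "$2*" -print
}

# purge_job_files DIR JOBID
# Remove those entries, files and folders alike.
purge_job_files() {
    list_job_files "$1" "$2" | while IFS= read -r path; do
        rm -rf -- "$path"
    done
}

# Example use (commented out so nothing is deleted by accident):
#   on each compute node:  purge_job_files "$PBS_HOME/mom_priv/jobs" 23848.bwp4
#   on the headnode:       purge_job_files "$PBS_HOME/server_priv/jobs" 23848.bwp4
```

It is probably worth running list_job_files first and checking what it prints before calling purge_job_files.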
Prakash
On Nov 7, 2007, at 8:53 PM, Samir Khanal wrote:
> Hi
>
> I submitted a job to PBS using qsub, but the status output below
> says "Not Running - PBS Error: Premature end of message".
>
> I restarted the server, mom and sched, but I still cannot remove it.
>
> I tried qsig -n SIGNULL jobid and qsig -s SIGKILL jobid, but with no
> success.
>
> I even killed pbs_mom on the affected nodes, started it again and
> retried qsig, but the job is still there.
>
> Strangely, when I do qdel 23848.bwp4 the prompt does not return, as if
> it were waiting for some input. Has anyone come across this problem?
>
> This job seems to be stuck there forever.
>
> --------------------------------------------------------------------------------
> Job ID          Username Queue    Jobname    SessID NDS TSK Memory Time  S Time
> --------------- -------- -------- ---------- ------ --- --- ------ ----- - -----
> 23848.bwp4.     skhanal  parallel my_paralle     --   5  --     -- 02:40 R    --
>    node13/0+node12/0+node11/0+node10/0+node09/0
>    Not Running - PBS Error: Premature end of message
> --------------------------------------------------------------------------------
>
>
> Please help, I am stuck.
>
> Samir
>