[torqueusers] Not Running - PBS Error: Premature end of message

Prakash Velayutham prakash.velayutham at cchmc.org
Thu Nov 8 11:41:40 MST 2007


Hi Samir,

If all you want to do is remove this job from the system,

1) Log in to the compute node(s) where the job is running and remove
everything (files and folders) whose name contains the job ID from the
$PBS_HOME/mom_priv/jobs folder.
2) Also make sure to kill any processes that the job started on those
nodes.
3) On the head node, do the same under the $PBS_HOME/server_priv/jobs
folder.
4) Restart both the mom (on the relevant nodes) and the server on the
head node.

After that, the job should no longer be listed, and it will have been killed.
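
Steps 1 and 3 could be sketched roughly as the shell helper below. This is
only an illustration, not a tested procedure: the function name
purge_job_files, the default PBS_HOME of /var/spool/torque, and matching
files by a job ID prefix are all assumptions, so check them against your
own installation. Without --delete the helper only prints what it would
remove.

```shell
#!/bin/sh
# Sketch only: list (and optionally delete) the files left behind by a stuck job.
# The default PBS_HOME below is an assumption; adjust it for your site.
PBS_HOME="${PBS_HOME:-/var/spool/torque}"

# purge_job_files DIR JOBID [--delete]   (hypothetical helper)
# Prints every entry in DIR whose name starts with JOBID.
# Removes those entries only when --delete is given as the third argument.
purge_job_files() {
    dir=$1
    jobid=$2
    mode=${3:-}
    for entry in "$dir"/"$jobid"*; do
        # An unmatched glob stays literal, so skip anything that does not exist.
        [ -e "$entry" ] || continue
        echo "$entry"
        if [ "$mode" = "--delete" ]; then
            rm -rf -- "$entry"
        fi
    done
}

# Typical use, run as root (job ID taken from the report below):
#   on each compute node:  purge_job_files "$PBS_HOME/mom_priv/jobs" 23848.bwp4
#   on the head node:      purge_job_files "$PBS_HOME/server_priv/jobs" 23848.bwp4
# Review the printed list, then repeat with --delete as the last argument.
# Killing leftover job processes (step 2) and restarting pbs_mom and
# pbs_server (step 4) are still done by hand.
```

Running it once without --delete first is deliberate: a prefix match can
catch more than one job, so it is worth reading the list before removing
anything.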

Prakash

On Nov 7, 2007, at 8:53 PM, Samir Khanal wrote:

> Hi
>
> I submitted a job with qsub on PBS, but the status output below says
> "Not Running - PBS Error: Premature end of message".
>
> I restarted the server, mom and sched, but I still cannot remove the job.
>
> I tried qsig -n SIGNULL jobid , qsig -s SIGKILL jobid, but no success.
>
> I even KILLED the pbs_mom on the defective nodes, started it again  
> and tried the qsig again, but it is still there.
>
> Strangely, when I do qdel 23848.bwp4 the prompt does not return, as if
> waiting for some input. Has anyone come across this problem?
>
> This job seems to be stuck there forever.
>
> ------------------------------------------------------------------------------
> Job ID          Username Queue    Jobname    SessID NDS TSK Memory Time  S Time
> --------------- -------- -------- ---------- ------ --- --- ------ ----- - -----
> 23848.bwp4.     skhanal  parallel my_paralle     --   5  --     -- 02:40 R    --
>
>   node13/0+node12/0+node11/0+node10/0+node09/0
>    Not Running - PBS Error: Premature end of message
> ------------------------------------------------------------------------------
>
>
> Please help, I am stuck.
>
> Samir
