[torqueusers] job not deleted due to 15033

Eva Hocks hocks at sdsc.edu
Thu Mar 20 16:37:26 MDT 2014





I haven't found out why the qdel didn't work in the first place, but I was
able to delete the job via:


# qsig 1277239        (sends SIGTERM, the default signal, to the job)


Maybe you should use qsig in your glean script. The log then shows the task
being killed:

Job;1277239.tscc-mgr.local;kill_task: killing pid 40350 task 1 with sig 15
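
For a cleanup script, the fallback could look roughly like the sketch below:
try qdel first and only send the signal with qsig if qdel fails. The function
name delete_job and its messages are made up for illustration, and the sketch
assumes qdel exits non-zero when the server rejects the delete request (as
with the 15033 reject), which I have not verified for this case.

```shell
# Hypothetical helper (not part of TORQUE): delete a job with qdel and
# fall back to qsig if qdel reports a failure.
# Assumption: qdel returns a non-zero exit status when the server rejects
# the DeleteJob request.
delete_job() {
    jobid="$1"
    if qdel "$jobid"; then
        echo "qdel succeeded for $jobid"
    elif qsig -s SIGTERM "$jobid"; then
        # qsig sends SIGTERM by default; -s just makes the signal explicit
        echo "qdel failed; sent SIGTERM to $jobid via qsig"
    else
        echo "could not delete or signal $jobid" >&2
        return 1
    fi
}

# Example call (needs the TORQUE client commands on PATH):
#   delete_job 1277239
```

If the job ignores SIGTERM, a script could retry with "qsig -s SIGKILL", but
that skips any cleanup the job would otherwise do.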


-Eva


On Thu, 20 Mar 2014, Eva Hocks wrote:

>
>
>
>
> the qdel for a job is stuck in the server due to No free connections????
>
>
> 03/20/2014 14:00:05;0008;PBS_Server.43460;Job;1277239.tscc-mgr.local;Job deleted at request of root at tscc-mgr.local
> 03/20/2014 14:00:05;0008;PBS_Server.43460;Req;decode_DIS_replySvr;failed to get PROT_TYPE: 0, (rc: 11)
> 03/20/2014 14:00:05;0008;PBS_Server.43460;Job;send_request_to_remote_server;DIS_reply_read failed: 11
> 03/20/2014 14:00:05;0080;PBS_Server.43460;Req;req_reject;Reject reply code=15033(Batch protocol error), aux=0, type=DeleteJob, from root at tscc-mgr.local
> 03/20/2014 14:00:05;0008;PBS_Server.43460;Job;1277239.tscc-mgr.local;Job sent signal SIGTERM on delete
>
>
> Any help appreciated on how to fix that issue and how to delete the job
>
> Thanks
> Eva
>
>


