[torqueusers] Can i control if the jobs dies or not??

Leandro leotavaneiro at gmail.com
Thu Aug 11 05:22:34 MDT 2005


Thank you for the information. I will test it and reply with any news.

Is this patch for the latest snapshot?

Regards,

-- 
Leandro Tavares Carneiro
Analista de Suporte Linux/Unix 

2005/8/10, Garrick Staples <garrick at usc.edu>:
> 
> On Wed, Aug 10, 2005 at 06:13:32PM -0700, Garrick Staples alleged:
> > On Wed, Aug 10, 2005 at 08:23:24AM -0300, Leandro alleged:
> > > behavior of PBS/Torque is to kill the job when a node dies. Can I change this
> > > behavior? If there's no way to do that with some kind of configuration, can
> > > someone point me to where in the code I can work on this?
> >
> > At this point in time, the MOM on the execution node (MS) will always kill the
> > job if a sister MOM isn't replying.
> >
> > MS sends IM_POLL_JOB messages to sisters. When a sister isn't replying, MS
> > closes the connection with mom_comm.c:im_eof() which calls
> > mom_comm.c:node_bailout(). With outstanding IM_POLL_JOB messages,
> > node_bailout() sets "pjob->ji_nodekill = np->hn_node;" and
> > mom_main.c:job_over_limit() kills the job if "pjob->ji_nodekill !=
> > TM_ERROR_NODE".
> 
> I haven't tried this yet, but this should do the trick:
> 
> --- src/resmom/mom_comm.c_orig 2005-07-26 23:24:55.000000000 -0700
> +++ src/resmom/mom_comm.c 2005-08-10 19:25:45.000000000 -0700
> @@ -1101,8 +1101,6 @@ void node_bailout(
> 
> log_err(-1,id,log_buffer);
> 
> - pjob->ji_nodekill = np->hn_node;
> -
> break;
> 
> case IM_GET_TID:
> 
> 
> --
> Garrick Staples, Linux/HPCC Administrator
> University of Southern California
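
For readers following the thread, the kill path described above can be modeled in a few lines of C. This is only a rough sketch under stated assumptions, not the actual TORQUE source: the sketch_* names, the reduced structs, the "patched" flag, and the sentinel value standing in for TM_ERROR_NODE are all hypothetical. Only the field names ji_nodekill and hn_node and the comparison against the sentinel come from the message.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed stand-in for TM_ERROR_NODE; the real value lives in the TORQUE headers. */
#define SKETCH_ERROR_NODE (-1)

/* Hypothetical, reduced versions of the job and node records. */
struct sketch_job {
    int ji_nodekill;   /* id of the node that triggered a kill, or the sentinel */
};

struct sketch_node {
    int hn_node;       /* id of the sister node */
};

/*
 * Models the IM_POLL_JOB branch of node_bailout() as described in the thread.
 * With patched == 0 the failed sister is recorded on the job; with
 * patched != 0 the assignment is skipped, which is what the proposed diff does.
 */
void sketch_node_bailout(struct sketch_job *pjob,
                         const struct sketch_node *np,
                         int patched)
{
    assert(pjob != NULL && np != NULL);
    if (!patched)
        pjob->ji_nodekill = np->hn_node;
}

/*
 * Models only the check quoted from job_over_limit(): the job is killed
 * once ji_nodekill no longer holds the sentinel. Returns 1 for "kill".
 */
int sketch_job_over_limit(const struct sketch_job *pjob)
{
    assert(pjob != NULL);
    return pjob->ji_nodekill != SKETCH_ERROR_NODE;
}
```

Starting from a job whose ji_nodekill holds the sentinel, calling sketch_node_bailout() with patched set to 1 leaves sketch_job_over_limit() returning 0, while the unpatched call makes it return 1. Whether the real MOM has other code paths that end the job after a sister is lost is not covered by this model.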