[torquedev] clearing exec_host on job requeue

Garrick Staples garrick at clusterresources.com
Wed Feb 14 20:25:18 MST 2007


On Wed, Feb 14, 2007 at 09:37:29PM -0500, Glen Beane alleged:
> On 2/14/07, Garrick Staples <garrick at clusterresources.com> wrote:
> >CRI has a trouble ticket open about a job's exec_host not being cleared
> >when it is requeued.  Apparently this annoys some sysadmins and breaks
> >some 3rd party things like clumon.
> >
> >I think I just found a bug that pre-dates TORQUE and is fixed with a
> >single character patch!  I need others to look at this and tell me I'm
> >not crazy.
> >
> >I've already committed it to trunk, but this is trivial for 2.1 as well.
> >
> >$ svn diff -r1242:1243 src/server/req_jobobit.c
> >Index: src/server/req_jobobit.c
> >===================================================================
> >--- src/server/req_jobobit.c    (revision 1242)
> >+++ src/server/req_jobobit.c    (revision 1243)
> >@@ -1419,7 +1419,7 @@
> >
> >       /* Now re-queue the job */
> >
> >-      if ((pjob->ji_qs.ji_svrflags | JOB_SVFLG_HOTSTART) == 0)
> 
> That's definitely a bug.  Whatever is inside that if block is never
> going to be executed, since (pjob->ji_qs.ji_svrflags |
> JOB_SVFLG_HOTSTART) can never be zero.  Good eye :)  Hard to believe
> how long that has been in there!

Assuming everyone is happy with the results, does this go into
2.1-fixes?  I have a feeling this is going to have some wider-reaching
effects and don't want to break expected behaviour in 2.1.


