[torquedev] Jobs remain in queue after process completion in
wightman at clusterresources.com
Wed Nov 7 10:57:46 MST 2007
For my part, I fixed the problem I was seeing in my local testing by
changing scan_for_exiting() in src/resmom/catch_child.c.
I changed ObitsAllowed to 1 and all the problems went away. It seems
that any obits besides the first are ignored by the pbs_server and so
the job will never go away without a "qdel -p".
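
To make the suspected failure concrete, here is a toy model of it. This is
only a sketch and not the actual TORQUE code: the function name
jobs_left_in_queue and the "server acts on only the first obit in a packet"
rule are assumptions for illustration, taken from the behaviour described
above. The obits_allowed parameter stands in for ObitsAllowed.

```c
#include <assert.h>

/*
 * Hypothetical model, NOT taken from src/resmom/catch_child.c.
 * Assumption: the mom packs up to obits_allowed obits into each packet,
 * and the server processes only the first obit of every packet,
 * silently dropping the rest.
 *
 * Returns how many exited jobs would still be sitting in the queue
 * after every obit has been sent once.
 */
int jobs_left_in_queue(int exiting_jobs, int obits_allowed)
{
    int sent = 0;   /* obits already packed into some packet */
    int left = 0;   /* obits the modelled server ignored */

    assert(exiting_jobs >= 0);
    assert(obits_allowed >= 1);

    while (sent < exiting_jobs) {
        /* Fill the next packet with as many obits as allowed. */
        int in_packet = exiting_jobs - sent;
        if (in_packet > obits_allowed)
            in_packet = obits_allowed;

        /* Only the first obit is honoured; the others are lost. */
        left += in_packet - 1;
        sent += in_packet;
    }
    return left;
}
```

In this model, jobs_left_in_queue(5, 2) returns 2 while
jobs_left_in_queue(5, 1) returns 0, which matches what I see: with one obit
per packet nothing is left behind.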
On Wed, 2007-11-07 at 09:42 -0800, Garrick Staples wrote:
> On Wed, Nov 07, 2007 at 09:34:01AM -0700, Steve Snelgrove alleged:
> > It has been noticed that jobs remain in the queue after the process has
> > finished. This seems to be associated with recently checked-in code
> > having to do with the packing of multiple obits going to pbs_server.
> > If we force the number of obits in a packet to one, everything seems to be
> > fine.
> > Any suggestions about this would be appreciated.
> I'm not recalling anything that matches this description. What exactly did you
> change to make it work?