[torqueusers] Question to Torque community regarding display of completed jobs in qstat

Craig Tierney - NOAA Affiliate craig.tierney at noaa.gov
Mon Dec 3 12:27:49 MST 2012


On Mon, Dec 3, 2012 at 12:12 PM, Gus Correa <gus at ldeo.columbia.edu> wrote:

> On 12/02/2012 01:24 PM, Craig Tierney - NOAA Affiliate wrote:
> > Hello all,
> >
> > I have a question for Torque users regarding the display of completed
> > jobs in qstat.  Do others find the listing of completed jobs by default
> > via qstat makes finding things in the output much more difficult and
> > completely unnecessary?  Having the completed jobs in qstat can
> > significantly slow down qstat if you have a lot (thousands) of completed
> > jobs which is another hassle.
> >
> > I am asking this because I need to be able to get error codes from
> > completed jobs (for minutes to hours after completion).  To do this,
> > they have to still be in the queue.  This function is very important,
> > but not to anyone who runs qstat by hand.  Grid Engine had a way to get
> > completed jobs, but only when asked for.
> >
> > Thanks,
> > Craig
> >
>
> Hi Craig
>
> Well, we keep the completed jobs on the queue for several hours,
> qmgr -c 'set server keep_completed = ...'
> Users here never complained, and seem to like
> to see queued, running, and completed jobs.
> The old/default time of 1200 seconds was too short.
> However, our clusters and the number of users are small,
> nothing like Zeus, so the clutter caused by keeping completed
> jobs on the queue for hours is not large.
> Would 'qstat -u username' or some other filtering
> help the annoyed users?
>
>
Gus,

We currently have keep_completed set to only 600 seconds, and that is too
short.  We are running about 40k-50k jobs a day.  While using -u username
would help, it still seems unnecessary.  The jobs are not evenly
distributed between users.  Some will have hundreds in a single workflow (which
would be over a few hours).
I don't mind retraining users (e.g., to use -u), but the first thing I would
do as a user would be to write a wrapper to hide them, so I figure a better
solution is in order.
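
For illustration, the kind of wrapper a user might write could look like the
sketch below.  It is not the patch mentioned in this thread.  It assumes the
default qstat listing (columns: Job id, Name, User, Time Use, S, Queue), so
that the job state is the second-to-last whitespace-separated field.  The
names filter_qstat, run_qstat, and show_completed are hypothetical.

```python
import subprocess


def run_qstat():
    """Return the text printed by a plain `qstat` call (assumes qstat is on PATH)."""
    result = subprocess.run(["qstat"], capture_output=True, text=True, check=True)
    return result.stdout


def filter_qstat(text, show_completed=False):
    """Drop rows whose state column is 'C' from default-format qstat output.

    Assumption: each job row ends with "<state> <queue>", so the state is
    the second-to-last whitespace-separated field.
    """
    if show_completed:
        return text
    kept = []
    for line in text.splitlines():
        fields = line.split()
        # Header and separator rows never have "C" in this position,
        # so they pass through unchanged.
        if len(fields) >= 6 and fields[-2] == "C":
            continue
        kept.append(line)
    return "\n".join(kept)
```

A user would then call print(filter_qstat(run_qstat())) in place of qstat.
Note that this only hides the clutter: qstat still fetches every completed
job from the server, so it does nothing for the slowdown.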

But breaking existing functionality is not usually a good idea, which is why
I was looking for opinions.  I already have a small patch that hides
completed jobs by default and adds -c to show them in case you care.
But if the solution isn't generally acceptable, I don't want to be
patching my code all the time.

Craig



> Gus Correa
>
>