[torqueusers] Job Nanny Poll
dbeer at adaptivecomputing.com
Mon Nov 21 14:01:48 MST 2011
----- Original Message -----
> Hi David,
> > Just a quick poll question - do people use the job delete nanny
> > functionality in TORQUE? If you do, in qmgr you would have the
> > line:
> > set job_nanny = True
> > I'm curious how many people are using it - this seems like very
> > repetitive functionality to me (pbs_mom does pretty much the same
> > thing already) and I personally think job_force_cancel_time is
> > better, but I may be biased.
> We are using it on all our clusters to avoid mail flooding. How does
> pbs_mom do the same thing? I had never heard of job_force_cancel_time
> before. I guess that you want the job to be cancelled before there is
> any mail flooding, don't you?
I was mistaken when I said pbs_mom does the same thing. I was actually thinking of another server parameter that does something similar, kill_delay, and somehow I thought the mom did this by default. We don't recommend kill_delay because it is difficult to set up correctly: most people forget to make the shell catch the signal as well, so their job dies unexpectedly.
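To illustrate the shell-side half of a kill_delay setup, here is a minimal sketch of a job script that traps the signal. It assumes the delete request reaches the job's shell as SIGTERM, with SIGKILL following once the kill_delay window ends. The names scratch, term_seen and on_term are made up for the example, and the script signals itself only to stand in for a real delete request.

```shell
#!/bin/sh
# Sketch only: scratch, term_seen and on_term are hypothetical names.

scratch=$(mktemp -d)   # stand-in for the job's scratch directory
term_seen=0

on_term() {
    # Runs in the job's shell when SIGTERM arrives. In a real job this
    # must finish before the kill_delay window ends (assumed behaviour).
    term_seen=1
    rm -rf "$scratch"
}

# Without this trap the shell exits on SIGTERM and the job ends at once.
trap on_term TERM

# Stand-in for a delete request: send SIGTERM to this shell.
kill -TERM $$

if [ "$term_seen" -eq 1 ]; then
    echo "SIGTERM handled, scratch directory removed"
fi
```

A real job script would normally exit from the handler once cleanup is done; this one keeps running only so the result can be printed.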
My reasoning is that when pbs_server can't delete a job, it is almost always a communication problem, which is why I think job_force_cancel_time is the better fit. However, I recognize that this may well not be the case for everyone.
Direct Line: 801-717-3386 | Fax: 801-717-3738
1712 S East Bay Blvd, Suite 300
Provo, UT 84606