[torqueusers] Re: kill_delay
roy.dragseth at cc.uit.no
Tue Feb 27 14:09:12 MST 2007
On Tuesday 27 February 2007, Garrick Staples wrote:
> Adding yet another mom config parameter makes me feel icky.
I can agree on that.
> Can we just get rid of the mom's SIGKILL since we know we get one from
> server if necessary?
Probably a good idea, but who eventually initiates the SIGKILL, the
scheduler or pbs_server? I set kill_delay on the server to 1 and
mom_kill_delay to 3600, and my interactive 30-second test job still ran
for about ten minutes:
[royd at newton ~]$ date ; qsub -lwalltime=30,nodes=1 -I; date
Tue Feb 27 21:30:24 CET 2007
qsub: waiting for job 18.newton.cc.uit.no to start
qsub: job 18.newton.cc.uit.no ready
[royd at newton ~]$ =>> PBS: job killed: walltime 62 exceeded limit 30
qsub: job 18.newton.cc.uit.no completed
Tue Feb 27 21:41:03 CET 2007
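As a rough check on the transcript above, the two date stamps can be subtracted to see how far past the 30-second limit the session ran. This is only a sketch: the variable names are illustrative, and the elapsed time also includes whatever time the job spent waiting to start, so it is an upper bound on the actual run time.

```python
from datetime import datetime

# Timestamps copied from the two `date` outputs in the transcript
# (same day, same time zone, so only the clock time is needed).
FMT = "%H:%M:%S"
start = datetime.strptime("21:30:24", FMT)
end = datetime.strptime("21:41:03", FMT)

# Requested walltime from `qsub -lwalltime=30`, in seconds.
walltime_limit = 30

elapsed = int((end - start).total_seconds())
overrun = elapsed - walltime_limit

print(f"elapsed: {elapsed} s, past the limit: {overrun} s")
```

By this count the session lasted 639 seconds, roughly 609 seconds beyond the requested walltime.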
The logs indicate that maui takes care of initiating the SIGKILL. Is the
time it takes to send it configurable somewhere?
The Computer Center, University of Tromsø, N-9037 TROMSØ Norway.
phone:+47 77 64 41 07, fax:+47 77 64 41 00
Roy Dragseth, High Performance Computing System Administrator
Direct call: +47 77 64 62 56. email: royd at cc.uit.no