[torqueusers] Re: kill_delay

Roy Dragseth roy.dragseth at cc.uit.no
Tue Feb 27 14:09:12 MST 2007


On Tuesday 27 February 2007, Garrick Staples wrote:
> Adding yet another mom config parameter makes me feel icky.

I agree with that.

>
> Can we just get rid of the mom's SIGKILL since we know we get one from
> server if necessary?

Probably a good idea, but who ultimately initiates the SIGKILL: the
scheduler or pbs_server?  I set kill_delay on the server to 1 and
mom_kill_delay to 3600, yet my interactive 30-second test job ran for
about ten minutes:

[royd at newton ~]$ date ; qsub -lwalltime=30,nodes=1 -I; date
Tue Feb 27 21:30:24 CET 2007
qsub: waiting for job 18.newton.cc.uit.no to start
qsub: job 18.newton.cc.uit.no ready

[royd at newton ~]$ =>> PBS: job killed: walltime 62 exceeded limit 30

qsub: job 18.newton.cc.uit.no completed
Tue Feb 27 21:41:03 CET 2007
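
For reference, the gap between the two date outputs above can be checked
with a short Python sketch. The helper name elapsed_seconds is only
illustrative, and the figure covers the whole qsub session (including any
wait before the job started), not just the time after the walltime limit
was hit.

```python
from datetime import datetime

def elapsed_seconds(start: str, end: str) -> int:
    """Seconds between two HH:MM:SS timestamps on the same day."""
    fmt = "%H:%M:%S"
    delta = datetime.strptime(end, fmt) - datetime.strptime(start, fmt)
    return int(delta.total_seconds())

# Timestamps copied from the two `date` outputs in the transcript.
total = elapsed_seconds("21:30:24", "21:41:03")
print(f"{total} s = {total // 60} min {total % 60} s")
```

That works out to a bit over ten minutes for a job with a 30-second
walltime limit.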

The logs indicate that Maui takes care of initiating the SIGKILL.  Is the
time it takes to send it configurable somewhere?

r.

-- 

  The Computer Center, University of Tromsø, N-9037 TROMSØ Norway.
              phone:+47 77 64 41 07, fax:+47 77 64 41 00
     Roy Dragseth, High Performance Computing System Administrator
         Direct call: +47 77 64 62 56. email: royd at cc.uit.no

