[torqueusers] kill_delay in torque 2.5.9

Chris Berthiaume chrisbee at uw.edu
Tue Mar 20 15:17:51 MDT 2012


Hello,

I'm trying to get an extended kill_delay working with torque 2.5.9, but so
far I haven't been able to get more than a 5-second delay between SIGTERM
and SIGKILL.  From reading various mailing list entries, it looks like this
issue has come up before and that a longer kill_delay should be possible
with 2.5.9.  Here's how I've configured pbs_server and pbs_mom.

$ qmgr -c 'print queue gross'
#
# Create queues and set their attributes.
#
#
# Create and define queue gross
#
create queue gross
set queue gross queue_type = Execution
set queue gross resources_default.neednodes = gross
set queue gross kill_delay = 30
set queue gross enabled = True
set queue gross started = True

$ cat /opt/torque/mom_priv/config
$ignwalltime false
$kill_delay true

To test these settings I run a submit script that traps SIGTERM and, in
that trap, prints the date every second.  Then I issue a qdel for the job.
Only 5 seconds' worth of date output from the SIGTERM trap function
appears, even though the queue's kill_delay is 30.  Is there anything more
I need to do to enable kill_delay?  I gather it's pbs_mom that is
overriding the server-side kill_delay and sending SIGKILL to the job after
5 seconds, but the undocumented mom config option "$kill_delay true" should
prevent this.  Here's my submit script.

#!/bin/bash
function termtrap() {
    while true; do
        date
        sleep 1
    done
}

trap termtrap SIGTERM
sleep 600
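
The same SIGTERM-then-SIGKILL sequence can be reproduced without Torque,
which may help separate the script's behavior from pbs_mom's.  The sketch
below is only a local stand-in and says nothing about what pbs_mom does
internally.  DELAY is a hypothetical substitute for kill_delay, and the job
body is a variant of the script above that backgrounds its sleep and waits
on it, so the trap fires as soon as SIGTERM arrives.

```shell
#!/bin/bash
# Local stand-in for the qdel sequence: SIGTERM, pause, then SIGKILL.
# DELAY is a hypothetical substitute for the kill_delay under test.
DELAY=2
log=$(mktemp)

# Child job: on SIGTERM, print one timestamp per second until killed.
bash -c '
    termtrap() {
        kill "$sleeper" 2>/dev/null   # stop the placeholder workload
        while true; do
            date +%s
            sleep 1
        done
    }
    trap termtrap SIGTERM
    # bash defers a trap while a foreground command is running, so the
    # sleep is backgrounded and waited on; wait returns on SIGTERM.
    sleep 600 &
    sleeper=$!
    wait "$sleeper"
' > "$log" 2>/dev/null &
job=$!

sleep 1                      # give the child time to install its trap
kill -TERM "$job"
sleep "$DELAY"               # the window the trap handler gets to run in
kill -KILL "$job"
wait "$job" 2>/dev/null || true

# Each line in the log is roughly one second of handler run time.
ticks=$(wc -l < "$log")
echo "ticks between SIGTERM and SIGKILL: $ticks"
rm -f "$log"
```

If the tick count here tracks DELAY while the count under qdel stays at 5,
that would point at the signal timing on the Torque side rather than at the
script.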

Thanks,
Chris