[torqueusers] SIGTERM and pbsdsh

Garrick Staples garrick at usc.edu
Thu Nov 29 14:43:42 MST 2007

On Tue, Nov 27, 2007 at 09:52:36AM -0600, Tim Freeman alleged:
> I am starting the same executable on N nodes using pbsdsh -n.  During a qdel,
> SIGTERM signals do not look like they are propagating to each process, only a
> SIGKILL from the initial looks of it (there's a SIGTERM handler in the
> executable that is not getting invoked).
> The application I'm running greatly benefits from getting to run a cleanup
> routine if cancelled.  Is there an option to pbsdsh or some technique to use
> where I can make this happen? 

There are two common issues here.  The first is "kill_delay", the queue
attribute that specifies the time between the initial TERM and the later KILL.
The default is too short for most cleanup routines.
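
As a rough sketch, kill_delay can be raised with qmgr by someone with TORQUE
manager privileges.  The queue name "batch" and the 60-second value below are
assumptions for illustration, not recommendations; pick values that fit your
site and how long your cleanup takes.

```shell
#!/bin/sh
# Sketch: raise kill_delay on one queue.
# QUEUE and DELAY are assumed example values, not site defaults.
QUEUE="${QUEUE:-batch}"
DELAY="${DELAY:-60}"

# Build the qmgr directive so it can be inspected before it is applied.
kill_delay_cmd() {
    printf 'set queue %s kill_delay = %s\n' "$1" "$2"
}

if command -v qmgr >/dev/null 2>&1; then
    # Needs manager privileges on the pbs_server.
    qmgr -c "$(kill_delay_cmd "$QUEUE" "$DELAY")"
else
    echo "qmgr not found; would run: qmgr -c \"$(kill_delay_cmd "$QUEUE" "$DELAY")\""
fi
```

Running "qmgr -c 'list queue batch'" afterwards should show the new value.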

The second is that your top-level shell is receiving the TERM signal and
exiting, which ends the job before your processes get a chance to run their
handlers.  You need to ignore TERM in your batch script.
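
A minimal sketch of such a batch script follows.  The application path in APP
is hypothetical, and the pbsdsh call is shown without options; add whatever
options you already use.  The "command -v" guard is only there so the sketch
is harmless when run outside a TORQUE job.

```shell
#!/bin/sh
# Ignore SIGTERM in the top-level batch shell so it does not exit on the
# first signal from qdel; it stays up until the later KILL.
trap '' TERM

# Hypothetical path to the executable that has its own SIGTERM handler.
APP="${APP:-$HOME/bin/my_app}"

if command -v pbsdsh >/dev/null 2>&1; then
    # Start the application on the job's nodes; the shell waits here.
    pbsdsh "$APP"
fi
```

With the shell ignoring TERM, the time available for cleanup is bounded by
kill_delay, so the two settings work together.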
