[torqueusers] SIGTERM and pbsdsh
garrick at usc.edu
Thu Nov 29 14:43:42 MST 2007
On Tue, Nov 27, 2007 at 09:52:36AM -0600, Tim Freeman alleged:
> I am starting the same executable on N nodes using pbsdsh -n. During a qdel,
> SIGTERM signals do not look like they are propagating to each process, only a
> SIGKILL from the initial looks of it (there's a SIGTERM handler in the
> executable that is not getting invoked).
> The application I'm running greatly benefits from getting to run a cleanup
> routine if cancelled. Is there an option to pbsdsh or some technique to use
> where I can make this happen?
There are two common issues here. The first is "kill_delay", the queue attribute
that specifies the time, in seconds, between the initial TERM and the later
KILL. The default is often too short for a cleanup routine to finish.
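As a rough sketch, a TORQUE manager could raise the delay with qmgr. The
queue name "batch" and the 30-second value are assumptions for illustration;
substitute your own queue and a delay that covers your cleanup time.

```shell
# Give jobs in the (assumed) queue "batch" 30 seconds between TERM and KILL.
qmgr -c "set queue batch kill_delay = 30"

# Print the queue's attributes to confirm the new setting.
qmgr -c "list queue batch"
```

These commands need manager privileges on the pbs_server host.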
The second is that your top-level shell receives the TERM signal and exits,
which ends the job before the tasks get a chance to clean up. You need to
ignore TERM in your batch script.
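A minimal sketch of such a batch script follows. The job name, the node
count, and the APP path are hypothetical placeholders, and the executable is
assumed to install its own SIGTERM handler as described in the question.

```shell
#!/bin/sh
#PBS -N cleanup_demo
#PBS -l nodes=4

# Ignore SIGTERM in the top-level shell so the job script keeps running
# when qdel sends TERM; the tasks then have kill_delay seconds to clean up.
trap '' TERM

# APP is a hypothetical path; replace it with the real executable.
APP=${APP:-$HOME/bin/my_app}

# Launch the executable across the job's allocation when pbsdsh is available.
if command -v pbsdsh >/dev/null 2>&1; then
    pbsdsh "$APP"
    echo "pbsdsh exited with status $?"
else
    echo "pbsdsh not found; this script is meant to run under TORQUE" >&2
fi
```

A signal ignored by the shell stays ignored in the processes it starts unless
they install their own handler, so this approach relies on the application
setting up its SIGTERM handler itself.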