[torqueusers] Re: kill_delay
pw at osc.edu
Tue Feb 27 13:23:04 MST 2007
Roy.Dragseth at cc.uit.no wrote on Tue, 27 Feb 2007 10:50 +0100:
> After some tinkering with the code I've come to the conclusion that the kill
> loop makes a lot of sense for parallel jobs, as you want to give an MPI
> launcher the time to clean up before it is killed with an untrappable signal.
> The loop is only executed on a SIGKILL. The annoying delay should be fixed
> by doing a fork.
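
For anyone following along, the general pattern in the quote (a catchable signal first, a grace period, then the untrappable one) could be sketched roughly as below. This is not Torque's code: the name kill_with_grace, the poll_interval parameter, and the choice of SIGTERM as the first signal are assumptions made for illustration only.

```python
import signal
import subprocess
import sys
import time


def kill_with_grace(proc, kill_delay, poll_interval=0.1):
    """Ask a child process to exit, then force it after kill_delay seconds.

    Hypothetical helper, not part of Torque. `proc` is a subprocess.Popen.
    Returns the name of the last signal that was sent.
    """
    # Catchable signal first, so a launcher can clean up its own tasks.
    proc.send_signal(signal.SIGTERM)

    # The "kill loop": poll until the process exits or the delay runs out.
    deadline = time.monotonic() + kill_delay
    while time.monotonic() < deadline:
        if proc.poll() is not None:
            return "SIGTERM"
        time.sleep(poll_interval)

    # One last check before resorting to the untrappable signal.
    if proc.poll() is not None:
        return "SIGTERM"
    proc.kill()   # SIGKILL on POSIX
    proc.wait()   # reap the child so it does not linger as a zombie
    return "SIGKILL"


if __name__ == "__main__":
    child = subprocess.Popen([sys.executable, "-c", "import time; time.sleep(30)"])
    print(kill_with_grace(child, kill_delay=5.0))
```

Note that this version blocks the caller for up to kill_delay seconds, which is presumably the kind of delay the quote wants to avoid by running the loop in a forked child.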
Apologies in advance if I'm not paying enough attention to Torque
development lately. The kill loop in mom appears to be the source
of a regression in Torque that affects mpiexec users:
Do things work properly now so that a parallel job launcher gets
the obit signals and can clean up? If so, I'll be happy to remove
that issue from the list. Thanks,