[torqueusers] Re: kill_delay

Garrick Staples garrick at clusterresources.com
Tue Feb 27 17:29:07 MST 2007


On Wed, Feb 28, 2007 at 09:29:47AM +1100, Chris Samuel alleged:
> On Wed, 28 Feb 2007, Garrick Staples wrote:
> 
> > But why loop over SIGKILL?  It is untrappable and will eventually be
> > delivered even if the process is stuck in IO wait.
> 
> Should eventually be delivered, but certainly there has been a Linux kernel 
> bug in the past that could cause the process not to receive SIGKILL.. :-(
> 
> http://lkml.org/lkml/2005/5/23/22
> 
> >>>we experienced some interesting behaviour with an out of
> >>>memory condition caused by signal handling (on s390x).
> >>>The following program ran our system in an OOM situation
> >>>and couldn't be killed because the SIGKILL signal couldn't
> >>>be delivered.

Well, let's not do obnoxious stuff in TORQUE to work around a bug that
was fixed almost 2 years ago.
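
For anyone who wants to check the "untrappable" point for themselves, here is
a rough sketch in Python (not TORQUE code; the function names are made up for
illustration). It assumes a POSIX system with a working kernel: installing a
handler for SIGKILL is refused, and a child that ignores SIGTERM still dies
from a single SIGKILL, so no loop is needed.

```python
import os
import signal
import time


def can_trap_sigkill():
    """Try to install a SIGKILL handler; the OS is expected to refuse."""
    try:
        signal.signal(signal.SIGKILL, lambda signum, frame: None)
    except OSError:
        # The underlying call fails (EINVAL) because SIGKILL cannot be caught.
        return False
    return True


def term_then_kill():
    """Fork a child that ignores SIGTERM, then send SIGTERM and one SIGKILL.

    Returns (survived_sigterm, died_from_sigkill).
    """
    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child: ignore SIGTERM, tell the parent we are ready, then idle.
        os.close(read_fd)
        signal.signal(signal.SIGTERM, signal.SIG_IGN)
        os.write(write_fd, b"x")
        while True:
            time.sleep(1)

    # Parent: wait until the child has installed its SIGTERM disposition.
    os.close(write_fd)
    os.read(read_fd, 1)
    os.close(read_fd)

    os.kill(pid, signal.SIGTERM)
    time.sleep(0.2)
    # (0, 0) from a non-blocking wait means the child has not exited.
    survived_sigterm = os.waitpid(pid, os.WNOHANG) == (0, 0)

    os.kill(pid, signal.SIGKILL)  # sent once, no retry loop
    _, status = os.waitpid(pid, 0)
    died_from_sigkill = (
        os.WIFSIGNALED(status) and os.WTERMSIG(status) == signal.SIGKILL
    )
    return survived_sigterm, died_from_sigkill


if __name__ == "__main__":
    print("SIGKILL handler installed:", can_trap_sigkill())
    print("(survived SIGTERM, died from SIGKILL):", term_then_kill())
```

This does not cover a process stuck in uninterruptible IO wait; in that case
the signal stays pending until the process leaves that state, which is the
"eventually" above.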


