[torqueusers] Re: mom segfault in new diag code

Chris Samuel csamuel at vpac.org
Sun Oct 31 15:58:32 MST 2004


On Mon, 1 Nov 2004 09:47 am, Garrick Staples wrote:

> > We *always* run the moms with the -p flag for just this reason. :-)
>
> And you can reliably not break jobs? If I went through the entire cluster
> and restarted every mom, I know I'd lose half the jobs.

Ahh, our PBS script's method for stopping the mom on compute nodes is:

 kill -9

That way it doesn't get time to think about what it's going to do to the jobs
running on its node. ;-)
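
For anyone wanting to copy the idea, here is a rough sketch of what such a
stop step could look like. It is not our actual script: the function name
stop_mom_hard is made up for illustration, and the location of the mom's pid
file is deliberately left as an argument because it varies between installs.
The only real ingredient is the kill -9 itself.

```shell
#!/bin/sh
# Sketch only: stop a mom abruptly so it gets no chance to act on the
# jobs running on its node. "stop_mom_hard" is a hypothetical name, and
# the pid file path is supplied by the caller (an assumption; check
# where your own installation keeps it).
stop_mom_hard() {
    pidfile=$1

    # Refuse to guess if the pid file is missing or empty.
    if [ ! -s "$pidfile" ]; then
        echo "no usable pid file: $pidfile" >&2
        return 1
    fi

    pid=$(cat "$pidfile")

    # SIGKILL cannot be caught or ignored, so the process runs no
    # shutdown handling of its own before it dies.
    kill -9 "$pid"
}
```

You would call it as stop_mom_hard followed by the path to the pid file, and
then start the mom again with -p as mentioned above.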

-- 
 Christopher Samuel - (03)9925 4751 - VPAC Systems & Network Admin
 Victorian Partnership for Advanced Computing http://www.vpac.org/
 Bldg 91, 110 Victoria Street, Carlton South, VIC 3053, Australia
