[torqueusers] Re: mom segfault in new diag code
Chris Samuel
csamuel at vpac.org
Sun Oct 31 15:58:32 MST 2004
On Mon, 1 Nov 2004 09:47 am, Garrick Staples wrote:
> > We *always* run the moms with the -p flag for just this reason. :-)
>
> And you can reliably not break jobs? If I went through the entire cluster
> and restarted every mom, I know I'll lose half the jobs.
Ahh, our PBS script's method for stopping the mom on compute nodes is:
kill -9
That way it doesn't get time to think about what it's going to do to the jobs
running on its node.. ;-)
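
For anyone wanting to try the same thing, a rough sketch of the kill-then-restart sequence might look like the following. The helper name restart_mom and the MOM_BIN and RUN variables are made up for illustration and are not part of TORQUE; the only pieces taken from this thread are kill -9 and the -p flag.

```shell
#!/bin/sh
# Hypothetical helper, not shipped with TORQUE.
# Assumptions: MOM_BIN names the mom binary; setting RUN=echo gives a dry run
# that prints the commands instead of executing them.
MOM_BIN="${MOM_BIN:-pbs_mom}"
RUN="${RUN:-}"

restart_mom() {
    # Arguments: PIDs of the running mom(s), if any.
    # SIGKILL cannot be caught, so the mom gets no chance to run its
    # shutdown path and touch the jobs on the node.
    [ "$#" -gt 0 ] && $RUN kill -9 "$@"
    # Start the mom again with -p, the flag mentioned earlier in this thread.
    $RUN "$MOM_BIN" -p
}
```

On a compute node it could be called as restart_mom $(pidof pbs_mom), assuming pidof is available there. Running it with RUN=echo first shows what would be executed without touching the mom.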
--
Christopher Samuel - (03)9925 4751 - VPAC Systems & Network Admin
Victorian Partnership for Advanced Computing http://www.vpac.org/
Bldg 91, 110 Victoria Street, Carlton South, VIC 3053, Australia