[torqueusers] Torque/maui node failure policy revisited again

Charles at Schwieters.org
Tue Dec 16 10:26:09 MST 2008


-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1


>Glen Beane <glen.beane at gmail.com> wrote:
>
>> > One problem comes to mind: what happens if the first node in the
>> > list (exec_host) is the node that goes MIA? Would this mean that torque
>> > would lose control and tracking of the running jobs?
>> > Perhaps if any node but the first node fails then the job should continue to
>> > run, and if the first node dies the job should be terminated?
>> 
>> Yes, even if we implemented this feature, the job will still be
>> terminated if the first node dies.  And like I said, I think this should
>> be a per-job option that defaults to the current behavior of terminating
>> the job if any node is lost.  It should also be possible to set the
>> default on a server or queue basis.

This behavior would be wonderful.
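
To make sure I am reading the proposal correctly, here is a rough sketch
of the decision in Python. None of these names exist in Torque or Maui;
NodeLossPolicy, resolve_policy and should_terminate are made up for
illustration. The lookup order (job first, then queue, then server) is
also my assumption, since the thread only says a default could be set on
a server or queue basis.

```python
from enum import Enum
from typing import Optional, Sequence


class NodeLossPolicy(Enum):
    """Hypothetical per-job setting; not an existing Torque attribute."""
    TERMINATE = "terminate"  # current behavior: losing any node kills the job
    CONTINUE = "continue"    # proposed: survive the loss of a non-first node


def resolve_policy(job: Optional[NodeLossPolicy] = None,
                   queue: Optional[NodeLossPolicy] = None,
                   server: Optional[NodeLossPolicy] = None) -> NodeLossPolicy:
    """Pick the most specific setting that is present.

    Assumed precedence: job, then queue, then server. If nothing is set,
    fall back to the current behavior (terminate).
    """
    for setting in (job, queue, server):
        if setting is not None:
            return setting
    return NodeLossPolicy.TERMINATE


def should_terminate(exec_hosts: Sequence[str],
                     failed_node: str,
                     policy: NodeLossPolicy) -> bool:
    """Decide whether a job must be killed after one node is lost."""
    if failed_node not in exec_hosts:
        # The failed node was not part of this job.
        return False
    if failed_node == exec_hosts[0]:
        # The first node in exec_host is lost: the job is always terminated.
        return True
    # Any other node: only terminate if the policy says so.
    return policy is NodeLossPolicy.TERMINATE
```

With this sketch, should_terminate(["node01", "node02"], "node02",
NodeLossPolicy.CONTINUE) returns False, while the same call with
"node01" as the failed node returns True whatever the policy is.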

thanks--
Charles
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.9 (GNU/Linux)
Comment: Processed by Mailcrypt 3.5.8+ <http://mailcrypt.sourceforge.net/>

iD8DBQFJR+SxPK2zrJwS/lYRAqw7AJ0fVU/4Hm5AkgDihkOmBz0B9WXIHwCdG/Fe
kSLYbtZ1/nA6GSSOzcU3j3Q=
=84Zn
-----END PGP SIGNATURE-----

