[Mauiusers] Maui - checkpoint and moving execution from node to node
alexleb
alexleb at crt.umontreal.ca
Wed Jul 6 10:30:51 MDT 2005
Thanks maui forum members for your great help so far.
If you want to move a single running job (a sequential job on 1 CPU) from
node x to node y, where both nodes have the same architecture, how do you
use checkpointing to do that?
I’m not sure I understand how to set the default checkpoint.
If I place -c c=240 in qmgr or in the job script, I don’t see anything
appear in the checkpoint directory of the node in question after 240
minutes.
I understand that if a job is system checkpointable, it will restart after a
node reboot.
Any pointers on how to move a running job?
How to verify the checkpoint file of a job?
How to manually initiate a checkpoint of a job?
If a user's job spends more time than its wallcputime, can it get re-queued
from the previous checkpoint with the same initial time?
Config is Torque, Maui, SUSE 9.3.
Best,
Alexandre Le Bouthillier
alexleb at crt.umontreal.ca
Centre de Recherche sur les Transports
C.P. 6128, succ. centre-ville
(UdeM, 2920 Chemin des services #3517)
Montreal (Qc) Canada, H3C 3J7