[Mauiusers] Maui - checkpoint and moving execution from node to node

alexleb alexleb at crt.umontreal.ca
Wed Jul 6 10:30:51 MDT 2005


 

Thanks, Maui forum members, for your great help so far.

 

Suppose you want to move a single running job (a sequential job using 1 CPU)
from node x to node y, where both nodes have the same architecture. How do
you use checkpointing to do that?

 

I’m not sure I understand how to set the default checkpoint.

If I place -c c=240 in qmgr or in the job script, I don't see anything
appear in the checkpoint directory of the node in question after 240
minutes.
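
For reference, a minimal sketch of the job-script form of this setting, assuming the -c c=240 option is given as a #PBS directive. The job name ckpt_test and the script body are hypothetical placeholders, not taken from the actual job:

```shell
#!/bin/sh
# Hypothetical job name, for illustration only.
#PBS -N ckpt_test
# Sequential job: one node.
#PBS -l nodes=1
# Request periodic checkpointing with an interval of 240 minutes.
#PBS -c c=240

# Placeholder workload; the real program would go here.
echo "job running on $(hostname)"
```

The same option can also be passed on the command line, as in qsub -c c=240 job.sh (job.sh being a placeholder name). Whether checkpoint files are actually written still depends on checkpoint support on the execution host, which this sketch does not establish.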

 

I understand that if a job is system-checkpointable, it will restart after a
node reboot.

 

Any pointers on how to move a running job?

How do I verify the checkpoint file of a job?

How do I manually initiate a checkpoint of a job?

If a user's job runs past its walltime or CPU time limit, can it be re-queued
from the previous checkpoint with the same initial time limit?

 

Configuration: TORQUE, Maui, SUSE 9.3.

 

Best,

Alexandre Le Bouthillier
alexleb at crt.umontreal.ca
Centre de Recherche sur les Transports
C.P. 6128, succ. centre-ville
(UdeM, 2920 Chemin des services #3517)
Montreal (Qc) Canada, H3C 3J7



