[Mauiusers] the problem of testing checkpoint/restart with torque 2.4.0 and maui 3.2.6p19

qindm at dawning.com.cn qindm at dawning.com.cn
Wed Sep 17 16:14:17 MDT 2008


Hi everyone,

I used blcr-0.7.3 and TORQUE (torque-2.4.0-snap.200809111541.tar.gz) to
test the checkpoint/restart function, following the wiki:
http://www.clusterresources.com/wiki/doku.php?id=torque:2.6_job_checkpoint_and_restart


I ran into an interesting problem. When I qhold the job, I can see the
checkpoint file located at
/var/spool/torque/checkpoint/4817.node24.CK/ckpt.4817.node24.1221666102 

But when I qrls the same job (4817), the pbs_mom daemon on the compute
node goes down (it seems to be killed by something). Any clues?

Thank you very much.


dolphin, qin
 
 