[Mauiusers] the problem of testing checkpoint/restart with torque
2.4.0 and maui 3.2.6p19
qindm at dawning.com.cn
Wed Sep 17 16:14:17 MDT 2008
Hi, everyone,
I used blcr-0.7.3 and TORQUE (torque-2.4.0-snap.200809111541.tar.gz) to
test the checkpoint/restart function, following
the wiki:
http://www.clusterresources.com/wiki/doku.php?id=torque:2.6_job_checkpoint_and_restart
I found an interesting problem: when I qhold the job, I can see the
checkpoint file located at
/var/spool/torque/checkpoint/4817.node24.CK/ckpt.4817.node24.1221666102
but when I qrls the same job 4817, the pbs_mom daemon on the compute node
goes down (it seems to be killed by something). Any clues?
Thank you very much.
dolphin, qin