[torqueusers] torque/blcr integration

Robin robinr at muohio.edu
Tue Sep 21 12:13:36 MDT 2010


Hi,

I'm following the instructions at http://www.clusterresources.com/products/torque/docs/2.6jobcheckpoint.shtml
My Torque is version 2.4.10, compiled with --enable-blcr. I'm aware that the doc is for 2.5.x; I could not easily find the doc for 2.4.x.

Attached are my mom_priv/{config,epilogue,blcr_checkpoint_script,blcr_restart_script}. They are essentially the scripts from the doc, but the script in the doc needs correction (otherwise it does not run).
blcr_checkpoint_script was edited to declare the variable $depth and to add a missing comma. The aim was only to fix the syntax (I didn't spend much time on the scripts).
[ It would be nice to see the code fixed on the webpage. ]

I submitted my test job with "qsub -c enabled test.job", then issued "qhold jobid". The job was not checkpointed; "qstat -f" shows this output line for the job:
    comment = "Usage: /usr/local/torque/current/var/spool/torque/mom_priv/blcr_checkpoint_script"

The mom logs say that blcr_checkpoint_script exited with code 255, which is consistent with running the script without parameters.

I take it that pbs_mom did not invoke blcr_checkpoint_script with all the required parameters.
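
One way to test this assumption would be to temporarily put a small stand-in in place of blcr_checkpoint_script that only records the arguments it is called with. Below is a minimal sketch in Python, not a tested fix. The log path and the names format_args and main are hypothetical and have nothing to do with Torque itself. The exit code 255 on an empty argument list only mirrors what the mom logs reported above.

```python
"""Diagnostic stand-in for a checkpoint script: record the arguments received."""
import time

# Hypothetical log location; any path writable by the user running pbs_mom will do.
LOG_PATH = "/tmp/blcr_checkpoint_args.log"


def format_args(argv):
    """Return one log line with the argument count and each argument, quoted."""
    args = argv[1:]  # argv[0] is the script name itself
    quoted = " ".join(repr(a) for a in args)
    return "argc=%d args=[%s]" % (len(args), quoted)


def main(argv, log_path=LOG_PATH):
    """Append a timestamped record of argv to the log and return an exit code."""
    stamp = time.strftime("%Y-%m-%d %H:%M:%S")
    with open(log_path, "a") as log:
        log.write("%s %s\n" % (stamp, format_args(argv)))
    # Mimic the observed failure: 255 when no parameters were passed, 0 otherwise.
    return 255 if len(argv) < 2 else 0


# To use this as a stand-alone script, add a "#!/usr/bin/env python3" first line
# and finish the file with:  import sys; sys.exit(main(sys.argv))
```

If the log shows argc=0 after a qhold, the parameters are indeed not reaching the script. If it lists arguments, the problem is more likely in how the script parses them.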

Any comments, helpful hints, or outright help will be most welcome.

Thanks,
Robin
-------------- next part --------------
A non-text attachment was scrubbed...
Name: blcr_checkpoint_script
Type: application/octet-stream
Size: 2680 bytes
Desc: not available
Url : http://www.supercluster.org/pipermail/torqueusers/attachments/20100921/027e4a63/attachment.obj 
-------------- next part --------------
A non-text attachment was scrubbed...
Name: blcr_restart_script
Type: application/octet-stream
Size: 1983 bytes
Desc: not available
Url : http://www.supercluster.org/pipermail/torqueusers/attachments/20100921/027e4a63/attachment-0001.obj 
-------------- next part --------------
A non-text attachment was scrubbed...
Name: config
Type: application/octet-stream
Size: 469 bytes
Desc: not available
Url : http://www.supercluster.org/pipermail/torqueusers/attachments/20100921/027e4a63/attachment-0002.obj 
-------------- next part --------------
A non-text attachment was scrubbed...
Name: epilogue
Type: application/octet-stream
Size: 1043 bytes
Desc: not available
Url : http://www.supercluster.org/pipermail/torqueusers/attachments/20100921/027e4a63/attachment-0003.obj 

