[torqueusers] checkpoint/restart mpi-job on different compute nodes
Tingting Yang
ytt515 at yahoo.cn
Mon Jul 23 02:34:37 MDT 2012
Hi all,

I want to use Torque's checkpoint/restart function. Is it possible to checkpoint/restart MPI jobs that run on multiple nodes, such as those submitted with -l nodes=2:ppn=2? Right now I can checkpoint/restart MPI jobs that run on a single node with -l nodes=1:ppn=4, but I encounter errors when I try to checkpoint/restart MPI jobs running on multiple nodes.

Thank you,
Tingting Yang, Beihang University