[torquedev] torque+blcr+openmpi

Danny Sternkopf dsternkopf at hpce.nec.com
Tue Jul 6 03:54:47 MDT 2010


Ah, I see. But that would mean this concept can't be used for qhold/qrls,
right? You always have to reschedule the job.

What exactly does your use case look like? You submit a job, then you run
qhold, then you run qdel, and then you resubmit the job?



On 7/6/2010 11:54 AM, Peter Kruse wrote:
> Hi Rishi,
> rishi pathak wrote:
>> Hi Danny,
>> Is there a need for checkpointing the mpirun/mpiexec processes? (Please
>> correct me if I am wrong.) They spawn the MPI program on the defined
>> nodes. For restarting a checkpointed MPI program, a fresh instance of
>> mpirun, mpiexec or pbsdsh can be used.
> Exactly, this is how we use and see it. You submit a new job, but with
> the same node geometry, and can then restart the same job.
> Regards,
> Peter
