[torqueusers] Job suspend basic question

Ricky Tang Siu Hong shtang at clustertech.com
Thu Aug 18 21:04:25 MDT 2005


Dear all,

I'm not clear about the job suspend/resume that was discussed on the list 
earlier.  Can you check whether my understanding is correct?

We can suspend an MPI job by distributing a SIGSTOP signal to it.  All of its 
processes on the nodes will be put into a sleep state and will not use any CPU 
cycles.  Their memory is still kept, but will probably be swapped out when 
another job runs.  Then we can start another, urgent job.  When the urgent job 
has finished, we can resume the suspended job and, most importantly, it 
continues from the point at which it was suspended.  It does not need to be 
restarted in any case.
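
To make the mechanism concrete, here is a minimal single-machine sketch in 
Python of the stop/continue behaviour described above.  It is only an 
illustration under assumptions: the helper names (suspend, resume, demo) are 
hypothetical, the "job" is a single sleeping child process, and the resume 
signal is assumed to be SIGCONT.  It does not show how TORQUE or Maui deliver 
signals; for a real MPI job the signal would have to reach every process on 
every node.

```python
import os
import signal
import subprocess
import sys


def suspend(pid):
    """Send SIGSTOP: the process stops and gets no CPU time, but keeps its memory."""
    os.kill(pid, signal.SIGSTOP)


def resume(pid):
    """Send SIGCONT: the process carries on from where it was stopped."""
    os.kill(pid, signal.SIGCONT)


def demo():
    # Stand-in for one process of a long-running job (hypothetical workload).
    child = subprocess.Popen(
        [sys.executable, "-c", "import time; time.sleep(30)"]
    )

    suspend(child.pid)
    # WUNTRACED makes waitpid report the stop event instead of waiting for exit.
    _, status = os.waitpid(child.pid, os.WUNTRACED)
    stopped = os.WIFSTOPPED(status)

    resume(child.pid)
    # WCONTINUED makes waitpid report that the process was continued.
    _, status = os.waitpid(child.pid, os.WCONTINUED)
    continued = os.WIFCONTINUED(status)

    # Same PID, still alive: the process was resumed, not restarted.
    still_running = child.poll() is None

    # Clean up the demo process.
    child.terminate()
    child.wait()
    return stopped, continued, still_running


if __name__ == "__main__":
    print(demo())
```

The sketch only checks that the same process is stopped, then continued, and 
is still alive afterwards.  It says nothing about swapping or about how the 
batch system accounts for the nodes while the job is stopped.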

Please point out anything that is wrong.

Also, what is the standard way to suspend/resume a PBS MPI job using the 
latest versions of TORQUE and Maui?

Thanks

Ricky Tang
Cluster Technology Limited 


