[torqueusers] Job suspend basic question
Ricky Tang Siu Hong
shtang at clustertech.com
Thu Aug 18 21:04:25 MDT 2005
I'm not clear about the job suspend/resume behaviour discussed on the list
earlier. Can you tell me whether my understanding is correct?
We can suspend an MPI job by distributing a SIGSTOP signal to it. All of
its processes on the nodes are put to sleep and do not use any CPU cycles.
Their memory is still kept, but will probably be swapped out when another
job runs. We can then start another, urgent job. When the urgent job has
finished, we can resume the suspended job and, most importantly, it
continues from the point at which it was suspended. It does not need to be
restarted in any case.
Please point out anything that is wrong.
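
To make the mechanism above concrete, here is a minimal sketch of stopping
and resuming a single local process with POSIX signals. It is only an
illustration under stated assumptions: the function name
suspend_resume_demo and the small child program are hypothetical, it
assumes a POSIX system, and it says nothing about how TORQUE or Maui
deliver signals to the processes of an MPI job across nodes.

```python
import os
import signal
import subprocess
import sys
import time


def suspend_resume_demo(pause_seconds=0.2):
    """Stop a child with SIGSTOP, resume it with SIGCONT.

    Returns (was_stopped, exit_code). Hypothetical helper for illustration.
    """
    # Stand-in for a compute process: a short loop that exits with status 0.
    child_code = "import time\nfor i in range(3):\n    time.sleep(0.1)\n"
    child = subprocess.Popen([sys.executable, "-c", child_code])

    # SIGSTOP cannot be caught or ignored; the kernel stops the process.
    os.kill(child.pid, signal.SIGSTOP)

    # WUNTRACED makes waitpid report a stopped (not only an exited) child.
    _, status = os.waitpid(child.pid, os.WUNTRACED)
    was_stopped = os.WIFSTOPPED(status)

    # While stopped, the child uses no CPU, but its memory is still allocated.
    time.sleep(pause_seconds)

    # SIGCONT lets the child continue from where it was stopped.
    os.kill(child.pid, signal.SIGCONT)
    exit_code = child.wait()
    return was_stopped, exit_code


if __name__ == "__main__":
    print(suspend_resume_demo())
```

The child is not restarted at any point: it finishes its loop after
SIGCONT and exits normally, which is the behaviour described in the
paragraph above for a single process.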
Also, what is the standard way to suspend/resume a PBS MPI job using the
latest versions of TORQUE and Maui?
Cluster Technology Limited