[Mauiusers] SIGSTOP and SIGTSTP don't work

Mahmood Naderan nt_mahmood at yahoo.com
Sat Jan 15 10:55:29 MST 2011


>There is a Torque command named 'qhold' which will hold running Torque jobs.
>Try 'man qhold'.
See the output please:

mahmood at server:~$ qstat
Job id                    Name             User            Time Use S Queue
------------------------- ---------------- --------------- -------- - -----
344.server                job1             mahmood         62:51:59 R slow
817.server                job2             mahmood         02:13:20 R slow

mahmood at server:~$ qhold 817

 
mahmood at server:~$ qstat
Job id                    Name             User            Time Use S Queue
------------------------- ---------------- --------------- -------- - -----
344.server                job1             mahmood         62:51:59 R slow
817.server                job2             mahmood         02:13:51 R slow

 
mahmood at server:~$ qstat
Job id                    Name             User            Time Use S Queue
------------------------- ---------------- --------------- -------- - -----
344.server                job1             mahmood         62:51:59 R slow
817.server                job2             mahmood         02:14:36 R slow


I ran the last qstat one minute later and, as you can see, the job is still
running. The "top" command also shows that the process is active.

>With qhold/qrls, you're telling a running job 
>to checkpoint and vacate the node (if supported) and enter the HELD 
>state, or for a queued job to enter a HELD state and not be scheduled 
>for execution until it's been qrls'd to the QUEUED state.
It seems that this is not supported by my job manager; qhold has no effect.

 
>mjobctl -s jobID  # suspend
>mjobctl -r jobID  # resume
Thanks for that. It really does suspend the job:

mahmood at server:~$ mjobctl -s 817
job 817 successfully preempted
 
mahmood at server:~$ qstat
Job id                    Name             User            Time Use S Queue
------------------------- ---------------- --------------- -------- - -----
344.server                job1             mahmood         62:51:59 R slow
817.server                job2             mahmood         02:12:47 S slow

mahmood at server:~$ mjobctl -r 817
cannot resume non-suspended job

mahmood at server:~$ qstat
Job id                    Name             User            Time Use S Queue
------------------------- ---------------- --------------- -------- - -----
344.server                job1             mahmood         62:51:59 R slow
817.server                job2             mahmood         02:13:20 R slow

However, I don't know why it says "cannot resume non-suspended job". As you can
see, the state of the job has changed from S to R.
 
// Naderan *Mahmood;




________________________________
From: Steve Johnson <steve at isc.tamu.edu>
To: skip at pobox.com
Cc: maui <mauiusers at supercluster.org>
Sent: Sat, January 15, 2011 9:16:11 PM
Subject: Re: [Mauiusers] SIGSTOP and SIGTSTP don't work

The mjobctl command is a scheduler command - it will send STOP/CONT 
signals to the job and Torque will know about it.  In effect, you're 
doing manual preemption.  With qhold/qrls, you're telling a running job 
to checkpoint and vacate the node (if supported) and enter the HELD 
state, or for a queued job to enter a HELD state and not be scheduled 
for execution until it's been qrls'd to the QUEUED state.
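
The STOP/CONT behaviour described above can be tried outside the scheduler.
The following is only a rough sketch, not what Maui or Torque run internally:
the background loop, the temporary log file, and the variable names are
hypothetical stand-ins for a real job, and fractional arguments to sleep are
assumed to be supported on the system.

```shell
# Hypothetical stand-in for a job: a background loop that logs a line every 0.2 s.
log=$(mktemp)
( while :; do echo tick >> "$log"; sleep 0.2; done ) &
pid=$!
sleep 1

kill -STOP "$pid"               # suspend: the loop stops making progress
sleep 0.5                       # give the signal time to take effect
a=$(($(wc -l < "$log")))
sleep 1
b=$(($(wc -l < "$log")))        # unchanged while the process is stopped

kill -CONT "$pid"               # resume: the loop continues where it left off
sleep 1
c=$(($(wc -l < "$log")))        # larger than b once the loop is running again

echo "stopped: $a -> $b, resumed: $b -> $c"

kill "$pid"                     # clean up the stand-in process
wait "$pid" 2>/dev/null || true
rm -f "$log"
```

The line count plays the role of the "Time Use" column in qstat: it stays
fixed while the process is stopped and grows again after CONT. Unlike a hold,
the process stays in memory on the node the whole time.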

// Steve

On 01/15/2011 11:25 AM, skip at pobox.com wrote:
>
>      Steve>  I think what you're looking for is mjobctl.
>      Steve>  mjobctl -s jobID  # suspend
>      Steve>  mjobctl -r jobID  # resume
>
> Why is it that Torque and Maui seem to have overlapping commands?  What's
> the difference between the use of the mjobctl commands you referenced above
> and the qhold/qrls jobs of Torque?
>



      