[torqueusers] Preventing jobs to be re-runed when

Garrick Staples garrick at usc.edu
Mon Mar 13 13:21:03 MST 2006


On Mon, Mar 13, 2006 at 02:16:13PM -0600, David McGiven alleged:
> 
> Dear TORQUE users,
> 
> I was running a job in one of my cluster nodes. Due to an electrical
> problem the node was suddenly and unexpectedly rebooted.
> 
> While it was rebooting, the job was marked with an "E" in the qstat
> output. About a minute later, when the node came back to normal
> operation, the job was "R" again. The system had automatically restarted
> the job.
> 
> How can I prevent this from happening?
> 
> It's very dangerous because not all jobs are meant to be resumed
> "automatically", and they might overwrite already processed data.

Submit the job with "-r n". From man qsub:

       -r y|n  Declares whether the job is rerunable.  See the qrerun command.
               The option argument is a single character, either y or n.

               If  the argument is "y", the job is rerunable.  If the argument
               is "n", the job is not rerunable.  The default  value  is 'y',
               rerunable.
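
As a minimal sketch of how the option above might be applied, the job can be marked not rerunable either on the qsub command line or with a #PBS directive inside the job script. The script name my_job.sh and the program ./my_program are hypothetical placeholders, and the submission step is guarded so nothing is submitted on a host without qsub.

```shell
# Write a job script that declares itself not rerunable.
# "my_job.sh" and "./my_program" are hypothetical names used for illustration.
cat > my_job.sh <<'EOF'
#!/bin/sh
#PBS -N my_job
#PBS -r n
cd "$PBS_O_WORKDIR"
./my_program
EOF

# Submit only if qsub is available on this host.
if command -v qsub >/dev/null 2>&1; then
    qsub my_job.sh
    # Equivalent without the directive in the script:
    #   qsub -r n my_job.sh
fi
```

After submission, the Rerunable attribute in the "qstat -f" output for the job should reflect the setting, assuming your TORQUE version reports it there.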

-- 
Garrick Staples, Linux/HPCC Administrator
University of Southern California