[torqueusers] Preventing jobs from being rerun when

Troy Baer troy at osc.edu
Mon Mar 13 13:24:21 MST 2006


On Mon, 2006-03-13 at 14:16 -0600, David McGiven wrote:
> I was running a job on one of my cluster nodes. Due to an electrical
> problem, the node was suddenly and unexpectedly rebooted.
> 
> While it was rebooting, qstat showed the job in state "E". About a
> minute later, when the node came back to normal operation, the job was
> in state "R" again. The system had automatically restarted the job.
> 
> How can I prevent this from happening?
> 
> It's very dangerous because not all jobs are meant to be restarted
> "automatically", and they might overwrite already processed data.

In TORQUE and other PBS variants, jobs default to being rerunnable.
Jobs that are not rerunnable need to declare themselves as such, using
the -r flag to qsub:

#PBS -r n

See the qsub man page for more information.
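
As a minimal sketch of where that directive goes, here is a complete
job script. The job name and the payload line are hypothetical
placeholders; only the "-r n" directive comes from the advice above.

```shell
#!/bin/sh
# Hypothetical job name, used only for illustration.
#PBS -N norerun_example
# Mark the job as not rerunnable, so the server should not restart it
# automatically after a node failure.
#PBS -r n

# Placeholder payload; replace with the real work of the job.
msg="job started on $(uname -n)"
echo "$msg"
```

The same flag can also be given on the command line instead of in the
script, for example "qsub -r n job.sh", where job.sh is an assumed
script name.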

	--Troy
-- 
Troy Baer                       troy at osc.edu
Science & Technology Support    http://www.osc.edu/hpc/
Ohio Supercomputer Center       614-292-9701


