[torqueusers] Torque
David Singleton
David.Singleton at anu.edu.au
Mon Jul 21 01:59:07 MDT 2008
Isn't this the problem of rerunnable (qsub -r y) being the default?
It would be safer, and most people would probably prefer, to have
qsub -r n as the default.
David
Chris Samuel wrote:
> ----- "Seb" <sebast2600 at yahoo.fr> wrote:
>
>> Hi,
>
> Hello Seb,
>
>> Over the last few days we had many storms and power outages, and
>> each time our computers were restarted Torque automatically re-ran
>> the jobs.
>
> Sounds like a cluster configuration problem rather than
> a Torque problem - my guess is that someone has created
> an init script that blindly starts Torque on boot.
>
> That's a bad idea, as you've now found out.
>
> We use a different approach in ours: the init script checks
> whether a file that only gets created on a clean shutdown
> exists and, if so, removes it and then starts the pbs_mom.
>
> If the file doesn't exist, the script just bails out, since
> the node obviously died badly.
>
> cheers,
> Chris
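
The clean-shutdown check Chris describes could be sketched roughly as follows. This is only an illustration, not his actual init script: the marker path, the function names, and the bare "pbs_mom" command line are all assumptions, and it presumes that a separate shutdown script creates the marker file on a clean shutdown.

```python
import os
import subprocess
import sys

# Hypothetical marker path (an assumption, not from the thread); a shutdown
# script is assumed to create this file only when the node goes down cleanly.
CLEAN_SHUTDOWN_MARKER = "/var/spool/torque/clean_shutdown"

# Assumed command used to start the MOM daemon on this node.
MOM_COMMAND = ["pbs_mom"]


def consume_clean_shutdown_marker(marker_path):
    """Delete the marker and return True if it existed, else return False."""
    try:
        os.remove(marker_path)
    except FileNotFoundError:
        return False
    return True


def main(marker_path=CLEAN_SHUTDOWN_MARKER, command=MOM_COMMAND):
    """Start the daemon only if the previous shutdown left a marker behind."""
    if not consume_clean_shutdown_marker(marker_path):
        # No marker: the node died badly, so refuse to start the daemon.
        print("no clean-shutdown marker found, not starting pbs_mom",
              file=sys.stderr)
        return 1
    # Marker was present and is now removed, so the next boot checks afresh.
    return subprocess.call(command)
```

A boot script would call sys.exit(main()). Removing the marker before starting the daemon means that a crash at any later point leaves no marker behind, so the next boot bails out until an administrator has looked at the node.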