[torqueusers] Re-executing a queued job
dbeer at adaptivecomputing.com
Thu Dec 26 07:27:06 MST 2013
If you are using Moab or Maui then they will 'defer' jobs that aren't able
to run after a few retries. You probably need to do something like
to let the scheduler know it's okay to retry job execution again. There is
also a parameter to control the amount of time that jobs stay deferred
before they are retried again - DEFERTIME. It defaults to 1 hour.
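
The command that followed "something like" above did not survive in the
archived copy. As a rough sketch only, and not necessarily what was originally
suggested: with Maui or Moab a deferred job is usually cleared with the
scheduler's releasehold client command. The script below assumes releasehold
is on your PATH and that you run it with scheduler admin rights. The
release_job helper, the DRY_RUN switch, and the job ID in the usage note are
made up for illustration.

```shell
#!/bin/sh
# Sketch: ask the scheduler (Maui/Moab) to drop holds/defers on queued jobs
# so they are considered for execution again.
# Assumption: the `releasehold` client command is installed and permitted.

# DRY_RUN=1 (the default here) only prints the commands.
# Set DRY_RUN=0 to actually run them.
DRY_RUN="${DRY_RUN:-1}"

release_job() {
    # $1 is a job ID as shown by qstat (hypothetical example: 1234.headnode)
    if [ "$DRY_RUN" = "1" ]; then
        echo "releasehold -a $1"
    else
        releasehold -a "$1"
    fi
}

# Release every job ID given on the command line.
for job in "$@"; do
    release_job "$job"
done
```

Saved as, say, release_jobs.sh, it could be tried with
"sh release_jobs.sh 1234.headnode" to preview the command and then rerun with
DRY_RUN=0. To shorten the wait instead, DEFERTIME goes in the scheduler
configuration (maui.cfg or moab.cfg); a line such as "DEFERTIME 00:10:00"
should mean ten minutes, but check the parameter reference for your version.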
On Thu, Dec 26, 2013 at 7:18 AM, Mahmood Naderan <nt_mahmood at yahoo.com> wrote:
> I have submitted some jobs; however, from the time I submitted them they
> have been (and still are) in the Q state, with this reason:
> Messages: cannot start job - RM failure, rc: 15046, msg: 'Resource
> temporarily unavailable MSG=job allocation request exceeds currently
> available cluster nodes, 1 requested, 0 available'
> How can I re-execute the job? Maybe the resource was not available at that
> time. I cannot delete the jobs and resubmit them because they were
> generated by a script.
> Any way to *retry* the queued job?
David Beer | Senior Software Engineer