[torqueusers] Removing the "exec_host" attribute from a queued job ?

Wolfgang Wander wwc at rentec.com
Tue Sep 20 04:36:14 MDT 2005


Simon Robbins writes:
 > 
 > Hello,
 > 
 > On Tue, 20 Sep 2005, Chris Samuel wrote:
 > 
 > > Hi folks,
 > > 
 > > I've got a job that's queued and obviously tried to start and failed and has 
 > > ended up with the following attribute set on it:
 > > 
 > >    exec_host = edda010/0+edda007/3+edda007/2+edda007/1
 > > 
 > > I suspect it's stopping Moab or Torque from running it again on other nodes, 
 > > and I'd like to clear that attribute, but it doesn't appear to be accessible 
 > > through qalter or qmgr.
 > > 
 > > Any clues ?
 > 
 > Unfortunately no.  I have been seeing this behaviour 
 > for months now with torque_1.2.0p2,4,5 and 6.  From 
 > Maui I get:
 > HostList:
 >   [n504:1]
 > Messages:  cannot start job - RM failure, rc: 15041, msg: 
 > 'Execution server rejected request MSG=send failed, STARTING'
 > 
 > Sometimes this is associated with a failure in the 
 > network.
 > 

I've noticed that you can start the job by hand with
qrun -H [free-node] jobid, which runs it on the node you name.
You'll have to find a [free-node] manually, though, to make this
work...
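
For what it's worth, the manual search for a free node could be scripted
roughly as below. This is only a sketch under some assumptions: it assumes
"pbsnodes -a" prints each node name at the start of a line followed by
indented "key = value" lines, and that a state of exactly "free" means the
node can take the job. The helper names (free_nodes, qrun_command,
suggest_qrun) are made up for illustration and are not part of Torque. It
only prints the qrun command instead of running it, so you can check the
node first.

```python
import subprocess


def free_nodes(pbsnodes_output):
    """Return node names whose 'state' attribute is exactly 'free'.

    Assumes the 'pbsnodes -a' layout: an unindented node name followed
    by indented 'key = value' attribute lines, blank line between nodes.
    """
    nodes = []
    current = None
    for line in pbsnodes_output.splitlines():
        if not line.strip():
            current = None          # blank line ends a node record
            continue
        if not line[0].isspace():
            current = line.strip()  # unindented line is a node name
            continue
        if current is None:
            continue
        key, _, value = line.strip().partition("=")
        if key.strip() == "state" and value.strip() == "free":
            nodes.append(current)
    return nodes


def qrun_command(job_id, node):
    """Build the argument list for: qrun -H <node> <job_id>."""
    return ["qrun", "-H", node, job_id]


def suggest_qrun(job_id):
    """Query pbsnodes and print a qrun command for the first free node."""
    result = subprocess.run(
        ["pbsnodes", "-a"], capture_output=True, text=True, check=True
    )
    candidates = free_nodes(result.stdout)
    if not candidates:
        print("no node reports state = free")
        return None
    command = qrun_command(job_id, candidates[0])
    print(" ".join(command))
    return command
```

Calling suggest_qrun with the id of the stuck job, on a host where pbsnodes
is in the PATH, prints one candidate command. A node that reports "free"
may still lack enough processors for a multi-node job such as the one
above, so treat the result as a starting point only.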

           Wolfgang


