[torqueusers] Re: [torquedev] Some jobs not starting with Torque 2.3.1 and Moab

Josh Butikofer josh at clusterresources.com
Fri Jul 18 08:07:39 MDT 2008


Lennart,

This was fixed in the latest TORQUE 2.3.2 snapshot available at 
http://www.clusterresources.com/downloads/torque/torque-2.3.2-snap.200807092141.tar.gz.

Alternatively, if you are using Moab 5.2.3 revision 9927 or higher, you 
can set the "NONEEDNODES=TRUE" parameter on your RMCFG[] line that 
describes your TORQUE resource manager:

Example (moab.cfg):

RMCFG[base] TYPE=PBS NONEEDNODES=TRUE
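
To check whether an installed Moab meets the revision threshold above, something like the following rough Python sketch could be used. It parses a line in the format printed by "moab --version" (as quoted further down in this thread) and compares the revision against 9927. The function name supports_noneednodes and the regular expression are illustrative assumptions, not part of Moab or TORQUE, and the sketch assumes the revision number alone decides the matter. It does not handle other version formats such as "5.2.3.s10693".

```python
import re

# Minimum Moab revision quoted in this message for NONEEDNODES support.
MIN_REVISION = 9927

# Matches lines such as: "moab server version 5.2.3 (revision 10590)"
VERSION_RE = re.compile(r"version\s+(\d+(?:\.\d+)*)\s+\(revision\s+(\d+)\)")


def supports_noneednodes(version_line):
    """Return True/False based on the revision, or None if unparseable.

    Hypothetical helper: it assumes revision numbers increase over time
    and ignores the dotted version string itself.
    """
    match = VERSION_RE.search(version_line)
    if match is None:
        return None
    revision = int(match.group(2))
    return revision >= MIN_REVISION


if __name__ == "__main__":
    line = "moab server version 5.2.3 (revision 10590)"
    print(supports_noneednodes(line))
```

In practice the input line would come from running "moab --version" on the scheduler host. A result of None means the output did not match the expected pattern and should be checked by hand.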

--Josh Butikofer

Lennart Karlsson wrote:
> Chris Samuel wrote the 5th of July:
>> I'm not sure if this is a Torque or Moab bug or just the result
>> of a change in interaction between the two, so I'm reporting this
>> to both. :-)
>>
>> Torque 2.3.1 official release.
>>
>> # moab --version
>> moab server version 5.2.3 (revision 10590)
>>
>> We have a number of jobs that are not starting and are ending
>> up in BatchHold due to repeated failures.  They are all logging
>> similar information:
>>
>> Message[30] cannot start job on reserved resources - job cannot be started on RM base - cannot set hostlist: cannot set job '472817.tango-m.vpac.org' attr 'Resource_List:neednodes' to 'tango048' - job may have been removed externally (rc: 15001 'Unknown Job Id')
> 
> 
> Hi,
> 
> We installed Torque 2.3.1 today, are running Moab version 5.2.3.s10693,
> and see the same problem with some jobs.
> 
> Was there ever a solution to this problem?
> 
> Best regards,
> -- Lennart Karlsson <Lennart.Karlsson at nsc.liu.se>
>    National Supercomputer Centre in Linkoping, Sweden
>    http://www.nsc.liu.se