[torqueusers] jobs with nodes=x x>1 are packing on single node
Jerry Smith
jdsmit at sandia.gov
Thu Apr 12 15:38:05 MDT 2007
>> Hi,
>> In a nutshell, I'm trying to configure Torque(2.1.6)/Maui(3.2.6p19)
>> so that jobs will be spread over multiple nodes instead of bunched up on
>> a single node.
>>
>> Currently when I submit
>>
>> echo 'cat $PBS_NODEFILE' | qsub -l nodes=4
>>
>> I get back something like this
>>
>> n31
>> n31
>> n31
>> n31
>>
>> I've tried a variety of options in my maui.cfg file, such as setting the
>> BACKFILLPOLICY to NONE (no effect) and NODEALLOCATIONPOLICY to
>> MAXBALANCE (which caused jobs requesting more than one node/cpu to
>> hang), but honestly I'm not even sure that was the right place to start
>> after reading some of the previous posts around this same topic.
>>
>> Any ideas on what needs to be tweaked?
>
> I believe you want to request nodes=4:ppn=1 and set JOBNODEMATCHPOLICY to
> EXACTNODE in your maui.cfg.
>
From Garrick's response:
>> Does your server's nodes file have np=X for each node?
Make sure your qmgr queue setup is consistent with that as well.
If your nodes file looks like
node01 np=1 # where np = number of processors on each node
node02 np=2
then in your queue setup have a resources_default.nodes line similar to:
set queue batch resources_default.nodes = 1:ppn=1
As an aside, if you want to prevent multiple jobs from sharing a single
node, set this in maui.cfg:
NODEACCESSPOLICY SINGLEJOB
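
To check whether a change actually spread a job out, it can help to
summarize the node file from inside the job rather than eyeball the raw
list. This is only a sketch: the helper names slots_per_host and
distinct_hosts are made up for illustration, and it assumes the standard
Torque variable PBS_NODEFILE, which lists one line per allocated
processor slot.

```shell
#!/bin/sh
# Hypothetical helpers for inspecting a Torque node file.

# Print "count host" for every host in the node file given as $1.
slots_per_host() {
    sort "$1" | uniq -c | awk '{print $1, $2}'
}

# Print the number of distinct hosts in the node file given as $1.
distinct_hosts() {
    sort -u "$1" | wc -l | tr -d ' '
}

# Inside a running job, Torque sets PBS_NODEFILE to the node list path.
# Outside a job the variable is unset and this part is skipped.
if [ -n "${PBS_NODEFILE:-}" ] && [ -r "$PBS_NODEFILE" ]; then
    slots_per_host "$PBS_NODEFILE"
    echo "distinct hosts: $(distinct_hosts "$PBS_NODEFILE")"
fi
```

Saved as, say, check_nodes.sh (a made-up name) and submitted with
qsub -l nodes=4:ppn=1 check_nodes.sh, a packed allocation like the one
in the original question would show up as "4 n31" and
"distinct hosts: 1", while a job that was spread out should report four
hosts with one slot each.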
Jerry Smith
----------------------------------
Sandia National Labs
jdsmit at sandia.gov