[torqueusers] jobs with nodes=x x>1 are packing on single node

Jerry Smith jdsmit at sandia.gov
Thu Apr 12 15:38:05 MDT 2007



>> Hi,
>>    In a nutshell, I'm trying to configure Torque(2.1.6)/Maui(3.2.6p19)
>> so that jobs will be spread over multiple nodes instead of bunched up on
>> a single node.
>> 
>> Currently when I submit
>> 
>> echo 'cat $PBS_NODEFILE' | qsub -l nodes=4
>> 
>> I get back something like this
>> 
>> n31
>> n31
>> n31
>> n31
>> 
>> I've tried a variety of options in my maui.cfg file, such as setting the
>> BACKFILLPOLICY to NONE (no effect) and NODEALLOCATIONPOLICY to
>> MAXBALANCE (which caused jobs requesting more than one node/cpu to
>> hang), but honestly I'm not even sure that was the right place to start
>> after reading some of the previous posts around this same topic.
>> 
>> Any ideas on what needs to be tweaked?
> 
> I believe you want to request nodes=4:ppn=1 and set JOBNODEMATCHPOLICY to
> EXACTNODE in your maui.cfg.
> 
From Garrick's response:
>>Does your server's nodes file have np=X for each node?

Make sure your qmgr queue setup reflects that np setting as well.

If your nodes file looks like

node01 np=1 # where np = number of processors on each node
node02 np=2
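
To answer Garrick's question quickly, something like the following could list any node entries that lack an np= attribute. This is only a rough sketch: the function name nodes_missing_np is made up for illustration, the nodes file path in the usage line is an assumption that depends on where TORQUE is installed, and it tolerates trailing # comments without claiming the server does.

```shell
#!/bin/sh
# Hypothetical helper: print the name of every node entry with no np=N attribute.
nodes_missing_np() {
    awk '{
        sub(/#.*/, "")                 # drop any comment text
        if (NF == 0) next              # skip blank or comment-only lines
        found = 0
        for (i = 2; i <= NF; i++)      # look at the attributes after the hostname
            if ($i ~ /^np=[0-9]+$/) found = 1
        if (!found) print $1           # report hosts without an np= entry
    }' "$1"
}

# Usage (path is an assumption, adjust to your install):
# nodes_missing_np /var/spool/torque/server_priv/nodes
```

No output means every entry carries an np= value.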

In your queue setup, add a resources_default.nodes line similar to:

set queue batch resources_default.nodes = 1:ppn=1

As an aside, if you want to prevent multiple jobs from sharing a single node,
set this in maui.cfg:

NODEACCESSPOLICY        SINGLEJOB
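
To check whether a job really was spread out after these changes, one option is to summarize the job's node file instead of reading it line by line. A minimal sketch, assuming it runs inside a job script where TORQUE has set PBS_NODEFILE; the function name slots_per_host is just an illustration:

```shell
#!/bin/sh
# Hypothetical helper: print each allocated host and how many slots it got.
slots_per_host() {
    # sort groups identical hostnames, uniq -c counts them,
    # awk reorders the columns to "host count"
    sort "$1" | uniq -c | awk '{ print $2, $1 }'
}

# Inside a job script:
# slots_per_host "$PBS_NODEFILE"
```

For the packed case in the original question this would print "n31 4". A job spread over four nodes would show four hosts with a count of 1 each.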

Jerry Smith
----------------------------------
Sandia National Labs
jdsmit at sandia.gov




