[torqueusers] procs resource advice and docs

Gareth.Williams at csiro.au
Wed Jan 20 18:30:38 MST 2010


Thanks Martin and Roman,

> We are using this together with the
> 
> JOBNODEMATCHPOLICY EXACTNODE
> 
> in moab. This way the -l nodes=n:ppn=m specification is interpreted the
> way users expect it to work and at the same time we have a method
> available to users who do not care how their processes are distributed
> across nodes.
> After gaining some experience with this we now recommend that users
> use -l procs=N unless they have a specific reason to use the
> -l nodes=n:ppn=m syntax: the waiting time in the queue with -l procs=N
> is much, much shorter.

Yes, we prefer JOBNODEMATCHPOLICY EXACTNODE too.
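
As a toy illustration of why a procs request can start sooner than an exact nodes/ppn request, here is a minimal Python sketch. It is only a model under stated assumptions, not how moab actually schedules: the function names and the free-core numbers are hypothetical.

```python
def fits_exact_nodes(free_cores, nodes, ppn):
    """True if at least `nodes` distinct nodes each have `ppn` free cores.

    Toy model of -l nodes=n:ppn=m under exact node matching.
    """
    return sum(1 for cores in free_cores if cores >= ppn) >= nodes


def fits_procs(free_cores, procs):
    """True if the free cores summed over all nodes reach `procs`.

    Toy model of -l procs=N, where placement across nodes does not matter.
    """
    return sum(free_cores) >= procs


# Hypothetical cluster state: free cores on four partly busy 4-core nodes.
free = [3, 1, 2, 2]

print(fits_exact_nodes(free, nodes=2, ppn=4))  # no node has 4 free cores
print(fits_procs(free, procs=8))               # 8 free cores exist in total
```

Both requests ask for 8 processors, but in this model only the procs form can start on the fragmented cluster.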

> 
> > Is there a downside apart from the lack of documentation?
> 
> None. Only benefits. The usage percentage of the cluster increases
> dramatically.
> 
> >  I see it's documented in the pbs_resources_unicos8 man page so I guess
> > it was developed by or for CRAY, possibly long ago, but it seems to work
> > fine on our linux systems. The pbs_resources_unicos8 man page does not
> > mention that it conflicts with the nodes resource syntax but I guess
> > this is obvious.
> 
> Actually, -l procs is fairly new - we requested that feature :-)
> Thus, I do not believe that whatever is mentioned under CRAY applies.

Who did the work? It would be good to see what they have to say about documenting the change. The unicos8 man page has just the following line:
       procs     Maximum number of processes in the job.  Units: unitary
which might be enough, except perhaps to note the relationship or conflict with the nodes resource.

> >  moab just uses the nodes resource request if it is present (ignoring
> > the procs resource)
> 
> Actually it is the other way round: if you specify procs then nodes are
> ignored, see:
> 
> http://www.clusterresources.com/products/mwm/docs/13.3rmextensions.shtml#procs

Interesting all round :-).  I misinterpreted my admittedly not-extensive tests.  I can now confirm your assertion, but I also see that procs jobs are not being packed onto nodes on at least one of our clusters: the procs are partly distributed one per node and partly paired up.  Strange, but not critical, as we can use the nodes syntax when more control is needed.
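
One way to check how a job's processes were spread is to count the entries per host in the file named by $PBS_NODEFILE, which torque writes with one line per allocated processor slot. A minimal Python sketch, to be run inside a job (the helper name slots_per_node is hypothetical):

```python
import os
from collections import Counter


def slots_per_node(lines):
    """Count how many processor slots each host contributes.

    `lines` is any iterable of hostnames, one per slot, as found in the
    file named by $PBS_NODEFILE. Blank lines are ignored.
    """
    return Counter(line.strip() for line in lines if line.strip())


# Inside a running job, report the distribution; outside a job do nothing.
nodefile = os.environ.get("PBS_NODEFILE")
if nodefile:
    with open(nodefile) as fh:
        for host, count in sorted(slots_per_node(fh).items()):
            print(host, count)
```

A packed job shows a few hosts with high counts, while the behaviour described above shows many hosts with counts of 1 or 2.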

> 
> (I am confused here: that page existed a few days ago, but now it is
> gone).
> 
> > and I guess maui would too but that is a function of the scheduler
> > and doesn't really have a place in the torque docs per-se.
> 
> I am not sure whether maui has support for procs.
> 
> > Can the linux pbs_resources man page be updated to include
> > the procs resource?
> 
> There is one more issue: if you are using OSC's mpiexec you need
> to patch the code in get_hosts.c to add support for procs. I can
> email you the patch, if you need it.

We're using Open MPI (also SGI's MPT/MPI), and the torque integration is working just fine, but thanks for the offer.

Much appreciated,

Gareth

> 
> Cheers,
> Martin
> 
> --
> Martin Siegert
> Head, Research Computing
> WestGrid Site Lead
> IT Services                                phone: 778 782-4691
> Simon Fraser University                    fax:   778 782-4242
> Burnaby, British Columbia                  email: siegert at sfu.ca
> Canada  V5A 1S6

