[torqueusers] procs resource advice and docs
Gareth.Williams at csiro.au
Wed Jan 20 18:30:38 MST 2010
Thanks Martin and Roman,
> We are using this together with the
>
> JOBNODEMATCHPOLICY EXACTNODE
>
> in moab. This way the -l nodes=n:ppn=m specification is interpreted the
> way users expect it to work and at the same time we have a method available
> to users who do not care how their processes are distributed across nodes.
> After gaining some experience with this we now recommend to users to
> use -l procs=N unless they have a specific reason to use the
> -l nodes=n:ppn=m syntax: the waiting time in the queue with -l procs=N
> is much, much shorter.
Yes, we prefer JOBNODEMATCHPOLICY EXACTNODE too.
>
> > Is there a downside apart from the lack of documentation?
>
> None. Only benefits. The usage percentage of the cluster increases
> dramatically.
>
> > I see it's documented in the pbs_resources_unicos8 man page so I guess
> > it was developed by or for CRAY, possibly long ago, but it seems to work
> > fine on our linux systems. The pbs_resources_unicos8 man page does not
> > mention that it conflicts with the nodes resource syntax but I guess
> > this is obvious.
>
> Actually, -l procs is fairly new - we requested that feature :-)
> Thus, I do not believe that whatever is mentioned under CRAY applies.
Who did the work? It would be good to hear what they have to say about documenting the change. The unicos8 man page has just the following line:
procs Maximum number of processes in the job. Units: unitary
which might be enough, except perhaps to note the relationship or conflict with the nodes resource.
> > moab just uses the nodes resource request if it is present (ignoring
> > the procs resource)
>
> Actually it is the other way round: if you specify procs then nodes are
> ignored, see:
>
> http://www.clusterresources.com/products/mwm/docs/13.3rmextensions.shtml#procs
Interesting all round :-). I misinterpreted my admittedly limited tests. I can now confirm your assertion, but I also see that procs jobs are not being packed onto nodes on at least one of our clusters: the procs are partly distributed one per node and partly paired up. Strange, but not critical, as we can use the nodes syntax when more control is needed.
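In case it helps anyone checking the same thing, here is a rough sketch of how the per-node distribution can be counted from inside a job. It assumes the usual Torque behaviour of writing one hostname per allocated processor to the file named by $PBS_NODEFILE. The function name count_procs_per_node and the procs=8 request are only illustrative, not something from our actual job scripts.

```shell
#!/bin/bash
#PBS -l procs=8
# Alternative request with explicit placement: #PBS -l nodes=2:ppn=4

# Print "hostname count" for each node in a Torque-style node file
# (one hostname per line, repeated once per allocated processor).
count_procs_per_node() {
    sort "$1" | uniq -c | awk '{ print $2, $1 }'
}

# Only report when running inside a job, where Torque sets PBS_NODEFILE.
if [ -n "${PBS_NODEFILE:-}" ] && [ -f "$PBS_NODEFILE" ]; then
    count_procs_per_node "$PBS_NODEFILE"
fi
```

A well-packed job would show a few hosts with high counts. A mix of counts of 1 and 2 is the spread-out pattern described above.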
>
> (I am confused here: that page existed a few days ago, but now it is
> gone).
>
> > and I guess maui would too but that is a function of the scheduler
> > and doesn't really have a place in the torque docs per-se.
>
> I am not sure whether maui has support for procs.
>
> > Can the linux pbs_resources man page be updated to include
> > the procs resource?
>
> There is one more issue: if you are using OSC's mpiexec you need
> to patch the code in get_hosts.c to add support for procs. I can
> email you the patch, if you need it.
We're using openmpi (also SGI's MPT/MPI) and the torque integration is working just fine, but thanks for the offer.
Much appreciated,
Gareth
>
> Cheers,
> Martin
>
> --
> Martin Siegert
> Head, Research Computing
> WestGrid Site Lead
> IT Services phone: 778 782-4691
> Simon Fraser University fax: 778 782-4242
> Burnaby, British Columbia email: siegert at sfu.ca
> Canada V5A 1S6