[torqueusers] ppn + hostlist

Matt Britt msbritt at umich.edu
Wed Dec 4 12:51:46 MST 2013


I don't know if it is related, but there is an issue with Moab when a job
asks for multiple named nodes in node-exclusive mode.  Interestingly, it is
the opposite problem from what Brian mentions: the tasks are combined onto
the first named node.

#PBS -l nodes=node1+node2
#PBS -n

Checkjob would return something like:
Allocated Nodes:
[node1:2]
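
For anyone who wants to check their own jobs for either symptom (a host
dropped from the list, or everything collapsed onto one host), here is a
minimal sketch that compares the hosts named in a nodes= request against a
bracketed "[host:count]" list like the ones checkjob prints above and below.
The function names are made up for illustration and are not part of Torque
or Moab. It assumes the request names hosts explicitly (no leading node
count such as nodes=2:...) and that the bracketed list has been copied out
of the checkjob output by hand, without surrounding whitespace.

```shell
#!/bin/sh
# Hypothetical helpers, not part of Torque or Moab.

# Print one host per line from a request such as "node1:ppn=2+node2:ppn=2".
requested_hosts() {
    # Split on "+", then drop ":ppn=..." and any other ":" modifiers.
    printf '%s\n' "$1" | tr '+' '\n' | sed 's/:.*$//'
}

# Print one host per line from a list such as "[node1:2][node2:2]".
listed_hosts() {
    # Remove "[", turn each "]" into a newline, strip ":count",
    # and drop the empty line left after the final "]".
    printf '%s\n' "$1" | tr -d '[' | tr ']' '\n' | sed -e 's/:.*$//' -e '/^$/d'
}

# Print every requested host that is absent from the bracketed list.
# $1 = nodes= request (without the "nodes=" prefix), $2 = bracketed list.
missing_hosts() {
    requested_hosts "$1" | while IFS= read -r host; do
        # -x: match the whole line, -F: treat the host name as a fixed string.
        if ! listed_hosts "$2" | grep -xF -e "$host" >/dev/null; then
            printf '%s\n' "$host"
        fi
    done
}
```

With the example above, missing_hosts 'node1+node2' '[node1:2]' prints
node2. Run against Brian's request and Required HostList, it prints
compute-3-1.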

--------------------------------------------
Matthew Britt
CAEN HPC Group - College of Engineering
msbritt at umich.edu



On Wed, Dec 4, 2013 at 1:09 PM, Glen Beane <glen.beane at gmail.com> wrote:

> I'm guessing this is a Moab issue.  As far as I know, Torque by itself has
> never supported what you are trying to do (and the fact that
> nodes=compute-3-1:ppn=2+compute-3-5:ppn=2+compute-7-1:ppn=2+compute-7-3:ppn=2+compute-3-7:ppn=2
> does not work indicates that Moab is doing something strange to the
> resource request).
>
>
>
> On Wed, Dec 4, 2013 at 12:38 PM, Andrus, Brian Contractor <
> bdandrus at nps.edu> wrote:
>
>>  Glen,
>>
>>
>>
>> Thanks for the clarification; however, that still doesn’t work.
>>
>>
>>
>> *qsub -I -l
>> nodes=compute-3-1:ppn=2+compute-3-5:ppn=2+compute-7-1:ppn=2+compute-7-3:ppn=2+compute-3-7:ppn=2*
>>
>>
>>
>> The job shows up waiting to run and checkjob tells me:
>>
>>
>>
>> *Req[0]  TaskCount: 8  Partition: ALL*
>>
>> *Opsys: ---  Arch: ---  Features: compute-3-1*
>>
>> *Dedicated Resources Per Task: PROCS: 1  MEM: 1024M*
>>
>> *Required HostList:*
>>
>> *[compute-3-5:2][compute-7-1:2][compute-7-3:2][compute-3-7:2]*
>>
>>
>>
>> So the HostList is comprised of all the listed nodes EXCEPT the first
>> one, which gets tagged as a feature.
>>
>>
>>
>> Also that seems to require I specify exactly everything I need.
>>
>> The scenario I am working with is: I want 2 nodes with 2 ppn, but they
>> can come from any node(s) from a list of several.
>>
>> Using the ‘correct syntax’, I would end up with 2 procs on each node
>> listed.
>>
>>
>>
>> I see a similar issue with trying to use procs with a nodelist.
>>
>> Not sure if that is possible, but apparently users had been doing that
>> too:
>>
>>
>>
>> *qsub -I -l procs=32 -l
>> nodes=compute-3-1+compute-3-5+compute-7-1+compute-7-3+compute-3-7*
>>
>>
>>
>> This again creates a HostList of all but the first named node, which is
>> tagged as a feature requirement
>>
>>
>>
>> Seems like something is parsing stuff oddly.
>>
>>
>>
>> Brian Andrus
>>
>> ITACS/Research Computing
>>
>> Naval Postgraduate School
>>
>> Monterey, California
>>
>> voice: 831-656-6238
>>
>>
>>
>>
>>
>>
>>
>> *From:* torqueusers-bounces at supercluster.org [mailto:
>> torqueusers-bounces at supercluster.org] *On Behalf Of *glen.beane at gmail.com
>> *Sent:* Wednesday, December 04, 2013 6:03 AM
>> *To:* Torque Users Mailing List
>> *Subject:* Re: [torqueusers] ppn + hostlist
>>
>>
>>
>> The correct syntax has always been
>> nodes=node01:ppn=2+node02:ppn=2+node03:ppn=2
>>
>> Sent from my iPhone
>>
>>
>> On Dec 4, 2013, at 2:12 AM, "Andrus, Brian Contractor" <bdandrus at nps.edu>
>> wrote:
>>
>>  All,
>>
>>
>>
>> Something seems to have changed either in torque or moab (I am thinking
>> moab).
>>
>>
>>
>> If I want to request 2 nodes with 2 ppn from a particular hostlist, we
>> used to:
>>
>> *qsub -l nodes=2:ppn=2:node01+node02+node03*
>>
>>
>>
>> But now that does not work. It errors with:
>>
>> *qsub: submit error (Job rejected by all possible destinations (check
>> syntax, queue resources, ...))*
>>
>>
>>
>> However I can:
>>
>> *qsub -l nodes=2:ppn=2 -l nodes=node01+node02+node03*
>>
>>
>>
>> But… that gives me 3 procs from those available in the nodelist
>> (basically it ignores the first “-l” directive)
>>
>>
>>
>> And if I try:
>>
>> *qsub -l nodes=1:node01+node02+node03*
>>
>>
>>
>> It ends up treating the node names as required features, which of course
>> no single node has all of… o.O
>>
>> Such jobs end up never running until I force them with ‘qrun’
>>
>>
>>
>> So, how do I request  X nodes with Y PPN from resources constrained to a
>> particular list of hosts?
>>
>>
>>
>> The current use case is for users who want to ensure their jobs land on
>> the same nodes because they are doing timing comparisons.
>>
>>
>>
>> This is torque 4.2.6 and moab 7.2.6
>>
>>
>>
>> Brian Andrus
>>
>> ITACS/Research Computing
>>
>> Naval Postgraduate School
>>
>> Monterey, California
>>
>> voice: 831-656-6238
>>
>>
>>
>>  _______________________________________________
>> torqueusers mailing list
>> torqueusers at supercluster.org
>> http://www.supercluster.org/mailman/listinfo/torqueusers
>>
>>