[torqueusers] ppn + hostlist

Andrus, Brian Contractor bdandrus at nps.edu
Wed Dec 4 10:38:34 MST 2013


Thanks for the clarification; however, that still doesn’t work.

qsub -I -l nodes=compute-3-1:ppn=2+compute-3-5:ppn=2+compute-7-1:ppn=2+compute-7-3:ppn=2+compute-3-7:ppn=2

The job shows up waiting to run and checkjob tells me:

Req[0]  TaskCount: 8  Partition: ALL
Opsys: ---  Arch: ---  Features: compute-3-1
Dedicated Resources Per Task: PROCS: 1  MEM: 1024M
Required HostList:

So the HostList comprises all the listed nodes EXCEPT the first one, which gets tagged as a feature.
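
To spot the same symptom on other jobs, the two relevant fields can be pulled out of the checkjob output. This is a minimal sketch: `show_req_hosts` is a hypothetical helper name, not a TORQUE or Moab command, and it assumes the output uses the same "Features:" and "Required HostList:" labels shown above.

```shell
# Hypothetical helper (not part of TORQUE or Moab): read checkjob output on
# stdin and print only the Features field and the Required HostList line.
show_req_hosts() {
    sed -n -e 's/.*\(Features: .*\)/\1/p' -e '/Required HostList:/p'
}

# Usage (12345 is a placeholder job ID):
#   checkjob 12345 | show_req_hosts
```

If a hostname appears after "Features:" rather than in the host list, the job has been parsed the way described above.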

Also, that syntax seems to require that I name every node I need explicitly.
The scenario I am working with is: I want 2 nodes with 2 ppn, but they can come from any node(s) in a list of several.
Using the ‘correct syntax’, I would end up with 2 procs on every node listed, rather than on just 2 of them.
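
One possible stopgap for that scenario is to choose the hosts outside the scheduler and emit the explicit per-host syntax for only those hosts. The sketch below assumes a hypothetical helper, `build_nodes_spec`, which simply takes the first COUNT hosts from the candidate list; it does not check whether those hosts are free or online, and it does not address the parsing behavior described above.

```shell
# Hypothetical helper (not part of TORQUE or Moab): build an explicit
# "nodes=" spec from the first COUNT hosts of a candidate list.
# Arguments: COUNT PPN host1 [host2 ...]
# The body runs in a subshell so its variables do not leak to the caller.
build_nodes_spec() (
    count=$1
    ppn=$2
    shift 2
    spec=""
    for host in "$@"; do
        [ "$count" -gt 0 ] || break
        # Append "+host:ppn=N", omitting the "+" before the first entry.
        spec="${spec:+$spec+}${host}:ppn=${ppn}"
        count=$((count - 1))
    done
    printf 'nodes=%s\n' "$spec"
)

# Example: 2 nodes with 2 ppn, taken from a list of five candidates.
build_nodes_spec 2 2 compute-3-1 compute-3-5 compute-7-1 compute-7-3 compute-3-7
```

The result could then be passed along as `qsub -I -l "$(build_nodes_spec 2 2 ...)"`. Picking the first hosts in the list is only a placeholder policy; a real wrapper would need some way to decide which candidates are actually available.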

I see a similar issue when trying to combine procs with a nodelist.
I am not sure that is supposed to be possible, but apparently users had been doing that too:

qsub -I -l procs=32 -l nodes=compute-3-1+compute-3-5+compute-7-1+compute-7-3+compute-3-7

This again creates a HostList of all but the first named node, which is tagged as a feature requirement.

It seems like something is parsing the node specification oddly.

Brian Andrus
ITACS/Research Computing
Naval Postgraduate School
Monterey, California
voice: 831-656-6238

From: torqueusers-bounces at supercluster.org [mailto:torqueusers-bounces at supercluster.org] On Behalf Of glen.beane at gmail.com
Sent: Wednesday, December 04, 2013 6:03 AM
To: Torque Users Mailing List
Subject: Re: [torqueusers] ppn + hostlist

The correct syntax has always been nodes=node01:ppn=2+node02:ppn=2+node03:ppn=2

Sent from my iPhone

On Dec 4, 2013, at 2:12 AM, "Andrus, Brian Contractor" <bdandrus at nps.edu> wrote:

Something seems to have changed either in torque or moab (I am thinking moab).

If I want to request 2 nodes with 2 ppn from a particular hostlist, we used to run:
qsub -l nodes=2:ppn=2:node01+node02+node03

But now that does not work. It errors with:
qsub: submit error (Job rejected by all possible destinations (check syntax, queue resources, ...))

However, I can run:
qsub -l nodes=2:ppn=2 -l nodes=node01+node02+node03

But… that gives me 3 procs from those available in the nodelist (basically, it ignores the first “-l” directive).

And if I try:
qsub -l nodes=1:node01+node02+node03

It ends up treating the node names as required features, which of course no single node has all of… o.O
Such jobs end up never running until I force them with ‘qrun’.

So, how do I request X nodes with Y PPN from resources constrained to a particular list of hosts?

The current use case is for users who want to ensure their jobs land on the same nodes, as they do timing comparisons.

This is torque 4.2.6 and moab 7.2.6

Brian Andrus
ITACS/Research Computing
Naval Postgraduate School
Monterey, California
voice: 831-656-6238
