[torqueusers] how -l procs works

Ken Nielson knielson at adaptivecomputing.com
Wed Jun 2 09:48:57 MDT 2010


On 06/02/2010 09:40 AM, Glen Beane wrote:
> On Wed, Jun 2, 2010 at 11:33 AM, Glen Beane<glen.beane at gmail.com>  wrote:
>    
>> On Wed, Jun 2, 2010 at 11:04 AM, Ken Nielson
>> <knielson at adaptivecomputing.com>  wrote:
>>      
>>> Hi all,
>>>
>>> On another thread, with the subject "qsub on several nodes", it was
>>> suggested that procs is a better way to scatter jobs across all
>>> available processors than nodes.  However, I find that the procs
>>> resource does not behave the way described in that thread.
>>>
>>> For instance if I do the following:
>>>
>>> qsub -l procs=5 <job.sh>
>>>
>>> The qstat output shows the following resource list
>>>
>>>   Resource_List.neednodes = 1
>>>   Resource_List.nodect = 1
>>>   Resource_List.nodes = 1
>>>   Resource_List.procs = 5
>>>
>>> If I do a qrun on this job, it is assigned a single node and one
>>> processor.
>>>
>>> The qstat -f output after the job is started gives an exec_host of node/0.
>>> TORQUE ignores the procs keyword and assigns the job the default of one
>>> node and one processor.
>>>
>>> Moab interprets procs to mean the number of processors requested on a
>>> single node for the job. If I let Moab schedule the job, the exec_host
>>> from qstat is node/0+node/1+node/2+node/3+node/4.
>>>
>>> If I make the value of procs greater than the number of processors on
>>> any single node, Moab will not run the job.
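>>>
>>> (If the Moab client commands are available, something like
>>>
>>>    checkjob <jobid>
>>>
>>> should also report the processor allocation Moab computed; <jobid> is
>>> again a placeholder for the actual job id.)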
>>>
>>> Ken
>>>        
>> As far as I know, Moab looks at -l procs=X, interprets it, and then
>> sets the exec_host to some set of nodes that satisfies the request.
>> It is a hack and definitely won't work with qrun, since the approach
>> relies on Moab setting the exec_host list. The fact that procs is
>> basically ignored by TORQUE is my major complaint with how it is
>> implemented.
>>
>> I use it all the time to request more processors than will fit on a
>> single node.  For example, I routinely use -l procs=32 or more on a
>> cluster of 4-core nodes.  I'm using Moab 5.4.0, but I know I've used
>> it with some recent earlier versions as well.
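>>
>> i.e., roughly something like this (job.sh is just a placeholder script
>> name):
>>
>>    qsub -l procs=32,walltime=00:05:00 job.sh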
>>      
>
> gbeane@wulfgar:~>  echo "pbsdsh hostname" | qsub -N procs_test -l procs=64,walltime=00:01:00
> 69641.wulfgar.jax.org
> gbeane@wulfgar:~>  cat procs_test.o69641 | sort | uniq | wc -l
> 17
>
> It took 17 unique nodes to satisfy my procs=64 request.  On some nodes I
> was allocated all 4 cores; on others I got only a subset of the cores
> because the rest were already in use.
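>
> (Something like
>
>    sort procs_test.o69641 | uniq -c
>
> should show exactly how many slots landed on each of those 17 nodes.)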
>
> Another example:
>
> gbeane@wulfgar:~>  qsub -l procs=64,walltime=00:05:00 -I
> qsub: waiting for job 69642.wulfgar.jax.org to start
> qsub: job 69642.wulfgar.jax.org ready
>
> Have a lot of fun...
> Directory: /home/gbeane
> Wed Jun  2 11:38:05 EDT 2010
> gbeane@cs-short-2:~>  cat $PBS_NODEFILE | wc -l
> 64
> gbeane@cs-short-2:~>  cat $PBS_NODEFILE | uniq | wc -l
> 17
>    
So, is the idea to let Moab create the node spec and then use qrun to 
execute the job?

Ken

