[torqueusers] Need help with NCPUS not working in QSUB

Jerry Smith jdsmit at sandia.gov
Thu Oct 6 14:49:54 MDT 2011


If you have built mpiexec:
http://www.osc.edu/~djohnson/mpiexec/index.php
It is aware of $PBS_NODEFILE, and will do "the right thing" when used
similarly to mpirun, as mentioned by Mr. Coyle.
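
For example, a minimal submit script using the OSC mpiexec might look
like this (a sketch only; it assumes mpiexec is on your PATH and that
./prog is an MPI executable in the submit directory):

#!/bin/bash
#PBS -l procs=28
#PBS -l walltime=72:00:00
cd ${PBS_O_WORKDIR}
# OSC mpiexec reads ${PBS_NODEFILE} itself, so no -machinefile flag
# is needed; it starts one process per slot the job was granted.
mpiexec ./prog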

Jerry

Coyle, James J [ITACD] wrote:
> Torque and PBS give you a file whose path is in the environment
> variable PBS_NODEFILE.
>
> For example, with MPICH you could use:
>
> mpirun -np 28 -machinefile ${PBS_NODEFILE} ./prog
>
> Then 28 copies of ./prog will be started on
> the 28 processor slots listed in ${PBS_NODEFILE}.
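>
> Put together, a complete submit script might look like this (a
> sketch; it assumes MPICH's mpirun and an executable ./prog in the
> submit directory):
>
> #!/bin/bash
> #PBS -l procs=28
> #PBS -l walltime=72:00:00
> cd ${PBS_O_WORKDIR}
> # One copy of ./prog starts per line in the node file:
> mpirun -np 28 -machinefile ${PBS_NODEFILE} ./prog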
>
> Other programs like Fluent need you to specify something like:
> fluent 3ddp -t28 -pib -g -i Case.jou -cnf=${PBS_NODEFILE}
>
>
> Again, here you need to specify a file containing the
> machines on which to run each process.  If you leave off the
> -cnf option above, Fluent will start all of its processes on
> the first node that the job was assigned to.
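>
> In a submit script, the fluent line above simply takes the place of
> the mpirun line (a sketch; exact flags vary by Fluent version, and
> Case.jou is assumed to be your journal file):
>
> #!/bin/bash
> #PBS -l procs=28
> #PBS -l walltime=72:00:00
> cd ${PBS_O_WORKDIR}
> # -t28 asks for 28 Fluent processes; -cnf points them at the node
> # list so they are spread across the assigned hosts:
> fluent 3ddp -t28 -pib -g -i Case.jou -cnf=${PBS_NODEFILE}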
>  
>
> -----Original Message-----
>   
>> From: torqueusers-bounces at supercluster.org
>> [mailto:torqueusers-bounces at supercluster.org]
>> On Behalf Of Lenox, Billy AMRDEC/Sentient Corp.
>> Sent: Thursday, October 06, 2011 12:10 PM
>> To: Torque Users Mailing List
>> Subject: Re: [torqueusers] Need help with NCPUS not working in QSUB
>>
>> OK, I tried #PBS -l procs=28 and it still runs on one node, seed001.
>> I noticed that if I put the location of a hostfile on the exec line
>> in the script, it runs but bypasses Torque/PBS. I just have the
>> default scheduler on the system. I know I cannot specify
>> #PBS -l nodes=5. I have tried different ways and still it will only
>> run on one node, seed001.
>>
>> Billy
>>
>>     
>>> From: Troy Baer <tbaer at utk.edu>
>>> Organization: National Institute for Computational Sciences,
>>> University of Tennessee
>>> Reply-To: Torque Users Mailing List <torqueusers at supercluster.org>
>>> Date: Thu, 6 Oct 2011 12:07:45 -0400
>>> To: Torque Users Mailing List <torqueusers at supercluster.org>
>>> Subject: Re: [torqueusers] Need help with NCPUS not working in QSUB
>>>
>>> On Thu, 2011-10-06 at 09:55 -0500, Lenox, Billy AMRDEC/Sentient Corp.
>>> wrote:
>>>> I have Torque set up on a head node system with 5 compute nodes.
>>>> Two have 8 cores and 3 have 4 cores, set up into one queue called
>>>> batch.
>>>> When I use a submit script
>>>>
>>>> #!/bin/bash
>>>> #PBS -l ncpus=28
>>>> #PBS -l walltime=72:00:00
>>>> #PBS -o output.out
>>>> #PBS -e ie.error
>>>>
>>>> Here is /var/spool/torque/server_priv/nodes:
>>>>
>>>> seed001 np=8 batch
>>>> seed002 np=8 batch
>>>> seed003 np=8 batch
>>>> seed004 np=8 batch
>>>> seed005 np=8 batch
>>>>
>>>> When I submit the script, it only runs on one node, seed001.
>>>>
>>>> I don't know why it only runs on one node.
>>> Which scheduler are you using?  In most of the TORQUE-compatible
>>> schedulers I've seen, the ncpus= resource is interpreted as how many
>>> processors you want on a single shared-memory system.  (If you want
>>> X processors and you don't care where they are, I think the
>>> preferred way of requesting it is procs=X.)
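>>>
>>> For instance, a minimal test of procs= might look like this (a
>>> sketch; whether procs= is honored depends on the scheduler in use):
>>>
>>> #!/bin/bash
>>> #PBS -l procs=28
>>> #PBS -l walltime=72:00:00
>>> # Each granted processor slot is one line in the node file, so this
>>> # shows how many slots the job actually got on each host:
>>> sort ${PBS_NODEFILE} | uniq -c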
>>>
>>> --Troy
>>> --
>>> Troy Baer, HPC System Administrator
>>> National Institute for Computational Sciences, University of
>>> Tennessee
>>> http://www.nics.tennessee.edu/
>>> Phone:  865-241-4233