[torquedev] nodes, procs, tpn and ncpus

Ken Nielson knielson at adaptivecomputing.com
Wed Jun 9 08:40:30 MDT 2010


On 06/09/2010 07:45 AM, Glen Beane wrote:
>> I am going to modify TORQUE so it will process these resources more like we expect.
>> >
>> >  procs=x will mean give me x processors anywhere.
>>      
> great
>
>    
>> >  nodes=x will mean the same as procs=x.
>>      
> I don't think this should be the case... Moab reinterprets it to mean
> the same thing, but historically with PBS that is not how it has been
> interpreted.
>
>    
>> >  nodes=x:ppn=x will work as it currently does except that the value for nodes will not be ignored.
>>      
> what do you mean the value for nodes will not be ignored???  The value
> for nodes is NOT ignored now.
>
>
> gbeane at wulfgar:~>  echo "sleep 60" | qsub -l nodes=2:ppn=4,walltime=00:01:00
> 69792.wulfgar.jax.org
> gbeane at wulfgar:~>  qrun 69792
> gbeane at wulfgar:~>  qstat -f 69792
> ...
>      exec_host = cs-prod-2/3+cs-prod-2/2+cs-prod-2/1+cs-prod-2/0+cs-prod-1/3+cs
> 	-prod-1/2+cs-prod-1/1+cs-prod-1/0
> ...
>      Resource_List.neednodes = 2:ppn=4
>      Resource_List.nodect = 2
>      Resource_List.nodes = 2:ppn=4
>
>    
>
It seems you and Simon agree about how TORQUE is working. The following
is what I have in qmgr.

#
# Create queues and set their attributes.
#
#
# Create and define queue batch
#
create queue batch
set queue batch queue_type = Execution
set queue batch resources_default.nodes = 1
set queue batch resources_default.walltime = 01:00:00
set queue batch enabled = True
set queue batch started = True
#
# Set server attributes.
#
set server scheduling = True
set server acl_host_enable = True
set server acl_hosts = l18
set server acl_hosts += L18
set server acl_hosts += kmn
set server managers = ken at kmn
set server operators = ken at kmn
set server default_queue = batch
set server log_events = 511
set server mail_from = adm
set server resources_available.nodect = 1024
set server scheduler_iteration = 600
set server node_check_rate = 150
set server tcp_timeout = 6
set server log_level = 6
set server mom_job_sync = True
set server keep_completed = 30
set server next_job_number = 100

Whenever I submit with -l nodes=x:ppn=y, where x is greater than 1, I
still get only one node allocated to the job.
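
One way to confirm how many distinct nodes a job actually received is to
tally the exec_host value from qstat -f per host. The following is a
minimal Python sketch, assuming the host/slot+host/slot format seen in
Glen's output above. The function name slots_per_host is hypothetical and
not part of TORQUE.

```python
from collections import Counter


def slots_per_host(exec_host: str) -> Counter:
    """Count allocated slots per host in a qstat -f exec_host value.

    Assumes entries look like "host/slot" joined by "+", as in the
    qstat output quoted in this thread.
    """
    # qstat -f wraps long values across lines; drop the inserted whitespace.
    joined = "".join(exec_host.split())
    counts = Counter()
    for entry in joined.split("+"):
        if not entry:
            continue
        # Everything before the first "/" is the host name.
        host, _, _slot = entry.partition("/")
        counts[host] += 1
    return counts


# Sample value copied from the qstat -f output quoted above, wrap included.
exec_host = """cs-prod-2/3+cs-prod-2/2+cs-prod-2/1+cs-prod-2/0+cs-prod-1/3+cs
	-prod-1/2+cs-prod-1/1+cs-prod-1/0"""

counts = slots_per_host(exec_host)
print(len(counts), dict(counts))
```

For a nodes=2:ppn=4 request, two hosts with four slots each is what one
would expect to see. A single host in the tally would match the behavior
I am describing.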

Ken


