[torquedev] nodes, procs, tpn and ncpus

"Mgr. Šimon Tóth" SimonT at mail.muni.cz
Wed Jun 9 08:43:22 MDT 2010


On 9.6.2010 16:40, Ken Nielson wrote:
> On 06/09/2010 07:45 AM, Glen Beane wrote:
>>> I am going to modify TORQUE so it will process these resources more like we expect.
>>>>
>>>>  procs=x will mean give me x processors anywhere.
>>>      
>> great
>>
>>    
>>>>  nodes=x will mean the same as procs=x.
>>>      
>> I don't think this should be the case... Moab reinterprets it to mean
>> the same thing, but historically with PBS that is not how it has been
>> interpreted.
>>
>>    
>>>>  nodes=x:ppn=x will work as it currently does except that the value for nodes will not be ignored.
>>>      
>> what do you mean the value for nodes will not be ignored???  The value
>> for nodes is NOT ignored now.
>>
>>
>> gbeane at wulfgar:~>  echo "sleep 60" | qsub -l nodes=2:ppn=4,walltime=00:01:00
>> 69792.wulfgar.jax.org
>> gbeane at wulfgar:~>  qrun 69792
>> gbeane at wulfgar:~>  qstat -f 69792
>> ...
>>      exec_host = cs-prod-2/3+cs-prod-2/2+cs-prod-2/1+cs-prod-2/0+cs-prod-1/3+cs-prod-1/2+cs-prod-1/1+cs-prod-1/0
>> ...
>>      Resource_List.neednodes = 2:ppn=4
>>      Resource_List.nodect = 2
>>      Resource_List.nodes = 2:ppn=4
>>
>>    
>>
> It seems you and Simon agree about how TORQUE is working. The following
> is what I have in qmgr:
> 
> #
> # Create queues and set their attributes.
> #
> #
> # Create and define queue batch
> #
> create queue batch
> set queue batch queue_type = Execution
> set queue batch resources_default.nodes = 1
> set queue batch resources_default.walltime = 01:00:00
> set queue batch enabled = True
> set queue batch started = True
> #
> # Set server attributes.
> #
> set server scheduling = True
> set server acl_host_enable = True
> set server acl_hosts = l18
> set server acl_hosts += L18
> set server acl_hosts += kmn
> set server managers = ken at kmn
> set server operators = ken at kmn
> set server default_queue = batch
> set server log_events = 511
> set server mail_from = adm
> set server resources_available.nodect = 1024
> set server scheduler_iteration = 600
> set server node_check_rate = 150
> set server tcp_timeout = 6
> set server log_level = 6
> set server mom_job_sync = True
> set server keep_completed = 30
> set server next_job_number = 100
> 
> Whenever I do -l nodes=x:ppn=y where x is greater than 1, I still only
> get one node allocated to the job.

Well, what scheduler are you using? Schedulers can completely mask the
original nodespec. They can send their own nodespec in the run request.
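
A quick way to check is to take the scheduler out of the loop and run the
job by hand, the same way Glen did above. A minimal sketch, assuming you
have manager/operator rights (the job id is a placeholder):

  qmgr -c "set server scheduling = False"    # keep the scheduler from picking the job up
  echo "sleep 60" | qsub -l nodes=2:ppn=4,walltime=00:01:00
  qrun <jobid>                               # plain qrun: pbs_server builds exec_host from the nodespec
  qstat -f <jobid> | grep exec_host
  qmgr -c "set server scheduling = True"     # turn scheduling back on afterwards

If exec_host spans two nodes here, pbs_server handles the nodespec fine and
it is the scheduler that collapses the request, typically by passing its own
host list in the run request (the qrun -H <host-list> form), in which case
the nodespec you submitted never drives the allocation.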
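
And for reference, a minimal sketch of the three request forms discussed at
the top of the thread (walltime values arbitrary; the procs= line reflects
the proposed meaning, not necessarily current behaviour):

  echo "sleep 60" | qsub -l procs=8,walltime=00:01:00        # 8 processors, anywhere in the cluster
  echo "sleep 60" | qsub -l nodes=8,walltime=00:01:00        # historically 8 nodes; the proposal would make this equal to procs=8
  echo "sleep 60" | qsub -l nodes=2:ppn=4,walltime=00:01:00  # 2 nodes with 4 processors per node (Glen's example above)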

-- 
Mgr. Šimon Tóth
