[torquedev] nodes, procs, tpn and ncpus
knielson at adaptivecomputing.com
Wed Jun 9 08:40:30 MDT 2010
On 06/09/2010 07:45 AM, Glen Beane wrote:
>> I am going to modify TORQUE so it will process these resources more like we expect.
>>   procs=x will mean give me x processors anywhere.
>>   nodes=x will mean the same as procs=x.
> I don't think this should be the case... Moab reinterprets it to mean
> the same thing, but historically with PBS that is not how it has been
>>   nodes=x:ppn=x will work as it currently does except that the value for nodes will not be ignored.
> what do you mean the value for nodes will not be ignored??? The value
> for nodes is NOT ignored now.
> gbeane@wulfgar:~> echo "sleep 60" | qsub -l nodes=2:ppn=4,walltime=00:01:00
> gbeane@wulfgar:~> qrun 69792
> gbeane@wulfgar:~> qstat -f 69792
> exec_host = cs-prod-2/3+cs-prod-2/2+cs-prod-2/1+cs-prod-2/0+cs-prod-1/3+cs
> Resource_List.neednodes = 2:ppn=4
> Resource_List.nodect = 2
> Resource_List.nodes = 2:ppn=4
It seems you and Simon agree about how TORQUE is working. The following
is what I have in qmgr:
# Create queues and set their attributes.
# Create and define queue batch
create queue batch
set queue batch queue_type = Execution
set queue batch resources_default.nodes = 1
set queue batch resources_default.walltime = 01:00:00
set queue batch enabled = True
set queue batch started = True
# Set server attributes.
set server scheduling = True
set server acl_host_enable = True
set server acl_hosts = l18
set server acl_hosts += L18
set server acl_hosts += kmn
set server managers = ken@kmn
set server operators = ken@kmn
set server default_queue = batch
set server log_events = 511
set server mail_from = adm
set server resources_available.nodect = 1024
set server scheduler_iteration = 600
set server node_check_rate = 150
set server tcp_timeout = 6
set server log_level = 6
set server mom_job_sync = True
set server keep_completed = 30
set server next_job_number = 100
Whenever I submit with -l nodes=x:ppn=y, where x is greater than 1, I
still get only one node allocated to the job.
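
One way to check how many distinct nodes a job actually received is to
count the unique hostnames in its exec_host value. The following is only
a minimal sketch: it assumes exec_host uses the host/slot+host/slot
layout seen in the qstat -f output quoted above, the function name
count_exec_hosts is hypothetical, and the sample string is made up
rather than taken from a real job.

```shell
# Hypothetical helper: count the distinct hosts in an exec_host string.
# Assumes the "host/slot+host/slot+..." layout shown by qstat -f.
count_exec_hosts() {
    # Split on '+', keep the part before '/', drop duplicate hostnames,
    # then print how many hostnames remain.
    printf '%s\n' "$1" |
        tr '+' '\n' |
        cut -d/ -f1 |
        sort -u |
        awk 'END { print NR }'
}

# Made-up example value, not from a real job:
count_exec_hosts "nodeA/1+nodeA/0+nodeB/1+nodeB/0"   # prints 2
```

A result of 1 for a nodes=2:ppn=4 request would match the behaviour
described here. Note that qstat -f may wrap a long exec_host value over
several lines, so it would likely need to be rejoined before being
passed to a helper like this.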