[torqueusers] qsub and resource limits problem.
siegert at sfu.ca
Tue Nov 5 17:13:28 MST 2013
On Tue, Nov 05, 2013 at 11:01:43AM -0200, Daniel Lopes de Carvalho wrote:
> Hello Guys,
> Last week I configured a TORQUE queue with the following settings:
> create queue longas
> set queue longas queue_type = Execution
> set queue longas Priority = 10000
> set queue longas resources_max.nodes = 1:ppn=12
> set queue longas resources_default.nodes = 1:ppn=12
> set queue longas max_user_run = 4
> set queue longas enabled = True
> set queue longas started = True
> After this, TORQUE does not work properly when I submit a job
> to this queue.
> If I use the command
> 'echo "sleep 300" | qsub -q long -l nodes=1:ppn=8'
> to submit a job, the following message appears:
> qsub: Job exceeds queue resource limits MSG=cannot satisfy queue max
> nodes requirement
> However, if I use the same command but add a 0 in front of the 8, the
> job is submitted normally:
> 'echo "sleep 300" | qsub -q long -l nodes=1:ppn=08'
> Is it possible to fix this and make TORQUE accept the
> first line ('echo "sleep 300" | qsub -q long -l nodes=1:ppn=8')?
> Thanks and best regards
The nodes resource is tricky. What is larger: 1:ppn=12 or 2:ppn=5, or ...?
As far as I remember, the nodes resource is stored as a string and a
lexical string comparison is used as the metric. As a consequence
1:ppn=12 is actually smaller than 1:ppn=8, whereas 1:ppn=12 is larger
than 1:ppn=08. Basically resources_max.nodes and resources_min.nodes
should not be used at all - the results are almost unpredictable.
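
The effect of a lexical comparison can be reproduced with ordinary
string comparison. The following is only a sketch in Python, not TORQUE's
actual code; the helper name exceeds_limit is made up for illustration,
and it assumes the limit check behaves like a plain character-by-character
comparison of the two strings:

```python
def exceeds_limit(requested, queue_max):
    """Hypothetical stand-in for the queue check: compare the two
    nodes strings character by character, not as numbers."""
    return requested > queue_max


queue_max = "1:ppn=12"
for requested in ("1:ppn=8", "1:ppn=08"):
    # The strings first differ at the character after "ppn=":
    # '8' sorts after '1', but '0' sorts before '1'.
    print(requested, "exceeds", queue_max, "->", exceeds_limit(requested, queue_max))

# prints:
# 1:ppn=8 exceeds 1:ppn=12 -> True
# 1:ppn=08 exceeds 1:ppn=12 -> False
```

This matches the behaviour reported above: ppn=8 is rejected while
ppn=08 is accepted.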
There are two other resources that are derived from the nodes
resource: nodect and procct. I believe that you can accomplish
what you want with setting:
set queue longas resources_max.nodect = 1
set queue longas resources_max.procct = 12
nodect counts the number of nodes allocated to the job whereas
procct counts the number of cores allocated to a job, i.e.,
for a specification nodes=n:ppn=m, nodect=n and procct=m*n.
For jobs that are submitted with -l procs=x instead of -l nodes=...
procct is set to x.
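
As an illustration of that arithmetic, here is a small Python sketch that
derives the two counts from a specification of the form n:ppn=m. It is
not how TORQUE parses the request; the function name is hypothetical, it
handles only this simple numeric form (no hostnames or node properties),
and treating a missing ppn as 1 is an assumption:

```python
def count_nodes_and_procs(nodes_spec):
    """Return (nodect, procct) for a spec such as '2:ppn=5'.

    Hypothetical helper: only the simple numeric form n[:ppn=m]
    is supported.
    """
    fields = nodes_spec.split(":")
    nodect = int(fields[0])
    ppn = 1  # assumption: one core per node when ppn is not given
    for field in fields[1:]:
        if field.startswith("ppn="):
            ppn = int(field[len("ppn="):])
    return nodect, nodect * ppn


for spec in ("1:ppn=12", "2:ppn=5", "1:ppn=08"):
    nodect, procct = count_nodes_and_procs(spec)
    print(f"nodes={spec}: nodect={nodect}, procct={procct}")

# prints:
# nodes=1:ppn=12: nodect=1, procct=12
# nodes=2:ppn=5: nodect=2, procct=10
# nodes=1:ppn=08: nodect=1, procct=8
```

Because these are integer counts, limits on them compare numerically:
2:ppn=5 has a smaller procct (10) than 1:ppn=12 (12) but a larger
nodect, and ppn=8 and ppn=08 give the same result.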
WestGrid/ComputeCanada Site Lead
Simon Fraser University
Burnaby, British Columbia, Canada