[torqueusers] qsub and resource limits problem.

Martin Siegert siegert at sfu.ca
Tue Nov 5 17:18:52 MST 2013


Sorry, there is a typo in my email below ...

On Tue, Nov 05, 2013 at 04:13:28PM -0800, Martin Siegert wrote:
> Hi Daniel,
> 
> On Tue, Nov 05, 2013 at 11:01:43AM -0200, Daniel Lopes de Carvalho wrote:
> > 
> >    Hello Guys,
> >    Last week I configured a TORQUE queue with the following
> >    characteristics:
> >    create queue longas
> >    set queue longas queue_type = Execution
> >    set queue longas Priority = 10000
> >    set queue longas resources_max.nodes = 1:ppn=12
> >    set queue longas resources_default.nodes = 1:ppn=12
> >    set queue longas max_user_run = 4
> >    set queue longas enabled = True
> >    set queue longas started = True
> >    After this, TORQUE does not work properly when I submit a job to
> >    this queue.
> >    If I submit a job with the command
> >    'echo "sleep 300" | qsub -q long -l nodes=1:ppn=8'
> >    the following message appears:
> >    qsub: Job exceeds queue resource limits MSG=cannot satisfy queue max
> >    nodes requirement
> >    However, if I use the same command but add a 0 in front of the 8,
> >    the submission succeeds:
> >    'echo "sleep 300" | qsub -q long -l nodes=1:ppn=08'
> >    Is there a way to fix this so that TORQUE accepts the first
> >    command ('echo "sleep 300" | qsub -q long -l nodes=1:ppn=8')?
> >    Thanks and best regards
> 
> The nodes resource is tricky. What is larger: 1:ppn=12 or 2:ppn=5, or ...?
> As far as I remember the nodes resource is stored as a string and
> a lexical string comparison is used as the metric. As a consequence
> 1:ppn=12 is actually smaller than 1:ppn=8, whereas 1:ppn=12 is larger
> than 1:ppn=08. Basically resources_max.nodes and resources_min.nodes
> should not be used at all - the results are almost unpredictable.
> 
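
To illustrate the comparison described above, here is a small Python
sketch. It is not TORQUE code: it only assumes, as I recalled above, that
the requested nodes string is compared lexically against the queue
maximum. The function name within_limit is made up for this example.

```python
# Queue maximum taken from the configuration quoted above.
QUEUE_MAX = "1:ppn=12"


def within_limit(requested, limit=QUEUE_MAX):
    # Plain string comparison: characters are compared left to right,
    # so the digits are never interpreted as numbers.
    return requested <= limit


print(within_limit("1:ppn=8"))   # False: '8' sorts after '1'
print(within_limit("1:ppn=08"))  # True: '0' sorts before '1'
```

This matches the behaviour Daniel reported: ppn=8 is rejected while
ppn=08 is accepted.
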
> There are two other resources that are derived from the nodes
> resource: nodect and procct. I believe that you can accomplish
> what you want with setting:
> 
> set queue longas resources_max.nodect = 1
> set queue longas resources_max.procct = 12
> 
> nodect counts the number of nodes allocated to the job whereas
> procct counts the number of cores allocated to a job, i.e.,
> for a specification nodes=n:ppn=m, nodect=m and procct=m*n.

This should be:

for a specification nodes=n:ppn=m, nodect=n and procct=m*n.
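
As a rough illustration of the corrected formula, the following Python
sketch computes nodect and procct for a simple nodes=n:ppn=m request and
checks them numerically against the limits suggested below. It is not
TORQUE code; the function names are hypothetical, only the plain n:ppn=m
form is handled, and treating a missing ppn as 1 is an assumption.

```python
def node_counts(spec):
    """Return (nodect, procct) for a spec of the form 'n' or 'n:ppn=m'."""
    count, _, rest = spec.partition(":")
    n = int(count)
    m = 1  # assumption: one core per node when ppn is not given
    for item in rest.split(":"):
        if item.startswith("ppn="):
            m = int(item[len("ppn="):])
    return n, n * m  # nodect = n, procct = m*n


def fits(spec, max_nodect=1, max_procct=12):
    # Numeric comparison against resources_max.nodect / resources_max.procct.
    nodect, procct = node_counts(spec)
    return nodect <= max_nodect and procct <= max_procct


print(node_counts("2:ppn=5"))  # (2, 10)
print(fits("1:ppn=8"))         # True
print(fits("2:ppn=5"))         # False: nodect is 2
```

With numeric limits, 1:ppn=8 and 1:ppn=08 are treated the same.
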

- Martin

> For jobs that are submitted with -l procs=x instead of -l nodes=...
> procct is set to x.
> 
> Cheers,
> Martin
> 
> -- 
> Martin Siegert
> WestGrid/ComputeCanada Site Lead
> IT Services
> Simon Fraser University
> Burnaby, British Columbia, Canada

