[torqueusers] Problem with filtering jobs of more than 10 CPUS/node
Garrick Staples
garrick at clusterresources.com
Fri Apr 13 11:20:56 MDT 2007
On Fri, Apr 13, 2007 at 02:39:05PM +0200, Justin Finnerty alleged:
> Hello
>
> I have just added a 16-CPU node to my queue system. Until now the cluster
> had only 4-CPU nodes, and I want to limit access to the 16-CPU node to
> jobs needing more than 4 CPUs.
>
> qmgr setup for "vx" queue for 16 CPU node:
>
> set queue vx queue_type = Execution
> set queue vx Priority = 3
> set queue vx from_route_only = True
> set queue vx resources_max.nodect = 1
> set queue vx resources_max.nodes = 1:ppn=16
> set queue vx resources_max.walltime = 192:00:00
> set queue vx resources_min.nodes = 1:ppn=5
> set queue vx enabled = True
> set queue vx started = True
>
> This works fine for -l nodes=1:ppn=Y with Y = 5-9 (for example)
>
> #PBS -l walltime=2:0:0
> #PBS -l nodes=1:ppn=5
> #PBS -q vx
>
> however with Y > 9 and < 50 (for example)
>
> #PBS -l walltime=2:0:0
> #PBS -l nodes=1:ppn=15
> #PBS -q vx
>
> we get the same error message as with Y < 5
>
> qsub: Job exceeds queue resource limits MSG=cannot satisfy queue min
> nodes requirement
>
> and with Y > 50 (for example) we get
>
> qsub: Job exceeds queue resource limits MSG=cannot locate feasible nodes
>
> It is as if only the first digit is being considered for the test to
> satisfy queue min requirements. But the real value is tested later.
> This sort of looks like a code bug but I am not sure.
>
> Noticed this on 2.1.1, and I see the same thing after downloading and
> testing torque version 2.1.8.
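
min/max for "nodes" is pretty useless because the value is a string, not a
number, so limits on it are not compared numerically. That would fit what
you are seeing with the min check: as strings, "1:ppn=15" sorts before
"1:ppn=5". min/max nodect works correctly.

I can't think of a way to effectively filter on min/max ppn.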
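
To illustrate the string-versus-number point, here is a small Python sketch. It is not TORQUE code and does not claim to reproduce how pbs_server actually evaluates limits; it only contrasts a plain character-by-character comparison of the "nodes" value with a numeric comparison of the ppn part. The helper names (ppn_of, passes_min_as_string, passes_min_as_number) are made up for this example.

```python
def ppn_of(spec):
    """Return the ppn value from a nodes spec such as '1:ppn=15' (1 if absent)."""
    for part in spec.split(":"):
        if part.startswith("ppn="):
            return int(part[len("ppn="):])
    return 1


def passes_min_as_string(request, queue_min):
    # Character-by-character comparison, the way two strings are ordered.
    return request >= queue_min


def passes_min_as_number(request, queue_min):
    # What one would intuitively expect: compare the ppn counts as integers.
    return ppn_of(request) >= ppn_of(queue_min)


queue_min = "1:ppn=5"  # resources_min.nodes from the vx queue above
for y in (4, 5, 9, 15, 50):
    request = "1:ppn=%d" % y
    print(request,
          "string:", passes_min_as_string(request, queue_min),
          "numeric:", passes_min_as_number(request, queue_min))
```

With a string ordering, ppn=15 falls below the ppn=5 minimum because '1' sorts before '5', while ppn=50 passes because "1:ppn=5" is a prefix of it. That matches the pattern reported for the min check, though it is only an analogy for the real server code.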