[torqueusers] Problem with filtering jobs of more than 10 CPUS/node

Garrick Staples garrick at clusterresources.com
Fri Apr 13 11:20:56 MDT 2007


On Fri, Apr 13, 2007 at 02:39:05PM +0200, Justin Finnerty alleged:
> Hello
> 
> I have just added a 16-CPU node to my queue system.  Until now the
> cluster had only 4-CPU nodes, and I want to limit access to the 16-CPU
> node to jobs needing more than 4 CPUs.
> 
> qmgr setup for the "vx" queue that serves the 16-CPU node:
> 
> set queue vx queue_type = Execution
> set queue vx Priority = 3
> set queue vx from_route_only = True
> set queue vx resources_max.nodect = 1
> set queue vx resources_max.nodes = 1:ppn=16
> set queue vx resources_max.walltime = 192:00:00
> set queue vx resources_min.nodes = 1:ppn=5
> set queue vx enabled = True
> set queue vx started = True
> 
> This works fine for -l nodes=1:ppn=Y with Y from 5 to 9, for example:
> 
> #PBS -l walltime=2:0:0
> #PBS -l nodes=1:ppn=5
> #PBS -q vx
> 
> However, with Y > 9 and Y < 50, for example:
> 
> #PBS -l walltime=2:0:0
> #PBS -l nodes=1:ppn=15
> #PBS -q vx
> 
> we get the same error message as with Y < 5:
> 
> qsub: Job exceeds queue resource limits MSG=cannot satisfy queue min
> nodes requirement
> 
> and with Y > 50 (for example) we get:
> 
> qsub: Job exceeds queue resource limits MSG=cannot locate feasible nodes
> 
> It is as if only the first digit is considered when testing whether
> the job satisfies the queue minimum, while the real value is tested
> later.  This looks like a code bug, but I am not sure.
> 
> I noticed this on 2.1.1, and I have just downloaded TORQUE version
> 2.1.8 and tested on that as well.

Min/max limits on "nodes" are pretty useless because the value is
treated as a string, not a number.  Min/max limits on nodect work
correctly.

I can't think of a way to effectively filter on min/max ppn.
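
To illustrate the string-versus-number point, here is a minimal sketch
in Python.  It is not TORQUE's code, and the function names are made up
for the example; it only assumes that the minimum check behaves like an
ordinary character-by-character string comparison.  Under that
assumption it reproduces the pattern reported above for the minimum
check: ppn=5 to 9 pass, ppn=10 to 49 are rejected, and ppn=50 and up
pass again.

```python
import re

# resources_min.nodes from the "vx" queue definition quoted above.
QUEUE_MIN = "1:ppn=5"


def passes_min_as_string(requested, minimum=QUEUE_MIN):
    """Compare the two specs character by character (assumed behaviour)."""
    return requested >= minimum


def ppn_of(spec):
    """Hypothetical helper: pull the ppn value out as an integer."""
    match = re.search(r"ppn=(\d+)", spec)
    return int(match.group(1)) if match else 1


def passes_min_as_number(requested, minimum=QUEUE_MIN):
    """What a numeric comparison of ppn would decide instead."""
    return ppn_of(requested) >= ppn_of(minimum)


for ppn in (4, 5, 9, 10, 15, 49, 50, 64):
    spec = "1:ppn=%d" % ppn
    print(spec, passes_min_as_string(spec), passes_min_as_number(spec))
```

The two columns disagree exactly for ppn=10 through 49, where the
string "1:ppn=1..." sorts before "1:ppn=5" even though the number is
larger.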



