[torqueusers] Weird queue behavior.

John Hanks griznog at engineering.usu.edu
Tue Apr 18 08:52:30 MDT 2006


I have two queues, parallel and dedicated. Parallel is supposed to catch
any job requesting up to nodes=32:ppn=4, and dedicated gets any job
larger than that. But I'm getting weird behavior like this:

# A nine node, 4 processor per node job works:
griznog@uinta ~ $ qsub -I -l nodes=9:ppn=4
qsub: waiting for job 10349.uinta.hpc.usu.edu to start

# A ten node, 4 ppn job doesn't.
griznog@uinta ~ $ qsub -I -l nodes=10:ppn=4
qsub: Job rejected by all possible destinations

# However, a 20 node 2 ppn job does
griznog@uinta ~ $ qsub -I -l nodes=20:ppn=2
qsub: waiting for job 10351.uinta.hpc.usu.edu to start

What am I doing wrong here? Jobs of more than ~36 CPUs are accepted,
but only if I don't request all four processors on each node.
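
For reference, the total CPU count of each of the three requests above
can be worked out with a short Python sketch. The helper name total_cpus
is hypothetical, and it only handles the simple N:ppn=M form used in
this post, not the full TORQUE nodes syntax.

```python
def total_cpus(nodes_spec: str) -> int:
    """Return nodes * ppn for a simple 'N:ppn=M' request (ppn defaults to 1)."""
    count, _, rest = nodes_spec.partition(":")
    ppn = 1
    for prop in rest.split(":"):
        if prop.startswith("ppn="):
            ppn = int(prop[len("ppn="):])
    return int(count) * ppn


# The three requests from the transcript above.
requests = ["9:ppn=4", "10:ppn=4", "20:ppn=2"]
totals = {spec: total_cpus(spec) for spec in requests}

for spec, cpus in totals.items():
    print(f"nodes={spec} -> {cpus} CPUs")
```

The rejected 10:ppn=4 request and the accepted 20:ppn=2 request both
come to 40 CPUs, so the total CPU count alone does not explain the
rejection.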

Queue configuration follows.



Qmgr: p q parallel
# Create queues and set their attributes.
# Create and define queue parallel
create queue parallel
set queue parallel queue_type = Execution
set queue parallel resources_max.nodect = 32
set queue parallel resources_max.nodes = 32:ppn=4
set queue parallel resources_max.walltime = 24:00:00
set queue parallel resources_min.nodect = 1
set queue parallel resources_min.nodes = 1:ppn=2
set queue parallel resources_default.nodes = 1:ppn=2
set queue parallel resources_default.walltime = 01:00:00
set queue parallel resources_available.nodect = 62
set queue parallel resources_available.nodes = 62:ppn=4
set queue parallel max_user_run = 8
set queue parallel enabled = True
set queue parallel started = True
Qmgr: p q dedicated
# Create queues and set their attributes.
# Create and define queue dedicated
create queue dedicated
set queue dedicated queue_type = Execution
set queue dedicated resources_max.nodect = 62
set queue dedicated resources_max.nodes = 62:ppn=4
set queue dedicated resources_max.walltime = 08:00:00
set queue dedicated resources_min.nodect = 33
set queue dedicated resources_min.nodes = 33:ppn=4
set queue dedicated resources_default.nodes = 33:ppn=4
set queue dedicated resources_default.walltime = 01:00:00
set queue dedicated resources_available.nodect = 62
set queue dedicated resources_available.nodes = 62
set queue dedicated enabled = True
set queue dedicated started = True
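
To make the intended split explicit, here is a minimal Python sketch of
the rule I am trying to express with the nodect limits above. It only
models the behavior I expect, using the resources_min.nodect and
resources_max.nodect values from the qmgr output. It is not a
description of how pbs_server actually evaluates these limits, and the
names QUEUE_NODECT_LIMITS and intended_queue are hypothetical.

```python
# (min, max) node counts copied from the qmgr output above.
QUEUE_NODECT_LIMITS = {
    "parallel": (1, 32),
    "dedicated": (33, 62),
}


def intended_queue(node_count):
    """Return the queue a job with this many nodes should land in, or None."""
    for name, (lowest, highest) in QUEUE_NODECT_LIMITS.items():
        if lowest <= node_count <= highest:
            return name
    return None


# Node counts of the three test submissions.
for nodes in (9, 10, 20):
    print(f"{nodes} nodes -> {intended_queue(nodes)}")
```

By this rule all three test jobs (9, 10 and 20 nodes) belong in
parallel, so I would not expect any of them to be rejected.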
