[torqueusers] Problem with ppn and routing

Ken Nielson knielson at adaptivecomputing.com
Mon Nov 29 19:15:19 MST 2010

----- Original Message -----
From: "J.A. Magallón" <jamagallon at ono.com>
To: torqueusers at supercluster.org
Sent: Monday, November 29, 2010 6:16:56 PM
Subject: [torqueusers] Problem with ppn and routing

Hi all...

(I'm new to the list, so hello to everyone...)

I have a test system with a front-end and 2 nodes just to play and learn with
torque/mpi. My setup is very standard:

# Create and define queue fast
create queue fast
set queue fast queue_type = execution
set queue fast priority = 80
set queue fast max_running = 10
set queue fast resources_min.walltime = 00:00:00
set queue fast resources_max.walltime = 01:00:00
set queue fast resources_max.nodes = 2:ppn=2
set queue fast resources_default.walltime = 01:00:00
set queue fast enabled = true
set queue fast started = true
# Create and define queue batch
create queue batch
set queue batch queue_type = execution
set queue batch priority = 20
set queue batch max_running = 10
set queue batch resources_min.walltime = 01:00:00
set queue batch resources_max.walltime = 48:00:00
set queue batch resources_max.nodes = 1:ppn=1
set queue batch resources_default.walltime = 48:00:00
set queue batch enabled = true
set queue batch started = true
# Create and define queue default
create queue default
set queue default queue_type = route
set queue default route_destinations = batch
set queue default route_destinations += fast
set queue default enabled = true
set queue default started = true

My idea is the typical 'long jobs can only use one processor'.
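
A rough Python sketch of that intended policy, for illustration only. It is
a model of the configuration above, not of how Torque itself evaluates
limits. Three things in it are assumptions: nodes and ppn are compared as
two separate numbers, a job that requests no walltime skips the walltime
check, and the names QUEUES, ROUTE_DESTINATIONS and route() are made up
for this example.

```python
# Hypothetical model of the intended routing policy; NOT Torque's implementation.

def to_seconds(walltime):
    """Convert an HH:MM:SS walltime string to seconds."""
    hours, minutes, seconds = (int(part) for part in walltime.split(":"))
    return hours * 3600 + minutes * 60 + seconds

# Limits copied from the qmgr configuration above.
QUEUES = {
    "batch": {"min_wall": "01:00:00", "max_wall": "48:00:00",
              "max_nodes": 1, "max_ppn": 1},
    "fast": {"min_wall": "00:00:00", "max_wall": "01:00:00",
             "max_nodes": 2, "max_ppn": 2},
}

# Same order as route_destinations on the 'default' queue.
ROUTE_DESTINATIONS = ["batch", "fast"]

def route(nodes, ppn, walltime=None):
    """Return the first destination whose limits the request fits, else None."""
    for name in ROUTE_DESTINATIONS:
        limits = QUEUES[name]
        # Assumption: nodes and ppn are each checked against their own maximum.
        if nodes > limits["max_nodes"] or ppn > limits["max_ppn"]:
            continue
        # Assumption: a job with no requested walltime skips this check.
        if walltime is not None:
            requested = to_seconds(walltime)
            low = to_seconds(limits["min_wall"])
            high = to_seconds(limits["max_wall"])
            if not low <= requested <= high:
                continue
        return name
    return None

print(route(2, 2))              # nodes=2:ppn=2, no walltime
print(route(2, 2, "10:00:00"))  # nodes=2:ppn=2, 10 hours
print(route(1, 2, "10:00:00"))  # nodes=1:ppn=2, 10 hours
```

Under this model a request for nodes=2:ppn=2 with no walltime lands in
'fast', while any 10-hour job asking for more than one node or more than
one processor per node is rejected by both destinations.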

With torque 2.4.18, a submission like qsub -l nodes=2:ppn=2 sends the job
to the fast queue, and I know it is limited to 1 hr walltime.
If I need more time:

annwn:~/dev/mpi/tst> qsub -l nodes=2:ppn=2,walltime=10:00:00 k
qsub: Job rejected by all possible destinations

Torque tells me I cannot use both boxes.
Problem 1: to fit into queue 'batch' I only have to lower nodes; I can
still leave ppn=2. Is it supposed to work that way? I thought it
would force me to lower ppn as well...

If I upgrade to 2.5.3...
Problem 2: even the simple job does not fit in any queue:

annwn:~/dev/mpi/tst> qsub -l nodes=2:ppn=2 k
qsub: Job rejected by all possible destinations

I expected it to behave like 2.4: put the job in the 'fast' queue and
limit it to 1 hr walltime. The same happens even if I ask for a short walltime:

annwn:~/dev/mpi/tst> qsub -l nodes=2:ppn=2,walltime=00:10:00 k
qsub: Job rejected by all possible destinations

Any ideas? What am I doing wrong?


J.A. Magallon <jamagallon()ono!com>     \               Software is like sex:
                                         \         It's better when it's free

What are the contents of your nodes file? What did you set np to for these nodes?

Ken Nielson
Adaptive Computing
