[torqueusers] Problems upgrading from 2.4 to 2.5

J.A. Magallón jamagallon at ono.com
Mon Nov 29 08:07:05 MST 2010

Hi all...

First of all, hi to everyone, I'm new to the list.
I have usually solved my problems with torque with some googling, but this
one is driving me nuts.

I have been using torque 2.4 for some time, and everything worked fine. But
now my distro has upgraded torque from 2.4.8 to 2.5.3, and I face a curious
problem.

I have reduced the problem to a simple test, with just a single node and
a simple, minimal queue:

Queue            Memory CPU Time Walltime Node  Run Que Lm  State
---------------- ------ -------- -------- ----  --- --- --  -----
std                --      --       --      --    0   0 10   E R

No limits at all. The box has a quad-core CPU.

With a simple job:

werewolf:~/dev/mpi/tst> cat k
#PBS -N x
#PBS -S /bin/bash
#PBS -j oe

echo "server:" $PBS_SERVER
echo "queue: " $PBS_QUEUE
echo "client:" $PBS_O_HOST
echo "cwd:   " $PBS_O_WORKDIR

echo "nodefile<"$PBS_NODEFILE">:"

sleep 30

With torque 2.4, I could do this:

werewolf:~/dev/mpi/tst> qsub -l nodes=1:ppn=2 k

(What I really do is run MPI with mpirun -pernode...)

But with torque 2.5, this does not work anymore:

werewolf:~/dev/mpi/tst> qsub -l nodes=1:ppn=2 k
qsub: Job exceeds queue resource limits MSG=cannot locate feasible nodes

Huh? What has changed? It looks like 2.5 ignores that the box has 4 cores...
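
As a sanity check on the core count, a minimal sketch like the following could
list how many processors (np) the server believes each node has. It assumes the
usual "pbsnodes -a" layout (node name starting in column 1, followed by indented
"key = value" attribute lines); the helper name node_np is made up for
illustration and is not part of torque.

```shell
# Hypothetical helper: read "pbsnodes -a" style text on stdin and
# print "<node> <np>" for every node that reports an np attribute.
node_np() {
    awk '
        /^[^ \t]/ { node = $1 }                      # unindented line: a node name
        $1 == "np" && $2 == "=" { print node, $3 }   # indented "np = N" attribute
    '
}

# Intended use on a live server (assumption, not run here):
#   pbsnodes -a | node_np
```

If the value printed for the node is lower than the ppn requested in qsub, that
would be consistent with the "cannot locate feasible nodes" message, though other
causes are possible.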

Any ideas? Has some behavior changed, is it a bug, or should it work
and perhaps it's a packaging/compiler issue?


PS: attached is the info for 2.4.8 and 2.5.3. It is mostly the same, except that
2.5 shows the node status in reverse order...

