[torqueusers] number of nodes problem - SOLVED
Gareth.Williams at csiro.au
Fri Oct 16 17:50:15 MDT 2009
Hi Kirill,
As far as I understand, nodes=8 implicitly means nodes=8:ppn=1, i.e. 1 core on each of 8 nodes rather than 8 whole nodes. Depending on the scheduler setup, this could result in 8 cores on one node, so what you saw was not unexpected.
You probably had a different scheduler configuration or version after the reinstall, which would account for the changed behavior.
If you want all of 8 nodes (assuming 8 cores per node) you should specify nodes=8:ppn=8.
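
A minimal sketch of such a job file, assuming 8 cores per node as above. The helper name summarize_nodefile is hypothetical and only for illustration. It relies on $PBS_NODEFILE listing one hostname per allocated processor slot, so with nodes=8:ppn=8 each host should appear 8 times:

```shell
#!/bin/sh
#PBS -j oe
#PBS -o test.log
#PBS -l nodes=8:ppn=8

# Hypothetical helper: print each host in a nodefile once,
# together with the number of slots allocated on it.
summarize_nodefile() {
    sort "$1" | uniq -c | awk '{ print $2 ": " $1 " slot(s)" }'
}

# $PBS_NODEFILE is only set inside a running job, so skip otherwise.
if [ -n "${PBS_NODEFILE:-}" ] && [ -r "$PBS_NODEFILE" ]; then
    summarize_nodefile "$PBS_NODEFILE"
fi
```

Printing one line per host also avoids the ambiguity of echo `cat $PBS_NODEFILE`, which joins all entries onto a single line.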
cheers,
Gareth
________________________________________
From: Kirill Belyaev [kira at cfd.spbstu.ru]
Sent: Thursday, 15 October 2009 6:06 PM
To: torqueusers at supercluster.org
Subject: [torqueusers] number of nodes problem - SOLVED
Hi,
I posted this problem a week ago, and I have since solved it.
The cause was that I had built RPM packages and installed torque from
them. When I installed from source instead (make, make install), things
started to work correctly.
Below is a description of my problem. Maybe it will help someone.
Best regards,
Kirill.
--------------------------------------------------------------------
I have installed torque 2.3.7 on OpenSUSE 11.1 x86 (I want to use it
with OpenMPI). I have one pbs server (comp24) and several nodes
(comp2-23), scheduling is pbs_sched.
All looks fine except one problem:
This is my job file:
--------------------------
#!/bin/sh
#PBS -j oe
#PBS -o test.log
#PBS -l nodes=8
sleep 10
echo `cat $PBS_NODEFILE`
--------------------------
The result of this job is:
----------------------------
comp2
----------------------------
qstat -f:
----------------------------
...
Resource_List.neednodes = 8
Resource_List.nodect = 8
Resource_List.nodes = 8
...
----------------------------
pbsnodes reports the state job-exclusive for only one node, 'comp2'.
All other nodes are free.
Why does it give me only one node when I requested 8?
What should I do to get all 8 nodes?