[torqueusers] job only runs on 1 cpu

Jan Dettmer jand at uvic.ca
Sun Jul 27 17:25:08 MDT 2008

Hi all,

I have a small cluster with 3 nodes, each node has 2 CPUs with 4 cores each.
I have been using the cluster for a few months now and it works mostly great
with pbs and open-mpi.

One problem I have been running into for a while is the following:

Starting a job with a script containing
#PBS -l nodes=1:ppn=8
works perfectly. The job starts on 1 node on all 8 cores.

#PBS -l nodes=2:ppn=8
will start the job. qstat -f tells me that it is running on 16 cores, but
checking with "top" shows that the job is only running on 1 core on 1
node (the node listed second in the nodes file). I could not find
any errors in the MOM logs.
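
In case it helps with diagnosis, below is a minimal sketch of a check that could go in the job script to show which hosts and how many slots Torque actually handed to the job. The function name summarize_nodefile and the program name ./my_mpi_program are placeholders for illustration, not the actual script; PBS_NODEFILE is the node file Torque sets inside a running job.

```shell
#!/bin/sh
#PBS -l nodes=2:ppn=8

# Print how many slots were assigned on each host.
# $1 is the path to a node file (one hostname per line, one line per slot).
summarize_nodefile() {
    sort "$1" | uniq -c | awk '{ print $2 ": " $1 " slots" }'
}

# PBS_NODEFILE is only set inside a running job, so do nothing outside one.
if [ -n "${PBS_NODEFILE:-}" ]; then
    summarize_nodefile "$PBS_NODEFILE"
    # Hypothetical launch line (assumption: the program name is a placeholder):
    # mpirun -np 16 -machinefile "$PBS_NODEFILE" ./my_mpi_program
fi
```

If the summary lists both nodes with 8 slots each while "top" still shows a single process, the allocation itself would seem fine and the launch step would be the place to look.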

Any help would be much appreciated.

Cheers, Jan

Jan Dettmer, Postdoctoral Fellow
School of Earth and Ocean Sciences, University of Victoria	
Victoria, BC V8W 3P6
office: (250) 472-4342	email: jand at uvic.ca
