[torqueusers] job only runs on 1 cpu

Jan Dettmer jand at uvic.ca
Sun Jul 27 17:25:08 MDT 2008


Hi all,

I have a small cluster with 3 nodes, each node has 2 CPUs with 4 cores each.
I have been using the cluster for a few months now and it works mostly great
with pbs and open-mpi.

One problem I have been running into for a while is the following:

Starting a job with a script containing
#PBS -l nodes=1:ppn=8
works perfectly. The job starts on 1 node on all 8 cores.

However
#PBS -l nodes=2:ppn=8
will start the job. qstat -f tells me that it is running on 16 cores, but
checking with "top" shows that the job is only running on 1 core on 1
node (the node listed second in the nodes file). I could not find
any errors in the MOM logs.
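
One way to narrow this down is to look at what the scheduler actually
handed to the job. The sketch below is not part of the original script; it
assumes a bash job script and that Torque sets $PBS_NODEFILE to a file with
one hostname per allocated core. The helper name summarize_nodefile is
hypothetical. For nodes=2:ppn=8 one would expect two hosts with 8 slots
each; if that is what is printed, the allocation is fine and the problem is
more likely in how the MPI launcher picks up the host list.

```shell
#!/bin/bash
#PBS -l nodes=2:ppn=8

# Hypothetical helper: print "hostname slot_count" for each host in a
# PBS-style node file (one hostname per line, repeated once per core).
summarize_nodefile() {
    sort "$1" | uniq -c | awk '{ printf "%s %d\n", $2, $1 }'
}

# Only report when running inside a job where the node file is available.
if [ -n "${PBS_NODEFILE:-}" ] && [ -r "$PBS_NODEFILE" ]; then
    echo "Allocated slots per host:"
    summarize_nodefile "$PBS_NODEFILE"
fi
```

The output ends up in the job's stdout file and can be compared with what
"top" shows on each node.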

Any help would be much appreciated.

Cheers, Jan

-- 
Jan Dettmer, Postdoctoral Fellow
School of Earth and Ocean Sciences, University of Victoria	
Victoria, BC V8W 3P6
office: (250) 472-4342	email: jand at uvic.ca
http://web.uvic.ca/~jand/

