[torqueusers] can't get more than one processor per node

Donald E Tripp dtripp at hawaii.edu
Thu Jul 26 15:47:11 MDT 2007


I think the problem is in your submit file. Here is an example file you can use.

---------------------------

#!/bin/bash
#PBS -u user_name
#PBS -l nodes=1:ppn=8
#PBS -o $PBS_JOBNAME.out
#PBS -e $PBS_JOBNAME.err

#How many procs do I have?
NP=$(wc -l $PBS_NODEFILE | awk '{print $1}')

#cd into the directory where I typed qsub
cd $PBS_O_WORKDIR

#run executable
mpiexec -np $NP executable

---------------------------

This way mpiexec gets the number of processes to start from the node file Torque creates for the job, which has one line per allocated processor. Your test script doesn't actually tell MPI how many procs to run, so it gets spawned once per node, no matter how many procs you request.
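
If you want to check what Torque actually handed your job, a rough sketch like the one below counts the slots the same way the submit file does. It assumes the node file holds one line per allocated processor. Outside of a job $PBS_NODEFILE is not set, so the sketch fakes one in a temp file with eight entries; that fake file is purely for illustration, and the host name is just the one from the pbsnodes output quoted below.

```shell
#!/bin/bash
# Sketch only: count processor slots from the Torque node file.
# Assumption: the node file has one line per allocated processor,
# so nodes=1:ppn=8 should give eight identical lines.

# Outside a job $PBS_NODEFILE is unset, so build a stand-in file.
# (Hypothetical stand-in; inside a real job this branch is skipped.)
if [ -z "${PBS_NODEFILE:-}" ]; then
    PBS_NODEFILE=$(mktemp)
    for i in 1 2 3 4 5 6 7 8; do
        echo "prodnode3.brooks.af.mil"
    done > "$PBS_NODEFILE"
fi

# Total slots = number of lines in the node file
NP=$(wc -l < "$PBS_NODEFILE")
# Normalize: some wc implementations pad the count with spaces
NP=$((NP))

# Slots per node: each distinct host with its line count
sort "$PBS_NODEFILE" | uniq -c

echo "NP=$NP"
```

With the stand-in file this reports NP=8. Inside a real job, if the per-node listing shows only one line for the host, the job was only given one processor and the problem is in the resource request rather than in the mpiexec line.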


- Donald Tripp
 dtripp at hawaii.edu
----------------------------------------------
HPC Systems Administrator
High Performance Computing Center
University of Hawai'i at Hilo
200 W. Kawili Street
Hilo,   Hawaii   96720
http://www.hpc.uhh.hawaii.edu

----- Original Message -----
From: "Adams, Samuel D Contr AFRL/HEDR" <Samuel.Adams at BROOKS.AF.MIL>
Date: Thursday, July 26, 2007 11:26 am
Subject: [torqueusers] can't get more than one processor per node
To: torqueusers at supercluster.org

> I have 8 cores per node.  I think I have torque configured to know 
> that each node has 8 cores.
> ## from pbsnodes ##
> prodnode3.brooks.af.mil
>     state = job-exclusive
>     np = 8
>     ntype = cluster
>     jobs = 0/52.prodnode1.brooks.af.mil, 1/52.prodnode1.brooks.af.mil, 2/52.prodnode1.brooks.af.mil, 3/52.prodnode1.brooks.af.mil, 4/52.prodnode1.brooks.af.mil, 5/52.prodnode1.brooks.af.mil, 6/52.prodnode1.brooks.af.mil, 7/52.prodnode1.brooks.af.mil
>     status = opsys=linux,uname=Linux prodnode3.brooks.af.mil 2.6.18-8.1.4.el5 #1 SMP Thu May 17 03:16:52 EDT 2007 x86_64,sessions=3059 3205 3237,nsessions=3,nusers=1,idletime=162,totmem=18472040kb,availmem=18211688kb,physmem=16440432kb,ncpus=8,loadave=0.67,netload=34049366,state=free,jobs=52.prodnode1.brooks.af.mil,rectime=1185484866
> 
> But, whenever I try to run a job on some nodes, it only runs on one
> processor for some reason.  I have this basic test script that 
> contains:
> [sam at prodnode3 all]$ cat script.sh
> #PBS -l nodes=1
> `mpirun /home/sam/code/fdtd/fdtd_0.3/fdtd -t
> /home/sam/code/fdtd/fdtd_0.3/test_files/tissue.txt -r
> /home/sam/code/fdtd/fdtd_0.3/test_files/sphere_brain_10_pad_x0110y0110z0
> 110.raw -v -f 2000 --pw 90,0,1,0`
> exit 0
> 
> I have tried running it with
> $ qsub script.sh
> $ qsub -l nodes=1:ppn=8 script
> 
> and I have tried changing the script to say mpirun -np 8 or mpiexec to
> no avail.
> 
> Does anyone know what I am doing wrong here?
> 
> Sam Adams
> General Dynamics Information Technology
> Phone: 210.536.5945
> 
> _______________________________________________
> torqueusers mailing list
> torqueusers at supercluster.org
> http://www.supercluster.org/mailman/listinfo/torqueusers
> 

