[torqueusers] Code seems to run slow when run with pbs script
James J Coyle
jjc at iastate.edu
Thu Nov 30 09:15:35 MST 2006
While the job is running, log in to the first node that qstat -n lists
for your job (if you can) and run top.
My guess is that you'll see all your MPI processes running on that one node.
Torque will assign you nodes to use, but you have to tell MPI
that you want to use those nodes.
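To see what Torque actually handed you, something like the following
could go near the top of the job script. It is only a sketch: the
function name nodes_assigned is made up here, and it assumes the usual
PBS_NODEFILE layout of one hostname per line, repeated once per
processor slot.

```shell
# Summarize a Torque nodefile as "hostname slots" lines.
# nodes_assigned is a hypothetical helper name, not a Torque command.
nodes_assigned() {
    # Assumes one hostname per line, repeated once per processor slot.
    sort "$1" | uniq -c | awk '{ print $2, $1 }'
}

# Inside a job Torque sets PBS_NODEFILE; outside a job this prints nothing.
if [ -n "${PBS_NODEFILE:-}" ] && [ -r "$PBS_NODEFILE" ]; then
    nodes_assigned "$PBS_NODEFILE"
fi
```

If the job output shows several hostnames here but top shows every MPI
process on one node, the node list is not reaching mpirun.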
My guess is that you have not included
on the mpirun command.
If not, then you are using the default created by PGI, which may very
well be just localhost. This means that when you run with -np 2 through 64,
the processes all just run on the single node, with no speedup.
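As a rough sketch of a launch step that does pass the Torque node list
along, a job script might contain something like the lines below. The
assumptions are mine: an MPICH-style mpirun that accepts -machinefile
and -np (check your own mpirun's man page), a placeholder program name
./my_mpi_program, and a made-up helper called slots_in.

```shell
# Fragment of a hypothetical PBS job script; ./my_mpi_program is a placeholder.

# Count processor slots: the nodefile has one line per slot.
# slots_in is an illustrative helper name, not part of Torque or MPI.
slots_in() {
    wc -l < "$1" | tr -d ' '
}

if [ -n "${PBS_NODEFILE:-}" ] && [ -r "$PBS_NODEFILE" ]; then
    # Assumption: an MPICH-style mpirun that accepts -machinefile and -np.
    mpirun -machinefile "$PBS_NODEFILE" \
           -np "$(slots_in "$PBS_NODEFILE")" ./my_mpi_program
else
    echo "PBS_NODEFILE is not set; not running under Torque" >&2
fi
```

Taking -np from the nodefile keeps the process count in step with
whatever the job requested, so the same script works for 2 or 64 slots.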
My guess is that all your cluster nodes are single-CPU, so you get 14 sec.
per iteration, and when this same thing happens on your dual-CPU host,
you see the 2x performance improvement due to the 2 processors.
Other (less likely) possibilities are that there are orphaned processes
from previous MPI runs or some other user is running on your nodes by
using a machinefile other than his/her own PBS_NODEFILE.
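To check for leftovers on a node, one option is to filter ps output for
the program name. This is a sketch only: stray_processes is a made-up
helper, my_mpi_program is a placeholder, and it assumes user names and
command names without spaces.

```shell
# Print "user pid elapsed" for processes whose command name matches $1.
# Reads "user pid etime comm" lines on stdin, as printed by:
#   ps -eo user,pid,etime,comm
# stray_processes is a hypothetical helper name.
stray_processes() {
    awk -v prog="$1" '$4 == prog { print $1, $2, $3 }'
}

# Example use on a compute node (my_mpi_program is a placeholder):
#   ps -eo user,pid,etime,comm | stray_processes my_mpi_program
```

Entries with an elapsed time longer than your current job, or owned by
another user, would point to orphans or to someone else on your nodes.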
James Coyle, PhD
SGI Origin, Alpha, Xeon and Opteron Cluster Manager
High Performance Computing Group
235 Durham Center
Iowa State Univ. phone: (515)-294-2099
Ames, Iowa 50011 web: http://jjc.public.iastate.edu
> I've got an issue that is odd.
> When we submit a job inside of a pbs script it seems to run pretty slow.
> For example, a single iteration of this code takes about 7 seconds on a dual
> CPU Opteron at 2.6 GHz.
> When running on a cluster with 2.4 GHz CPUs inside of a pbs script, a single
> iteration takes approximately 14 seconds. It doesn't matter how many nodes the
> job is run on. 2 - 64 gets the same times per iteration.
> But if we run manually with mpirun it takes about 1 tenth of a second.
> Any ideas?