[torqueusers] job only runs on 1 cpu
jand at uvic.ca
Tue Jul 29 17:37:25 MDT 2008
Thanks, everyone, for the help. It came down to installing the OSC mpiexec;
my jobs now behave as expected.
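With a TM-aware mpiexec (the OSC build, or Open MPI compiled --with-tm), the process count and host list come from Torque itself, so no -np flag or machinefile is needed. A minimal job-script sketch along those lines (the program name and walltime are placeholders, not from the original thread):

```shell
#!/bin/sh
#PBS -l nodes=2:ppn=8
#PBS -l walltime=00:10:00

cd "$PBS_O_WORKDIR"
# A TM-aware mpiexec queries Torque for the assigned slots,
# so it launches 16 ranks (2 nodes x 8 ppn) without -np.
mpiexec ./my_mpi_program   # ./my_mpi_program is a placeholder binary
```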
James A. Peltier wrote:
> On Mon, 28 Jul 2008, Jan Dettmer wrote:
>> Thanks for the tip.
>> I just recompiled with --with-tm.
>> Still the same problem.
>> #PBS -l nodes=1:ppn=8 will run fine (without -np option in mpiexec
>> command) on 8 cpus on one node.
>> #PBS -l nodes=2:ppn=8 will only start on one CPU on one node.
>> Cheers, Jan
> Another very simple test is to just output the contents of
> $PBS_NODEFILE to see what the PBS job thinks it has been assigned. If
> you are seeing 8 entries for each of the nodes, things should be
> working OK. If not, something isn't being passed correctly.
> I would also try running the application standalone using the version
> of Open MPI that you compiled with --with-tm, to make sure it works
> outside of Torque. Perhaps add some debugging to the PBS submission
> script, something like:
> #PBS -l nodes=2:ppn=8
> echo "Running commands on `hostname`"
> echo "Check MPI Information"
> echo "MPIEXEC = `which mpiexec`"
> echo "MPIRUN = `which mpirun`"
> echo "MPICC = `which mpicc`"
> echo "MPIC++ = `which mpic++`"
> echo "MPIF77 = `which mpif77`"
> echo "MPIF90 = `which mpif90`"
> echo "My job has been assigned these hosts"
> cat $PBS_NODEFILE
> This will also give you something to post, and should confirm that
> things are at least partly working.
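The nodefile check above is easy to reason about offline. The sketch below simulates the file Torque would write for `-l nodes=2:ppn=8` (the hostnames are made up; inside a real job, $PBS_NODEFILE is set by Torque) and counts the entries per host:

```shell
# Simulate the nodefile Torque would write for -l nodes=2:ppn=8
# (node01/node02 are hypothetical hostnames)
PBS_NODEFILE=$(mktemp)
for host in node01 node02; do
  for slot in 1 2 3 4 5 6 7 8; do
    echo "$host"
  done
done > "$PBS_NODEFILE"

# A correctly assigned job shows 8 entries per node:
sort "$PBS_NODEFILE" | uniq -c
rm -f "$PBS_NODEFILE"
```

If one of the two hosts is missing, or shows fewer than 8 entries, the allocation is not being passed to the job correctly.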
Jan Dettmer, Postdoctoral Fellow
School of Earth and Ocean Sciences, University of Victoria
Victoria BC Canada V8W 3P6
Tel (250) 472-4342; Fax (250) 472-4620
e-mail jand at uvic.ca http://web.uvic.ca/~jand/