[torqueusers] job only runs on 1 cpu

James A. Peltier jpeltier at cs.sfu.ca
Mon Jul 28 18:13:36 MDT 2008


On Mon, 28 Jul 2008, Jan Dettmer wrote:

> Thanks for the tip.
>
> I just recompiled with --with-tm.
>
> Still the same problem.
>
> #PBS -l nodes=1:ppn=8 will run fine (without -np option in mpiexec command) 
> on 8 cpus on one node.
>
> #PBS -l nodes=2:ppn=8 will only start on one CPU on one node.
>
> Cheers, Jan
>

Another very simple test is to print the contents of $PBS_NODEFILE to
see which hosts the PBS job thinks it has been assigned.  With
nodes=2:ppn=8, if you are seeing 8 entries for each of the two nodes (16
lines in total), the allocation itself should be OK.  If not, something
isn't being passed correctly.
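
If the node file is long, counting the entries per host is easier than
reading them by eye.  A rough sketch of that, using only standard tools;
the function name summarize_nodefile is just something I made up for
illustration:

```shell
#!/bin/sh
# Summarise the slots Torque assigned to this job, one line per host.
# Takes a node file path as an argument; defaults to $PBS_NODEFILE.
summarize_nodefile() {
    nodefile="${1:-${PBS_NODEFILE:-}}"
    # Each line of the node file is one assigned slot (CPU).
    echo "Total slots: $(wc -l < "$nodefile" | tr -d ' ')"
    echo "Slots per host:"
    # Identical host names are grouped and prefixed with their count.
    sort "$nodefile" | uniq -c
}

# Only run automatically when we are inside a PBS job.
if [ -n "${PBS_NODEFILE:-}" ]; then
    summarize_nodefile
fi
```

For nodes=2:ppn=8 you would hope to see a total of 16 and a count of 8
next to each of the two host names.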

I would also try running the application outside of PBS, using the
version of Open-MPI that you compiled with --with-tm, to make sure it
works standalone.  Perhaps add some debugging output to the PBS
submission script too.

Something like:

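#!/bin/sh
#PBS -l nodes=2:ppn=8

# Record which host the job script itself is running on
echo "Running commands on $(hostname)"

# Show which MPI tools are found first in PATH inside the job
echo "Check MPI Information"
echo "MPIEXEC = $(which mpiexec)"
echo "MPIRUN = $(which mpirun)"
echo "MPICC = $(which mpicc)"
echo "MPIC++ = $(which mpic++)"
echo "MPIF77 = $(which mpif77)"
echo "MPIF90 = $(which mpif90)"

# One line per assigned slot; expect 8 entries for each of the 2 nodes
echo "My job has been assigned these hosts"
cat "$PBS_NODEFILE"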

Posting that output here will also give the list something concrete to
look at when checking whether things are working, at least somewhat.
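
Since the rebuild with --with-tm is the part in question, it may also be
worth checking whether the Open-MPI that the job picks up actually has
the Torque (tm) components.  A minimal sketch, assuming ompi_info from
the same Open-MPI install is in your PATH and that it lists components
in its usual "MCA <framework>: <component>" form; the function name
list_tm_components is made up for illustration:

```shell
#!/bin/sh
# Print the tm (Torque) component lines found in ompi_info-style output
# read from stdin.  Returns non-zero when none are listed.
list_tm_components() {
    grep -E 'MCA +[a-z]+: +tm '
}

# Only query Open-MPI when ompi_info can actually be found.
if command -v ompi_info >/dev/null 2>&1; then
    echo "OMPI_INFO = $(command -v ompi_info)"
    ompi_info | list_tm_components || echo "No tm components listed"
fi
```

If nothing is listed, the mpiexec in use was probably not built against
Torque, or a different install is coming first in PATH.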

-- 
James A. Peltier
Systems Analyst (FASNet), VIVARIUM Technical Director
Simon Fraser University - Burnaby Campus
Phone   : 778-782-6573
Fax     : 778-782-3045
Mobile  : 778-840-6434
E-Mail  : jpeltier at sfu.ca
Website : http://www.fas.sfu.ca | http://vivarium.cs.sfu.ca
MSN     : subatomic_spam at hotmail.com

