[torqueusers] problem with jobs sharing cores

Fotis Georgatos fotis at cern.ch
Sat Feb 11 12:53:24 MST 2012


Hi Mike,

I had to debug a somewhat related problem last week;
in short, the MPI stack (Open MPI) was interfering with CPU affinity.

I was able to solve it in my case with the following line:
"mpiexec --report-bindings --cpus-per-rank 4 -np ..."
In your case, I recommend checking the equivalent FAQ for your MPI stack, e.g.:
http://www.open-mpi.org/faq/?category=tuning#using-paffinity-v1.4
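
Since your specifics below mention mvapich2 rather than Open MPI, the
equivalent knobs there should be the MV2_* affinity variables. I haven't
tried this on your setup, so treat the fragment below as a sketch and
double-check the names against the mvapich2 user guide (./wrf.exe is just
a placeholder for your binary):

    # Untested sketch for a mvapich2 job script: either switch the
    # built-in core binding off entirely ...
    export MV2_ENABLE_AFFINITY=0
    # ... or give each job an explicit, non-overlapping core list,
    # e.g. cores 0-3 for this particular job:
    # export MV2_CPU_MAPPING=0:1:2:3
    mpiexec -np 4 ./wrf.exe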

From time to time you may want to check that your scheduler is actually
placing jobs on nodes the way you imagine it does; this tool can help with that:
http://fotis.web.cern.ch/fotis/QTOP/
(the tarball works fine in userspace; an rpm & repo are available for sysadmins).
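
Independently of what the scheduler or the MPI stack reports, you can also
look at what the kernel itself thinks the bindings are. A rough check from
an SSH session on one of the overloaded nodes could look like the following
(again, wrf.exe is only a stand-in for whatever your executable is called):

    # List the CPU mask of every running WRF rank on this node;
    # overlapping "Cpus_allowed_list" values mean two ranks are
    # pinned to the same cores.
    for pid in $(pgrep wrf.exe); do
        echo -n "$pid: "
        grep Cpus_allowed_list /proc/$pid/status
    done
    # "taskset -cp <pid>" reports the same thing, if util-linux is installed.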

enjoy,
Fotis

On 10/02/2012 01:20, torqueusers-request at supercluster.org wrote:
> From: torqueusers-bounces at supercluster.org [mailto:torqueusers-bounces at supercluster.org] On Behalf Of Zulauf, Michael
> Sent: Thursday, February 09, 2012 12:30 PM
> To: torqueusers at supercluster.org
> Subject: [torqueusers] problem with jobs sharing cores
>
> Hi all. . .
>
> I apologize if this message appears more than once - there was an issue with my email address and list registration (which I hope is now fixed), and so I'm having to resend this. . .
>
> Anyway, where I work, we've had a problem for a while that we haven't been able to resolve.  I'm not certain of the cause - whether it's related to Torque, Maui, or something else.  But here goes. . .
>
> We've got a small cluster of 16 nodes, each with dual hex-core processors.  12 cores per node, 192 cores total.  The problem is that if I launch small jobs, where multiple jobs should be able to share a node without sharing cores, I instead get cores that are running more than one process, while other cores are idle.  The primary executable is WRF (weather prediction model), but the problem occurs for other parallel codes.  The codes have been built to utilize MPI (not OpenMP, or MPI/OpenMP).
>
> As an example, if I launch a series of jobs which request 4 cores each, I get 3 jobs assigned to each node.  That should be fine, as each node has 12 cores, and there should be no need to share cores.  Instead, I get 4 "overloaded" cores (each running 3 processes) and 8 idle cores.  Obviously not an ideal situation.  If I submit only a single small job, in which case it's alone on a node, then it runs great.  Similarly, if I launch a large job which spans more than one node, it also works well - as long as it's not sharing nodes with other jobs.  The problem only occurs (and always occurs) when parallel jobs share a node.  BTW, the qsub command does not explicitly request specific cores, or anything like that.
>
> I'm not the administrator - just the primary user.  The administrator (who was not previously familiar with Torque/Maui) has been struggling with this for a bit, and is rather busy with other duties, so I thought I'd check in here to see if anybody had suggestions I could pass along.
>
> Here are some specifics, as far as I know them:
>        HP blade hardware
>        dual Intel Xeon X5670 processors
>        Infiniband interconnect (not an issue in this case?)
>        the CentOS equivalent of Red Hat 4.1.2-48 (not sure of what that is exactly)
>        Torque 3.0.2
>        mvapich2-1.7rc1
>        PGI 7.2-5 compilers
>        WRF 3.3.1
>
> Any thoughts?  I've probably left out relevant information.  If so, please ask for clarification.
>
> Thanks,
> Mike
>
> --
> Mike Zulauf
> Meteorologist, Lead Senior
> Asset Optimization
> Iberdrola Renewables
> 1125 NW Couch, Suite 700
> Portland, OR 97209
> Office: 503-478-6304  Cell: 503-913-0403

-- 
echo "sysadmin know better bash than english" | sed s/min/mins/ \
	| sed 's/better bash/bash better/' # Yelling in a CERN forum

