[torqueusers] Bugs/glitches in torque 2.3.0?

Chris Samuel csamuel at vpac.org
Wed Apr 9 07:12:10 MDT 2008


----- "Prakash Velayutham" <prakash.velayutham at cchmc.org> wrote:

> Could you explain how these 2 fit together (other than the obvious)
> and why it won't work in this case? I am using Open MPI with Torque
> (2.2.1), but was planning on enabling cpusets with 2.3.0 as an
> upgrade in the near future.

The problem is that Open MPI, rather than connecting to the
pbs_mom on each node once per MPI process to be launched there
(which is what pbsdsh and the OSC mpiexec do), connects just
once per node to start orted, which then spawns the required
number of MPI processes via fork().

The current cpusets model allocates one core per TM-spawned
process, so all the Open MPI processes on a node end up
fighting over the one core that was allocated to orted. :-(
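
To make the arithmetic concrete, here is a toy model of the two
launch styles. It is only a sketch, not TORQUE or Open MPI code:
the function names are made up for illustration, and it assumes
exactly one core is added to the cpuset per TM spawn, as
described above.

```python
# Toy model only: not TORQUE or Open MPI code. All names are hypothetical.

def cores_in_cpuset(tm_spawns, cores_per_spawn=1):
    """Cores granted on a node, assuming one core per TM-spawned process."""
    return tm_spawns * cores_per_spawn


def per_task_launch(mpi_procs):
    """pbsdsh / OSC mpiexec style: one TM spawn per MPI process."""
    return {"procs": mpi_procs, "cores": cores_in_cpuset(tm_spawns=mpi_procs)}


def per_node_launch(mpi_procs):
    """Open MPI style: a single TM spawn (orted), which forks the MPI processes."""
    return {"procs": mpi_procs, "cores": cores_in_cpuset(tm_spawns=1)}


if __name__ == "__main__":
    for name, launch in (("per-task", per_task_launch),
                         ("per-node", per_node_launch)):
        result = launch(4)
        print(f"{name}: {result['procs']} processes share "
              f"{result['cores']} core(s)")
```

With four processes on a node, the per-task style ends up with
four cores in the cpuset, while the per-node style leaves all
four processes sharing a single core.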

It would be nice if there were a way to configure Open MPI
to connect once per task instead of once per node.

cheers!
Chris
-- 
Christopher Samuel - (03) 9925 4751 - Systems Manager
 The Victorian Partnership for Advanced Computing
 P.O. Box 201, Carlton South, VIC 3053, Australia
VPAC is a not-for-profit Registered Research Agency