[torqueusers] how can use infiniband nodes in torque

Joshua Bernstein jbernstein at penguincomputing.com
Wed Sep 10 12:27:06 MDT 2008


Hi Yang,

If you are using MVAPICH (and not MPICH), InfiniBand should give 
your application a real performance advantage. Since you aren't 
seeing any change in performance, one of two things is likely going on:

1) You think you are running over IB when actually you are not. Make 
sure that your PATH and LD_LIBRARY_PATH are set correctly to point 
to the MVAPICH binaries and libraries rather than, say, the MPICH ones. 
You may even try running some simple programs outside of TORQUE to 
verify that IB and MVAPICH are actually working. I suggest the 
osu_latency.c test. When the test runs over Ethernet, latencies 
should be in the tens of microseconds, likely something close to 
50 to 60. When the test runs over IB, the times should be in the 
single digits, something like 5 to 6 microseconds.

2) If the runtimes are close but not actually identical, there is a 
chance your code simply doesn't take advantage of IB and does 
not do very much inter-node communication. Such codes are usually bound 
by disk I/O or memory access rather than by communication latency. 
Generally, CFD-style codes tend to scale well and will perform better 
over IB, while FEA or other types of structural codes may not scale as 
well and may be bound by disk or memory I/O rather than by the network. 
Those are generalizations, though.
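
For the check in point 1, here is a minimal sketch of how you might see 
which MPI your environment actually picks up. The launcher names it 
looks for and the idea of guessing the MPI flavour from directory names 
are my assumptions, not anything TORQUE or MVAPICH provides, so adjust 
them to match your install paths.

```python
import os
import shutil


def first_launcher(names):
    """Return (name, resolved_path) of the first launcher found on PATH, or None."""
    for name in names:
        path = shutil.which(name)
        if path:
            return name, os.path.realpath(path)
    return None


def library_dirs(env=None):
    """Return LD_LIBRARY_PATH entries in search order, skipping empty ones."""
    env = os.environ if env is None else env
    return [d for d in env.get("LD_LIBRARY_PATH", "").split(os.pathsep) if d]


def guess_mpi(path):
    """Guess the MPI flavour from a path. This is a naming heuristic only."""
    lowered = path.lower()
    for flavour in ("mvapich", "openmpi", "mpich"):
        if flavour in lowered:
            return flavour
    return "unknown"


if __name__ == "__main__":
    # Launcher names are assumptions; your site may use different ones.
    found = first_launcher(["mpirun_rsh", "mpirun", "mpiexec"])
    if found is None:
        print("No MPI launcher found on PATH")
    else:
        name, path = found
        print(f"launcher: {name} -> {path} (looks like {guess_mpi(path)})")

    # Earlier entries win, so an MPICH directory listed first shadows MVAPICH.
    for directory in library_dirs():
        print(f"lib dir:  {directory} (looks like {guess_mpi(directory)})")
```

Running it once from a login shell and once from inside a TORQUE job 
script should show whether the two environments differ.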

Also, while it's true that OpenMPI generally performs better than 
MVAPICH, before going through the hassle of switching you should first 
check whether you are actually running over IB.
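
To make that check less of a judgment call, here is a rough sketch that 
reads saved osu_latency output and compares the small-message latency 
against the ballpark figures from point 1. It assumes the output is 
'#' header lines followed by "size latency" rows, and the 10 microsecond 
cutoff is just my rule of thumb from the numbers above, not a hard 
limit. The sample text is made up for illustration, not a measurement.

```python
def parse_osu_latency(text):
    """Parse 'size latency' rows; lines starting with '#' are treated as headers."""
    rows = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 2 or line.lstrip().startswith("#"):
            continue
        try:
            rows.append((int(fields[0]), float(fields[1])))
        except ValueError:
            continue  # skip anything that is not a numeric row
    return rows


def small_message_latency(rows):
    """Return the latency (microseconds) of the smallest message size, or None."""
    if not rows:
        return None
    return min(rows)[1]


def interpret(latency_us, threshold_us=10.0):
    """Compare a latency against an assumed rule-of-thumb cutoff."""
    if latency_us < threshold_us:
        return "consistent with InfiniBand"
    return "consistent with Ethernet"


# Illustrative sample only; the numbers are invented, not measured.
SAMPLE = """# OSU MPI Latency Test
# Size          Latency (us)
0                       5.21
1                       5.30
2                       5.33
"""

if __name__ == "__main__":
    latency = small_message_latency(parse_osu_latency(SAMPLE))
    if latency is None:
        print("no latency rows found")
    else:
        print(f"{latency} us: {interpret(latency)}")
```

A result near the cutoff doesn't prove much either way, so treat it as 
a hint and not as a verdict.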

Hope that helps.

-Joshua Bernstein
Software Engineer
Penguin Computing


zhyang at lzu.edu.cn wrote:
> I have a cluster that includes 14 InfiniBand nodes. I want to know whether InfiniBand is actually faster than 1000M Ethernet. I installed MVAPICH, but when I submit jobs through TORQUE (using MPICH2 and MVAPICH), I found no difference in the job running time. I read the MPICH2 documentation, which says it supports InfiniBand. Has anybody used InfiniBand nodes? How can I use the InfiniBand nodes? Do I need to recompile TORQUE, MPICH2, or MVAPICH (adding some parameters)?
> 
> thanks
> 
> 
> yang
> _______________________________________________
> torqueusers mailing list
> torqueusers at supercluster.org
> http://www.supercluster.org/mailman/listinfo/torqueusers

