[torqueusers] [OFFTOPIC] List of discussion or documentation on infiniband

Greenseid, Joseph M (IS) Joseph.Greenseid at ngc.com
Wed Jul 8 07:48:28 MDT 2009


While giving Jason's response a "yeah, I think that, too," I would also suggest checking whether you got your IB stack from your vendor.  Some vendors distribute a specialized software/driver set that they tune specifically for their gear.  In my experience it's usually based on the OFED stack from the OpenFabrics Alliance, but if they've made changes, then you could (and probably should) hit them up for support.
 
--Joe

________________________________

From: torqueusers-bounces at supercluster.org on behalf of Jason Williams
Sent: Wed 7/8/2009 9:43 AM
To: ChrisJob.fr at gmail.com
Cc: torqueusers at supercluster.org
Subject: Re: [torqueusers] [OFFTOPIC] List of discussion or documentation on infiniband



Hey Chris,

One of the major players out there in the InfiniBand world is the
OpenFabrics Alliance (http://www.openfabrics.org/).  There should be some
docs and mailing lists on the site that you could check out.

Also, you might want to figure out what MPI libraries you are using and
check the website for them.

One last suggestion is to find out who your IB Card and Switch provider
is and maybe get them in on a service call.

To me, it sounds like you are having a problem with your IB Fabric
Subnet Manager.  I know some switches out there have this sort of
problem, but I don't want to get too deep into it because this is
technically off topic for this list.

--
Jason Williams


ChrisJob.fr wrote:
>    Hi
>
>    We have an InfiniBand HPC cluster. Sometimes we have problems with MPI
> programs and we must restart InfiniBand. After that, everything is OK for
> 2 weeks.
>    Do you know where I can find a discussion list about InfiniBand? Or
> documentation on the subject?
>
> Thank you for your help
> Chris
> _______________________________________________
> torqueusers mailing list
> torqueusers at supercluster.org
> http://www.supercluster.org/mailman/listinfo/torqueusers

