[torqueusers] IB epilogue/prolog, and any other concerns

Brock Palen brockp at umich.edu
Fri May 30 08:26:48 MDT 2008


Raising the lockable memory limit is trivial.  Just set it in the
torque init script so pbs_mom inherits the limit.  For RDMA to work
well you need pinned memory (from what I understand from talking to
Jeff from Cisco), so you are going to want this anyway.
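
A minimal sketch of how one might verify, from inside a job or a prologue, that the locked-memory limit inherited from pbs_mom is large enough. It uses Python's standard resource module. The helper names and the 1 GiB threshold are illustrative assumptions, not anything from Torque or OFED.

```python
import resource

# Hypothetical threshold: pick whatever your IB stack actually needs.
REQUIRED_BYTES = 1 << 30  # 1 GiB, an assumption for illustration only


def describe_memlock(limit):
    """Render an RLIMIT_MEMLOCK value (in bytes) as readable text."""
    if limit == resource.RLIM_INFINITY:
        return "unlimited"
    return f"{limit // 1024} kB"


def memlock_sufficient(soft_limit, required_bytes):
    """Return True if the soft locked-memory limit covers required_bytes."""
    # RLIM_INFINITY may be a negative sentinel, so test it explicitly
    # before doing any numeric comparison.
    if soft_limit == resource.RLIM_INFINITY:
        return True
    return soft_limit >= required_bytes


def check_current_process(required_bytes=REQUIRED_BYTES):
    """Check the limit of this process, which it inherited from its parent."""
    soft, hard = resource.getrlimit(resource.RLIMIT_MEMLOCK)
    ok = memlock_sufficient(soft, required_bytes)
    status = "OK" if ok else "TOO LOW"
    print(f"memlock soft={describe_memlock(soft)} "
          f"hard={describe_memlock(hard)} -> {status}")
    return ok
```

Calling check_current_process() from a test job shows the limit that processes started by pbs_mom actually see, which is the value that matters for pinned memory.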

I could ask Jeff what Cisco's official plan is, but even if they
don't recommend OMPI right now (they'd be crazy if they didn't), it will
be their solution in the future (I know it's Sun's right now for sure).
So if you want Cisco's supported solution, I would just go with OMPI
right now, rather than have to change everything and move everyone
over later.

Good luck,  Listening to Dave Talk.

Brock Palen
www.umich.edu/~brockp
Center for Advanced Computing
brockp at umich.edu
(734)936-1985



On May 30, 2008, at 3:39 AM, Walid wrote:

> 2008/5/29 Brock Palen <brockp at umich.edu>:
> Strange, I thought CISCO was now pushing OpenMPI,  they only employ  
> one of the main devs to make sure it works on their TopSpin IB.
>
> CISCO is big, and most likely you will get different views from the
> inside. However, is there an official statement from CISCO, or a known
> roadmap, that shows they are favouring OpenMPI for its
> technical merits?
>
>
>
> We run Cisco IB with OFED for the VERBS and OpenMPI.  Works well,
> with few if any issues.  Just be sure your pbs_mom starts with enough
> locked memory.
>
> Thanks, we are aware of the issue. However, we do not like the
> workaround of having it set in both limits.conf and the init script
> of pbs_mom. Is there a better, standard way to ease maintenance? The
> turnover of sysadmins where I work is a bit high, and not
> following standard operating system/distro procedures means going
> back to documentation that could be missing/outdated :)
>
>
>
>
> If you do choose to go with the Cisco VAPI verbs, note they are
> planned to be deprecated.  Everything is going OFED.
> They all work well. I use OpenMPI because it integrates with TM on
> torque and can support Myrinet as well as IB and TCP all in one lib
> with no recompiles needed.  Many options are tweakable at runtime,
> which I love myself.  All of those will work though.
>
> Brock Palen
> www.umich.edu/~brockp
> Center for Advanced Computing
> brockp at umich.edu
> (734)936-1985
>
>
>
> On May 29, 2008, at 10:46 AM, Walid wrote:
>> Hi,
>>
>> I do not have a problem yet, however in the next couple of days I
>> might :)
>> We are deploying a large IB cluster, and with IB we have a choice
>> of openMPI, mvapich, mvapich2, among other commercial MPI
>> implementations. Our code used to run on mpich-gm over Myrinet,
>> and CISCO's advice is to go with mvapich. Our developers would like
>> to have mvapich2, and quick research on the net shows that openMPI
>> is easier to deploy and has fewer issues.
>>
>> My question is: if you have used any of the above and have any
>> hints, gotchas, or advice to be aware of, could you please pass them
>> on, or any useful URLs?
>> Where is the best place to set the locked-memory limits so that
>> pbs_mom honours them?
>> Can you please share any epilogue/prolog that is needed for this
>> kind of environment?
>>
>> regards
>>
>> Walid.
>>
>> _______________________________________________
>> torqueusers mailing list
>> torqueusers at supercluster.org
>> http://www.supercluster.org/mailman/listinfo/torqueusers
>
>
