[torqueusers] Fwd: [O-MPI users] OpenMPI 1.0.1 with Torque 2.0

Garrick Staples garrick at usc.edu
Tue Jan 3 13:35:13 MST 2006


On Tue, Jan 03, 2006 at 03:01:31PM -0500, Jeff Squyres alleged:
> A user recently mailed us about a problem compiling Open MPI with
> Torque support (see
> http://www.open-mpi.org/community/lists/users/2006/01/0456.php
> for details).
> 
> The problem is actually a build system issue.  Open MPI defaults to  
> building all of its plugins (including the TM plugins) as dynamic  
> shared objects (DSOs).  Some of these DSOs need to link against other  
> libraries -- e.g., the TM plugin needs to -lpbs.
> 
> The problem is that libpbs is a static library.  Compiling a DSO  
> against a .a file is non-portable at best.  However, in Linux/32 bit  
> environments, this seems to generally work (GNU Libtool does the  
> Right magic, but it warns against the portability issues).  In an  
> AMD64 bit environment, however, building in 64 bit mode, this doesn't  
> work, and the result is the error message shown below.
> 
> We would recompile everything with -fPIC, but don't really want to 1)  
> for the special exceptions that this would introduce into our build  
> system, and 2) for the performance penalty of compiling everything  
> with -fPIC.
> 
> Is there any way that Torque can produce shared libraries?  This  
> would solve our problem nicely.  :-)

I would LOVE to build shared libraries.  I'm sick of rebuilding maui and
perl-PBS every time I make a tiny change in any of the client libs.

TORQUE's autotools setup is pretty messed up right now so I don't want
to make any deep changes.  But I plan to rewrite configure.in in the
near future.

TORQUE builds 6 static archives (7 if you count pbs_sched.a), 5 of which
have impossibly generic names.  I figure we could either make
"libtorque-foo.so" libnames, or stuff everything into 1 "libtorque.so".

We need to be very careful about this because 1) TORQUE runs on a lot of
platforms, 2) we do make regular changes to the client libs, and 3) more
and more 3rd party stuff (like open-mpi) builds against TORQUE.

Do you know if other PBS implementations (I guess that means PBS Pro)
have built a shared lib?  If so, what libname(s) were used?  Maintaining
binary compatibility would be nice, but I doubt that will happen.
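
As an aside for anyone who hits the same link failure: the ld diagnostic
in the forwarded message names the static archive and the member that
was not compiled as position-independent code.  Below is a minimal
Python sketch that pulls those pairs out of a raw build log (not a
mail-quoted copy).  The function name and the regular expression are
illustrative assumptions based on the single GNU ld message shown in
that log; they are not part of TORQUE or Open MPI, and other linkers or
ld versions may word the message differently.

```python
import re

# Assumed message shape, taken from the one sample in the forwarded log:
#   <archive>.a(<member>): relocation <TYPE> against ... recompile with -fPIC
NON_PIC = re.compile(
    r"(?P<archive>\S+\.a)\((?P<member>[^)]+)\):\s+relocation\s+(?P<reloc>\S+)"
    r".*?recompile with -fPIC"
)


def find_non_pic_members(log_text):
    """Return (archive, member, relocation) for each non-PIC complaint."""
    # Collapse whitespace runs so a message wrapped across lines still matches.
    flat = " ".join(log_text.split())
    return [
        (m.group("archive"), m.group("member"), m.group("reloc"))
        for m in NON_PIC.finditer(flat)
    ]
```

Fed the ld output from the forwarded message, this should report
/opt/torque/lib/libpbs.a, member tm.o, relocation R_X86_64_32S, which is
the object that would have to be rebuilt with -fPIC (or supplied by a
shared libpbs) before the TM plugin can link as a DSO.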


> Begin forwarded message:
> 
> >From: Jyh-Shyong Ho <c00jsh00 at nchc.org.tw>
> >Date: January 2, 2006 3:52:20 AM EST
> >To: users at open-mpi.org
> >Subject: [O-MPI users] OpenMPI 1.0.1 with Torque 2.0
> >Reply-To: Open MPI Users <users at open-mpi.org>
> >
> >Hi,
> >
> >I am trying to install OpenMPI 1.0.1 on my Athlon X2 computer running
> >SuSE10.0, the installation failed when I included --with-tm=/opt/torque
> >option with the error message:
> >...
> >gcc -shared  .libs/pls_tm_component.o .libs/pls_tm_module.o  -Wl,--rpath -Wl,/home/c00jsh00/openmpi-1.0.1/orte/.libs -Wl,--rpath -Wl,/home/c00jsh00/openmpi-1.0.1/opal/.libs -Wl,--rpath -Wl,/opt/openmpi/lib -L/opt/torque/lib -lpbs /home/c00jsh00/openmpi-1.0.1/orte/.libs/liborte.so -L/home/c00jsh00/openmpi-1.0.1/opal/.libs /home/c00jsh00/openmpi-1.0.1/opal/.libs/libopal.so -lm -lutil -lnsl  -pthread -Wl,-soname -Wl,mca_pls_tm.so -o .libs/mca_pls_tm.so
> >/usr/lib64/gcc/x86_64-suse-linux/4.0.2/../../../../x86_64-suse-linux/bin/ld: /opt/torque/lib/libpbs.a(tm.o): relocation R_X86_64_32S against `a local symbol' can not be used when making a shared object; recompile with -fPIC
> >/opt/torque/lib/libpbs.a: could not read symbols: Bad value
> >collect2: ld returned 1 exit status
> >make[4]: *** [mca_pls_tm.la] Error 1
> >make[4]: Leaving directory `/home/c00jsh00/openmpi-1.0.1/orte/mca/pls/tm'
> >make[3]: *** [all-recursive] Error 1
> >make[3]: Leaving directory `/home/c00jsh00/openmpi-1.0.1/orte/dynamic-mca/pls'
> >make[2]: *** [all-recursive] Error 1
> >make[2]: Leaving directory `/home/c00jsh00/openmpi-1.0.1/orte/dynamic-mca'
> >make[1]: *** [all-recursive] Error 1
> >make[1]: Leaving directory `/home/c00jsh00/openmpi-1.0.1/orte'
> >make: *** [all-recursive] Error 1
> >
> >My TORQUE is 2.0.0p4, the latest version. Any hint?
> >
> >Jyh-Shyong Ho, Ph.D.
> >Research Scientist
> >National Center for High Performance Computing
> >Hsinchu, Taiwan, ROC
> >_______________________________________________
> >users mailing list
> >users at open-mpi.org
> >http://www.open-mpi.org/mailman/listinfo.cgi/users
> 
> 
> --
> {+} Jeff Squyres
> {+} The Open MPI Project
> {+} http://www.open-mpi.org/
> 
> 
> _______________________________________________
> torqueusers mailing list
> torqueusers at supercluster.org
> http://www.supercluster.org/mailman/listinfo/torqueusers

-- 
Garrick Staples, Linux/HPCC Administrator
University of Southern California