Bug 70 - libcap-devel and pam-devel apparently required to build TORQUE 2.5 beta
Status: RESOLVED FIXED
Product: TORQUE
Component: pbs_mom
Version: 2.5.0_beta
Platform: PC Linux
Importance: P5 normal
Assigned To: Garrick Staples
Reported: 2010-07-08 08:27 MDT by Troy Baer
Modified: 2010-11-22 11:35 MST (History)


Description Troy Baer 2010-07-08 08:27:38 MDT
I tried the TORQUE 2.5-beta_0.20100702 release briefly on an SGI UV
early-access system running SLES 11 and found that installing the SLES
libcap-devel and pam-devel RPMs was required to build this version of TORQUE,
which was not the case with earlier versions.  Particularly vexing is
pam-devel, as I had not enabled PAM support at configure time and yet pam-devel
was required to link pbs_mom.

Configure line:  ./configure --prefix=/opt/torque/2.5-beta_0.20100702
--disable-gcc-warnings --with-server-home=/var/spool/torque
--with-default-server=nautilus.nics.utk.edu --enable-cpuset --enable-docs
--disable-shell-pipe --enable-shell-use-argv --enable-acct-x --disable-blcr
--disable-cpa --disable-csa --with-sched=no --enable-server --enable-mom
--enable-clients --enable-drmaa --x-libraries=/usr/X11R6/lib64
Comment 1 Garrick Staples 2010-07-08 09:12:09 MDT
Can you send the output? I don't know why either library would be required to
link pbs_mom.
Comment 2 Garrick Staples 2010-07-08 09:25:41 MDT
The default is to build the pam module. Can you pass --disable-pam?
Comment 3 Garrick Staples 2010-07-08 09:30:00 MDT
Scratch that last comment. The default hasn't changed.
Comment 4 David Beer 2010-07-08 10:04:31 MDT
It may be worth noting that 2.5 isn't intended to run on an SGI system. The
NUMA branch is, though, and would be more worth your time.

David
Comment 5 Troy Baer 2010-07-08 14:00:18 MDT
(In reply to comment #1)
> Can you send the output? I don't know why either library would be required to
> link pbs_mom.

I may have mistyped earlier -- it's libcap-devel that now seems to be required,
*not* libpcap-devel.

rpm -e libcap-devel

cd /opt/src/batch/torque-2.5-beta_0.20100702

make distclean

./configure --prefix=/opt/torque/2.5-beta_0.20100702 --disable-gcc-warnings
--with-server-home=/var/spool/torque
--with-default-server=nautilus.nics.utk.edu --enable-cpuset --enable-docs
--disable-shell-pipe --enable-shell-use-argv --enable-acct-x --disable-blcr
--disable-cpa --disable-csa --with-sched=no --enable-server --enable-mom
--enable-clients --enable-drmaa --x-libraries=/usr/X11R6/lib64

make
[...]
/bin/sh ../../libtool --tag=CC --mode=link gcc  -g -O2 -D_LARGEFILE64_SOURCE
-DUSEJOBCREATE   -o pbs_mom  catch_child.o mom_comm.o mom_inter.o mom_main.o
mom_server.o prolog.o requests.o start_exec.o checkpoint.o tmsock_recov.o
req_quejob.o job_func.o attr_recov.o dis_read.o job_attr_def.o job_recov.o
process_request.o reply_send.o resc_def_all.o job_qs_upgrade.o 
linux/libmommach.a -L/opt/cray/job/default/lib64 -ljob  -lutil
../lib/Libattr/libattr.a ../lib/Libsite/libsite.a ../lib/Libutils/libutils.a
../lib/Libpbs/libtorque.la 
mkdir .libs
gcc -g -O2 -D_LARGEFILE64_SOURCE -DUSEJOBCREATE -o .libs/pbs_mom catch_child.o
mom_comm.o mom_inter.o mom_main.o mom_server.o prolog.o requests.o start_exec.o
checkpoint.o tmsock_recov.o req_quejob.o job_func.o attr_recov.o dis_read.o
job_attr_def.o job_recov.o process_request.o reply_send.o resc_def_all.o
job_qs_upgrade.o  linux/libmommach.a -L/opt/cray/job/default/lib64
/usr/lib64/libjob.so -lcap -lpam -lutil ../lib/Libattr/libattr.a
../lib/Libsite/libsite.a ../lib/Libutils/libutils.a
../lib/Libpbs/.libs/libtorque.so  -Wl,--rpath
-Wl,/opt/torque/2.5-beta_0.20100702/lib
/usr/lib64/gcc/x86_64-suse-linux/4.3/../../../../x86_64-suse-linux/bin/ld:
cannot find -lcap
collect2: ld returned 1 exit status
make[3]: *** [pbs_mom] Error 1
make[3]: Leaving directory
`/opt/src/batch/torque-2.5-beta_0.20100702/src/resmom'
make[2]: *** [all-recursive] Error 1
make[2]: Leaving directory
`/opt/src/batch/torque-2.5-beta_0.20100702/src/resmom'
make[1]: *** [all-recursive] Error 1
make[1]: Leaving directory `/opt/src/batch/torque-2.5-beta_0.20100702/src'
make: *** [all-recursive] Error 1

[install libcap-devel rpm, build works]

rpm -e pam-devel

make distclean

./configure --prefix=/opt/torque/2.5-beta_0.20100702 --disable-gcc-warnings
--with-server-home=/var/spool/torque
--with-default-server=nautilus.nics.utk.edu --enable-cpuset --enable-docs
--disable-shell-pipe --enable-shell-use-argv --enable-acct-x --disable-blcr
--disable-cpa --disable-csa --with-sched=no --enable-server --enable-mom
--enable-clients --enable-drmaa --x-libraries=/usr/X11R6/lib64

make
[...]
/bin/sh ../../libtool --tag=CC --mode=link gcc  -g -O2 -D_LARGEFILE64_SOURCE
-DUSEJOBCREATE   -o pbs_mom  catch_child.o mom_comm.o mom_inter.o mom_main.o
mom_server.o prolog.o requests.o start_exec.o checkpoint.o tmsock_recov.o
req_quejob.o job_func.o attr_recov.o dis_read.o job_attr_def.o job_recov.o
process_request.o reply_send.o resc_def_all.o job_qs_upgrade.o 
linux/libmommach.a -L/opt/cray/job/default/lib64 -ljob  -lutil
../lib/Libattr/libattr.a ../lib/Libsite/libsite.a ../lib/Libutils/libutils.a
../lib/Libpbs/libtorque.la 
mkdir .libs
gcc -g -O2 -D_LARGEFILE64_SOURCE -DUSEJOBCREATE -o .libs/pbs_mom catch_child.o
mom_comm.o mom_inter.o mom_main.o mom_server.o prolog.o requests.o start_exec.o
checkpoint.o tmsock_recov.o req_quejob.o job_func.o attr_recov.o dis_read.o
job_attr_def.o job_recov.o process_request.o reply_send.o resc_def_all.o
job_qs_upgrade.o  linux/libmommach.a -L/opt/cray/job/default/lib64
/usr/lib64/libjob.so -lcap -lpam -lutil ../lib/Libattr/libattr.a
../lib/Libsite/libsite.a ../lib/Libutils/libutils.a
../lib/Libpbs/.libs/libtorque.so  -Wl,--rpath
-Wl,/opt/torque/2.5-beta_0.20100702/lib
/usr/lib64/gcc/x86_64-suse-linux/4.3/../../../../x86_64-suse-linux/bin/ld:
cannot find -lpam
collect2: ld returned 1 exit status
make[3]: *** [pbs_mom] Error 1
make[3]: Leaving directory
`/opt/src/batch/torque-2.5-beta_0.20100702/src/resmom'
make[2]: *** [all-recursive] Error 1
make[2]: Leaving directory
`/opt/src/batch/torque-2.5-beta_0.20100702/src/resmom'
make[1]: *** [all-recursive] Error 1
make[1]: Leaving directory `/opt/src/batch/torque-2.5-beta_0.20100702/src'
make: *** [all-recursive] Error 1

[reconfiguring with --disable-pam and rebuilding results in exactly the same
behavior as above]
Comment 6 Michael Jennings 2010-07-08 18:04:42 MDT
(In reply to comment #5)
> linux/libmommach.a -L/opt/cray/job/default/lib64 -ljob  -lutil
> ...
> job_qs_upgrade.o  linux/libmommach.a -L/opt/cray/job/default/lib64
> /usr/lib64/libjob.so -lcap -lpam -lutil ../lib/Libattr/libattr.a

This is because libjob requires libcap and libpam.  This really has nothing to
do with torque itself.

> [reconfiguring with --disable-pam and rebuilding results in exactly the same
> behavior as above]

Because torque isn't depending on libpam directly, but rather by way of libjob.

As I said on the ML, I'm betting there's a libjob.la that's causing this, but
even if it were just using the shared library dependencies of libjob.so to pull
in libcap and libpam, you'd still need them.

It used to be that --disable-csa would prevent "-L/opt/cray/job/default/lib64
-ljob" from being added to $MOMLIBS.  This is no longer the case, and on any
system that has a libjob (even if it's not a Cray) with a job_create()
function, libjob will be linked.  There seems to no longer be an option to
prevent this.

But the bottom line is that it's libjob, not torque, that's pulling in libcap
and libpam.
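
For anyone who wants to check this on their own system, here is a minimal sketch of how the two mechanisms described above could be inspected. It is not part of TORQUE. The libjob.so path is the one shown in the gcc line in comment 5; the libjob.la path and the helper names la_deps and so_needed are assumptions made up for illustration, and a libjob.la may not exist at all.

```shell
#!/bin/sh
# Sketch: list what a libjob install would drag into a link line.
# Both paths below are assumptions; adjust them for the local system.
LIBJOB_SO=/usr/lib64/libjob.so   # path seen in the gcc line in comment 5
LIBJOB_LA=/usr/lib64/libjob.la   # hypothetical; may not exist at all

# Print the dependency_libs entry of a libtool archive, one flag per line.
la_deps() {
    sed -n "s/^dependency_libs='\(.*\)'\$/\1/p" "$1" | tr -s ' ' '\n' | sed '/^$/d'
}

# Print the DT_NEEDED entries recorded in a shared library.
so_needed() {
    readelf -d "$1" | grep 'NEEDED'
}

if [ -f "$LIBJOB_LA" ]; then
    echo "libtool dependency_libs in $LIBJOB_LA:"
    la_deps "$LIBJOB_LA"
fi
if [ -f "$LIBJOB_SO" ]; then
    echo "shared library dependencies of $LIBJOB_SO:"
    so_needed "$LIBJOB_SO"
fi
```

If -lcap and -lpam show up in dependency_libs, libtool is adding them to the link line on libjob's behalf. If they only show up as NEEDED entries, the linker still has to resolve them when libjob.so is linked in.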
Comment 7 Garrick Staples 2010-07-08 20:33:18 MDT
I have made changes to configure to make the libjob code conditional. Please
test.
Comment 8 Glen 2010-11-18 20:42:32 MST
Garrick did the work here; not sure why the bug was assigned to me.
Comment 9 Ken Nielson 2010-11-19 13:00:21 MST
Did this work get checked into the build?
Comment 10 Garrick Staples 2010-11-22 11:35:54 MST
(In reply to comment #9)
> Did this work get checked into the build?

Yes, checked in. I probably didn't close the case because I never heard back
from the reporter regarding "please test."