[torqueusers] Solution: torque-1.2.0p6 build problem using gcc 3.4 (RHEL4 or FC4)

Steffen Moeller moeller at pzr.uni-rostock.de
Tue Sep 20 04:35:23 MDT 2005


Hi, I personally do not care so much about the RPMs or the DEBs as such,
but about their distribution with Fedora, OpenSuse or Debian. Has there
been any progress in the discussion of whether such a redistribution
would be appreciated?

The feedback I got about a year ago was something like "please not yet,
we are thinking about it". Other feedback was that the license should
already permit it today, but that if it is not appreciated then it should
not happen.

Steffen


On Tue, Sep 20, 2005 at 09:44:45AM +0100, Steve Traylen wrote:
> On Mon, Sep 19, 2005 at 10:55:34PM -0600 or thereabouts, Maestas, Christopher Daniel wrote:
> > I'm curious if there is anyone out there maintaining a standard type rpm
> > for torque.
> > I haven't seen much in the way of 1.2.0pX ... I was wondering if we
> > could get a contrib type spec file or better yet an actual working spec
> > file to be able to run "rpmbuild -tb torque-1.2.0pX.tar.gz"  I thought
> > I'd ask this, since this fix seems to refer to rpm building. :-)
> 
> There are some here.
> 
> http://quattor.web.lal.in2p3.fr/packages/mpi/
> 
> these are built for ScientificLinux 3.
> 
> Steve
> > 
> > 
> > -----Original Message-----
> > From: torqueusers-bounces at supercluster.org
> > [mailto:torqueusers-bounces at supercluster.org] On Behalf Of Ole Holm
> > Nielsen
> > Sent: Monday, September 19, 2005 9:09 AM
> > To: torqueusers at supercluster.org
> > Subject: [torqueusers] Solution: torque-1.2.0p6 build problem using gcc
> > 3.4 (RHEL4 or FC4)
> > 
> > Dear Torque users,
> > 
> > We have previously discussed a problem starting LAM-MPI parallel jobs
> > with torque-1.2.0p6 in this thread:
> > http://www.supercluster.org/pipermail/torqueusers/2005-September/002079.html
> > 
> > If you use Torque on Redhat Enterprise Linux 4, Fedora Core 4 or any
> > other system using gcc 3.4 (or later), you should know about a problem
> > caused by a new feature in gcc 3.4, as well as the solution to this
> > problem:
> > 
> > We found that the Torque build process has a problem with gcc 3.4.3,
> > namely that a "make install" will cause a second, superfluous
> > recompilation of everything.  If you're building an RPM, this causes
> > subtle problems in the resulting RPMs because some hardcoded paths may
> > be incorrect.  This was the problem that made LAM-MPI booting fail
> > because pbs_mom could not find the pbs_demux executable (see the above
> > thread).
> > 
> > The quick summary:
> > ------------------
> > 
> > 1. With Torque up to and including 1.2.0p6, a workaround is to
> >     configure Torque with an additional CFLAGS option
> >     -fno-working-directory, if your system uses gcc 3.4 or newer.
> > 2. Torque 1.2.0p7 (current snapshot and later) has a patch in
> >     buildutils/makedepend-sh which is the permanent solution,
> >     so the -fno-working-directory workaround is not needed here.
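
A minimal sketch of workaround 1, assuming the configure script picks up CFLAGS from the environment (the usual autoconf convention, not confirmed in this thread). The needs_workaround helper and the "-g -O2" base flags are hypothetical; on a real system the version string would come from "gcc -dumpversion". The sketch only prints the configure command rather than running it.

```shell
# Hypothetical helper: succeeds when a gcc version string is 3.4 or newer.
needs_workaround() {
    major=${1%%.*}          # text before the first dot
    rest=${1#*.}            # text after the first dot
    minor=${rest%%.*}       # second version component
    [ "$major" -gt 3 ] || { [ "$major" -eq 3 ] && [ "$minor" -ge 4 ]; }
}

GCC_VERSION=3.4.3           # assumed here; normally: GCC_VERSION=$(gcc -dumpversion)
CFLAGS="-g -O2"             # assumed base flags
if needs_workaround "$GCC_VERSION"; then
    CFLAGS="$CFLAGS -fno-working-directory"
fi

# Print the command instead of executing it.
echo "CFLAGS=\"$CFLAGS\" ./configure"
```

With Torque 1.2.0p7 or later the extra flag should not be needed, so the check could be skipped there.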
> > 
> > Additional details:
> > -------------------
> > 
> > The gcc 3.4 man-page describes a new feature:
> >        -fworking-directory
> >             Enable generation of linemarkers in the preprocessor output
> >             that will let the compiler know the current working directory
> >             at the time of preprocessing.  When this option is enabled,
> >             the preprocessor will emit, after the initial linemarker, a
> >             second linemarker with the current working directory followed
> >             by two slashes.
> >             ...
> > 
> > This new default feature causes Torque's buildutils/makedepend-sh script
> > to add to the Makefile a dependency of every .o file upon the current
> > working directory (and hence upon its timestamp) whenever CFLAGS contains
> > the -g flag (the default).  Look for the following pattern in the Makefile:
> > 
> > # DO NOT DELETE THIS LINE -- makedepend-sh depends on it
> > accounting.o: ./accounting.c
> > accounting.o: /scratch/Torque/torque-1.2.0p6/src/server//
> > 
> > The line terminated with "//" refers to the current working directory.
> > This dependency causes all .o files to be rebuilt every time you do a
> > "make" in any directory, including the case where you do a "make
> > install".
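
One way to look for the pattern described above is to grep for dependency lines that end in "//". The sketch below runs against a scratch copy of the three sample lines from the message; the regular expression is an assumption about how such lines look, and in a real tree you would point grep at the generated Makefile instead.

```shell
# Sample dependency lines copied from the message above, written to a scratch file.
sample=$(mktemp)
cat > "$sample" <<'EOF'
# DO NOT DELETE THIS LINE -- makedepend-sh depends on it
accounting.o: ./accounting.c
accounting.o: /scratch/Torque/torque-1.2.0p6/src/server//
EOF

# Count non-comment dependency lines whose prerequisite ends in "//",
# i.e. the working-directory marker.
bad=$(grep -c '^[^#:]*: .*//$' "$sample")
echo "working-directory dependencies found: $bad"
rm -f "$sample"
```

A non-zero count suggests the Makefile was generated without the workaround or the patch.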
> > 
> > In the case of RPM building, this is a real problem because all files
> > will be installed into a temporary location.  The pbs_mom will now have
> > an incorrect hardcoded path to pbs_demux and pbs_rcp, for example,
> > /var/tmp/torque-1.2.0p6-buildroot/usr/sbin/pbs_demux
> > (check this by "strings /usr/sbin/pbs_mom | grep pbs_demux").
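
The strings check can be wrapped in a small filter. This is only a sketch: find_buildroot_paths is a hypothetical name, and it assumes the temporary install location contains "-buildroot/" as in the example path above. The demonstration feeds it two literal paths instead of real strings output.

```shell
# Hypothetical filter: reads path strings on stdin and prints any pbs_demux
# or pbs_rcp path that still points into an RPM build root.
find_buildroot_paths() {
    grep -E 'pbs_(demux|rcp)' | grep -- '-buildroot/'
}

# Demonstration on two literal paths; the first is the example from the message.
leaked=$(printf '%s\n' \
    '/var/tmp/torque-1.2.0p6-buildroot/usr/sbin/pbs_demux' \
    '/usr/sbin/pbs_rcp' | find_buildroot_paths)
echo "$leaked"
```

On an installed system the input would instead come from "strings /usr/sbin/pbs_mom | find_buildroot_paths"; any output means a build-root path was compiled in.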
> > 
> > In this scenario all parallel jobs using the "tm" boot interface will
> > fail because the pbs_demux process failed to be started by pbs_mom.  A
> > simple test to perform is to run "pbsdsh hostname"
> > within a multi-node PBS batch job.  If pbsdsh gives error messages, you
> > may have the above problem, and other environments such as LAM-MPI using
> > the "tm" interface are going to fail as well.
> > 
> > If you want to patch your current Torque installation, here's the diff
> > (now in the CVS for 1.2.0p7) as provided by Garrick:
> > 
> > --- buildutils/makedepend-sh_orig       2005-09-18 10:04:34.000000000 -0700
> > +++ buildutils/makedepend-sh    2005-09-18 10:04:05.000000000 -0700
> > @@ -575,6 +575,7 @@
> > 
> >                   eval $CPP $arg_cc $d/$s $errout | \
> >                     sed -n -e "s;^\# [0-9][0-9 ]*\"\(.*\)\";$f: \1;p" | \
> > +                  grep -v "$PWD//\$" | \
> >                     grep -v "$s\$" | grep -v command | grep -v built-in | \
> >                     sed -e 's;\([^ :]*: [^ ]*\).*;\1;' \
> >                     >> $TMP
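
To see what the added grep stage does in isolation, the sketch below pipes three made-up dependency lines through the same filter. The header name is hypothetical, and "$PWD//" stands in for the working-directory linemarker that gcc 3.4 emits.

```shell
f=accounting.o
# Three dependency lines as the first sed stage might emit them.
kept=$(printf '%s\n' \
    "$f: ./accounting.c" \
    "$f: $PWD//" \
    "$f: ../include/example.h" | grep -v "$PWD//\$")
echo "$kept"
```

Only the line ending in the current directory plus "//" is dropped; ordinary source and header dependencies pass through unchanged.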
> > 
> > Many thanks go to Garrick Staples (USC) for much ping-pong debugging and
> > for coming up with the patch as well as the -fno-working-directory
> > workaround.
> > 
> > --
> > Ole Holm Nielsen
> > Department of Physics, Technical University of Denmark
> > _______________________________________________
> > torqueusers mailing list
> > torqueusers at supercluster.org
> > http://www.supercluster.org/mailman/listinfo/torqueusers
> > 
> > 
> 
> -- 
> Steve Traylen
> s.traylen at rl.ac.uk
> http://www.gridpp.ac.uk/

