[torqueusers] Re: LAM-MPI won't boot with torque-1.2.0p6

garrick garrick at usc.edu
Fri Sep 16 09:15:42 MDT 2005


On Fri, Sep 16, 2005 at 11:10:08AM +0200, Ole Holm Nielsen alleged:
> garrick wrote:
> >>Speaking of a pbs_demux process, when would that be started ?
> >>It's not running on the nodes after I start an interactive PBS job.
> >
> >It is supposed to be started at the launch of all multi-node jobs.
> 
> OK, something to check for.  Now, I have a funny observation about the
> pbs_mom which I've built as an RPM using a torque.spec file adapted
> from your version.  On a compute node I look inside pbs_mom:
> 
> # strings /usr/sbin/pbs_mom | grep /usr/sbin
> /var/tmp/torque-1.2.0p6-buildroot/usr/sbin/pbs_rcp
> /var/tmp/torque-1.2.0p6-buildroot/usr/sbin/pbs_demux
> 
> Isn't that weird !  The path to pbs_demux is actually related to
> the one which used to exist during the RPM build process !

Good hunting!

Looks like the build was incomplete in the %build section and was
completed in the %install section, which would explain the buildroot
paths compiled into pbs_mom.  Rebuild it, capturing the output.  You can
send it to me offlist if you want.
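
To confirm the result after a rebuild, the strings/grep check quoted
above can be wrapped in a small helper.  This is only a sketch: the
function name find_buildroot_leaks is made up for illustration and is not
part of TORQUE or rpm.  It uses grep -a (supported by GNU and BSD grep)
so it can scan binaries directly.

```shell
# Hypothetical helper: list files that still embed a build-root path.
# Usage: find_buildroot_leaks PATTERN FILE...
# Returns 0 if at least one file contains PATTERN, 1 otherwise.
find_buildroot_leaks() {
    pattern=$1
    shift
    found=1
    for f in "$@"; do
        # -a treats binary files as text; -q only sets the exit status
        if grep -a -q -- "$pattern" "$f"; then
            echo "$f"
            found=0
        fi
    done
    return $found
}
```

For example, "find_buildroot_leaks torque-1.2.0p6-buildroot /usr/sbin/pbs_mom"
should print nothing once the binary is built correctly.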

 
> The RPM BUILD directory (/usr/src/redhat/BUILD/torque-1.2.0p6)

Building rpms as root?  Bad bad bad!
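
One common way to build as an ordinary user is to point rpm at a private
build tree through the %_topdir macro in ~/.rpmmacros.  A minimal sketch
follows; the function name setup_user_rpmbuild and the directory layout
passed to it are my own illustration, and note that it overwrites the
macros file it is given.

```shell
# Hypothetical helper: prepare a per-user rpm build tree under TOPDIR
# and write a macros file that points rpm at it.
# Usage: setup_user_rpmbuild TOPDIR MACROS_FILE
setup_user_rpmbuild() {
    topdir=$1
    macros=$2
    # The standard subdirectories rpm expects under %_topdir
    for d in BUILD RPMS SOURCES SPECS SRPMS; do
        mkdir -p "$topdir/$d" || return 1
    done
    # %_topdir redirects builds away from /usr/src/redhat
    # (this overwrites any existing MACROS_FILE)
    printf '%%_topdir %s\n' "$topdir" > "$macros"
}
```

Something like 'setup_user_rpmbuild "$HOME/rpmbuild" "$HOME/.rpmmacros"'
followed by 'rpmbuild -ba torque.spec 2>&1 | tee build.log' as a normal
user would also capture the build output in one go.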


> So should pbs_demux be moved to the torque-mom RPM, or would other
> things break if one doesn't have pbs_demux ?  For example, I don't
> plan to install torque-mom on our login nodes.  Would you be so kind
> as to offer an updated torque.spec which moves pbs_demux to the
> appropriate RPM package ?

Sure, check http://mirrors.usc.edu/usc/usclinux/3AS/source/common/ later
today.

 
> >Unfortunately the error message that should have gone to syslog when
> >pbs_demux wasn't exec'd was broken.  Funny thing, I just fixed this in
> >CVS right after 1.2.0p6 was released.
> 
> So what does a poor cluster administrator do here - download the
> latest Torque snapshot and build a new RPM with whatever SPEC-file ?

It's just an error message; it's not important.


-- 
Garrick Staples, Linux/HPCC Administrator
University of Southern California