[torqueusers] Re: mom segfault in new diag code
Garrick Staples
garrick at usc.edu
Thu Oct 28 13:40:02 MDT 2004
torque-1.1.0p4-snap.1098735063 isn't segfaulting, but it still isn't right...
$ momctl -d 0 -h hpc1201
Host: hpc1201/hpc1201.usc.edu Server: hpc-master Version: torque_1.1.0p4
HomeDirectory: /var/spool/torque/mom_priv
MOM active: 58 seconds
Last Msg From Server: 58 seconds (CLUSTER_ADDRS)
Last Msg To Server: 13 seconds
LOGLEVEL: 0 (use SIGUSR1/SIGUSR2 to adjust)
JobList: NONE
diagnostics complete
[ucs at hpc-master /root]$ momctl -d 1 -h hpc1201
.125.1.76,10.125.1.75,10.125.1.74,10.125.1.73,10.125.1.72,10.125.1.71,10.125.1.70,10.125.1.69,10.125.1.68,10.125.1.67,10.125.1.66,10.125.1.65,10.125.0.220,10.125.0.200,192.168.3.136,192.168.3.135,192.168.3.134,192.168.3.133,192.168.3.132,192.168.3.131,192.168.3.130,192.168.3.129,192.168.5.200,192.168.5.199,192.168.5.198,192.168.5.197,192.168.5.196,192.168.5.195,192.168.5.194,192.168.5.193,192.168.5.192,192.168.5.191,192.168.5.190,192.168.5. 189,192.168.5.188,1Trusted Client List: 10.125.2.70,10.125.2.69,10.125.2.68,10.125.2.67,10.125.2.66,10.125.2.65,10.125.2.64,10.125.2.63,10.125.2.62,10.125.2.61,10.125.2.60,10.125.2.59,10.125.2.58,10.125.2.57,10.125.2.56,10.125.2.55,10.125.2.54,10.125.2.53,10.125.2.5
And this repeats for about 18KB of garbage, all one line. I haven't looked
yet, but I assume tlist() is still broken. -d 2 and 3 do the same thing.
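
For reference, the snippet quoted below appends to a fixed-size buffer with unbounded strcat calls. A minimal sketch of a bounded alternative follows. It is not the TORQUE fix: bounded_append() and append_client_list() are hypothetical names that do not exist in the TORQUE source, and the buffer sizes are left to the caller.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (not part of TORQUE): append src to dst without
 * ever writing past dst[size - 1]. Returns 0 if everything fit, -1 if
 * the result had to be truncated. dst must already hold a
 * NUL-terminated string shorter than size.
 */
static int bounded_append(char *dst, size_t size, const char *src)
{
  size_t used;
  size_t room;
  size_t want;

  assert(dst != NULL && src != NULL && size > 0);

  used = strlen(dst);
  assert(used < size);

  room = size - used - 1;   /* bytes free, excluding the terminator */
  want = strlen(src);

  if (want > room)
    {
    /* copy only what fits, then terminate */
    memcpy(dst + used, src, room);
    dst[used + room] = '\0';
    return -1;
    }

  memcpy(dst + used, src, want + 1);  /* copies the terminator too */
  return 0;
}

/* Same three appends as the quoted snippet, each one bounded. */
static int append_client_list(char *output, size_t size, const char *list)
{
  int rc = 0;

  rc |= bounded_append(output, size, "Trusted Client List: ");
  rc |= bounded_append(output, size, list);
  rc |= bounded_append(output, size, "\n");

  return (rc == 0) ? 0 : -1;
}
```

A caller would pass sizeof(output) for the size and treat -1 as "list truncated" rather than letting the write run past the end of the buffer.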
On Thu, Oct 28, 2004 at 09:27:08AM -0600, Dave Jackson alleged:
> Garrick,
>
> While most recent p4 snapshots fixed the tmpLine/output overflows,
> only the most recent corrects the tlist() issue. Thanks for reporting
> this and please let us know if things work better.
>
> Dave
>
> On Wed, 2004-10-27 at 19:52, Garrick Staples wrote:
> > Actually, tlist() seems to be overflowing the buffer too.
> >
> > On Wed, Oct 27, 2004 at 04:33:35PM -0700, Garrick Staples alleged:
> > >
> > > torque-1.1.0p4-snap.1098121584
> > >
> > > > The new momctl diag code is segfaulting in mom_main.c:rm_request(). It seems
> > > > that neither tmpLine nor output is large enough. Specifically, the second
> > > > strcat in this code is overflowing output:
> > >
> > > >   if (verbositylevel >= 1)
> > > >     {
> > > >     /* display okclient list */
> > > >
> > > >     tmpLine[0] = '\0';
> > > >
> > > >     tlist(okclients,tmpLine,1024);
> > > >
> > > >     strcat(output,"Trusted Client List: ");
> > > >
> > > >     strcat(output,tmpLine);
> > > >
> > > >     strcat(output,"\n");
> > > >     }
> > >
> > >
> > > --
> > > Garrick Staples, Linux/HPCC Administrator
> > > University of Southern California
> >
> >
>
--
Garrick Staples, Linux/HPCC Administrator
University of Southern California