[torqueusers] Re: mom segfault in new diag code
Dave Jackson
jacksond at supercluster.org
Thu Oct 28 17:09:08 MDT 2004
Garrick,
Our fault. We pushed out a patch that contained the bounds checking,
but the patch never got updated on the web. The new code should
perform all required tlist() bounds checking.
Dave
On Thu, 2004-10-28 at 14:07, Garrick Staples wrote:
> tlist() isn't checking the bounds of Buf correctly. As it recurses, BufSize is
> never recalculated.
>
> This seems to work correctly (but you might want only one strlen())...
>
>
> diff -ruN torque-1.1.0p4_orig/src/resmom/mom_server.c torque-1.1.0p4/src/resmom/mom_server.c
> --- torque-1.1.0p4_orig/src/resmom/mom_server.c 2004-10-25 13:11:01.000000000 -0700
> +++ torque-1.1.0p4/src/resmom/mom_server.c 2004-10-28 13:01:48.000000000 -0700
> @@ -189,10 +189,10 @@
>
> if (Buf[0] != '\0')
> {
> - strncat(Buf,",",BufSize);
> + strncat(Buf,",",BufSize-strlen(Buf));
> }
>
> - strncat(Buf,tmpLine,BufSize);
> + strncat(Buf,tmpLine,BufSize-strlen(Buf));
>
> return;
> } /* END tlist() */
>
>
> On Thu, Oct 28, 2004 at 12:40:02PM -0700, Garrick Staples alleged:
> > torque-1.1.0p4-snap.1098735063 isn't segfaulting, but it still isn't right...
> >
> > $ momctl -d 0 -h hpc1201
> >
> > Host: hpc1201/hpc1201.usc.edu Server: hpc-master Version: torque_1.1.0p4
> > HomeDirectory: /var/spool/torque/mom_priv
> > MOM active: 58 seconds
> > Last Msg From Server: 58 seconds (CLUSTER_ADDRS)
> > Last Msg To Server: 13 seconds
> > LOGLEVEL: 0 (use SIGUSR1/SIGUSR2 to adjust)
> > JobList: NONE
> >
> > diagnostics complete
> >
> > [ucs@hpc-master /root]$ momctl -d 1 -h hpc1201
> > .125.1.76,10.125.1.75,10.125.1.74,10.125.1.73,10.125.1.72,10.125.1.71,10.125.1.70,10.125.1.69,10.125.1.68,10.125.1.67,10.125.1.66,10.125.1.65,10.125.0.220,10.125.0.200,192.168.3.136,192.168.3.135,192.168.3.134,192.168.3.133,192.168.3.132,192.168.3.131,192.168.3.130,192.168.3.129,192.168.5.200,192.168.5.199,192.168.5.198,192.168.5.197,192.168.5.196,192.168.5.195,192.168.5.194,192.168.5.193,192.168.5.192,192.168.5.191,192.168.5.190,192.168.5. 189,192.168.5.188,1Trusted Client List: 10.125.2.70,10.125.2.69,10.125.2.68,10.125.2.67,10.125.2.66,10.125.2.65,10.125.2.64,10.125.2.63,10.125.2.62,10.125.2.61,10.125.2.60,10.125.2.59,10.125.2.58,10.125.2.57,10.125.2.56,10.125.2.55,10.125.2.54,10.125.2.53,10.125.2.5
> >
> > And this repeats for about 18KB of garbage, all one line. I haven't looked
> > yet, but I assume tlist() is still broken. -d 2 and 3 do the same thing.
> >
> >
> > On Thu, Oct 28, 2004 at 09:27:08AM -0600, Dave Jackson alleged:
> > > Garrick,
> > >
> > > While most recent p4 snapshots fixed the tmpLine/output overflows,
> > > only the most recent corrects the tlist() issue. Thanks for reporting
> > > this and please let us know if things work better.
> > >
> > > Dave
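
A note on the strncat() bound used in the patch above: the third argument
of strncat() is the maximum number of characters to append, not counting
the terminating NUL that it always writes. Assuming BufSize is the total
size of Buf, the remaining room is therefore BufSize - strlen(Buf) - 1.
A minimal sketch of that bookkeeping with a single strlen() follows;
list_append() is a hypothetical helper for illustration, not the actual
tlist() code from mom_server.c.

```c
#include <string.h>

/* Hypothetical helper (not part of TORQUE): append Item to the
 * comma-separated list held in Buf. BufSize is assumed to be the
 * total size of Buf in bytes, including room for the final NUL. */
static void list_append(char *Buf, size_t BufSize, const char *Item)
{
  size_t len = strlen(Buf);   /* measured once, then tracked by hand */
  size_t room;

  if (len + 1 >= BufSize)
    return;                   /* buffer already full, nothing fits */

  /* characters that still fit, leaving one byte for the NUL */
  room = BufSize - len - 1;

  if (len > 0)
    {
    strncat(Buf, ",", room);
    room--;                   /* the separator used one character */
    }

  /* strncat() copies at most 'room' characters and then adds the NUL */
  strncat(Buf, Item, room);
}
```

An item that does not fit is silently truncated here; a caller that
needs to detect this could compare strlen(Item) against the remaining
room before appending.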