[torquedev] pbs_mom segfault in TMomCheckJobChild

Garrick Staples garrick at usc.edu
Tue Dec 16 14:46:28 MST 2008


On Tue, Dec 16, 2008 at 12:06:30PM -0800, Joshua Bernstein alleged:
> 	The first patch is to src/lib/Libnet/net_server.c. It seems it was 
> already applied to the 2.4.0 branch, but for whatever reason is still 
> left uncorrected in the 2.3.6 snapshot. I also talked about this fix in 
> my original post:
> 
> -        close(i);
> -
> -        num_connections--;  /* added by CRI - should this be here? */
> -
> +        close_conn(i);
> 
> Basically close(i) is incorrect, as is the decrementing of 
> num_connections. close_conn(i) is used elsewhere, has the added benefit 
> of clearing out some relevant data structures, and also takes care of 
> decrementing num_connections. Thus the fix involves changing the close() 
> to a close_conn() and removing the decrementing of num_connections. The 
> rest of net_server.c seems to be relatively unchanged between 2.4.0 and 
> 2.3.6 (other than some formatting changes), so there should be low risk 
> in applying this patch. Though this fix didn't make the segfault 
> disappear, it did decrease the rate at which fds were eaten up. My 
> patch is attached to this e-mail as net_server.closeconn.patch.

Looking closer at this, it seems that the lower part of close_conn(), the
part that does the actual work, never runs on this path because
cn_active == Idle.

cn_active is checked before we get to this section:

if (FD_ISSET(i, &selset))
  {
  if (svr_conn[i].cn_active != Idle)
    svr_conn[i].cn_func(i);   /* active connection: run its handler */
  else
    close_conn(i);            /* only reached when cn_active == Idle */
  }



void close_conn(int sd)
  {
  if (svr_conn[sd].cn_active == Idle)
    return;     /* Idle entry: returns before the close() below */

  close(sd);
  ...
  }
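
To make the interaction concrete, here is a minimal stand-alone sketch in C.
It is a model under stated assumptions, not the TORQUE source: only the names
svr_conn, cn_active, Idle, num_connections and close_conn() come from the code
discussed above. The enum values, the table size, fake_close(), dispatch() and
demo_idle_slot() are hypothetical and exist only for illustration.

```c
#include <assert.h>

/* Hypothetical stand-ins for the real definitions (assumptions). */
enum conn_type { Idle = 0, Primary };

struct connection {
    enum conn_type cn_active;
};

#define MAX_CONN 8                      /* assumed table size */

static struct connection svr_conn[MAX_CONN];
static int num_connections = 0;
static int fds_closed = 0;              /* how many times the "real work" ran */

/* Stand-in for close(2) so the sketch touches no real descriptors. */
static void fake_close(int sd)
{
    (void)sd;
    fds_closed++;
}

/* Same shape as the close_conn() outline above: guard first, work second. */
static void close_conn(int sd)
{
    assert(sd >= 0 && sd < MAX_CONN);

    if (svr_conn[sd].cn_active == Idle)
        return;                         /* Idle entry: nothing below runs */

    fake_close(sd);
    svr_conn[sd].cn_active = Idle;
    num_connections--;
}

/* Same shape as the dispatch above: only Idle entries reach close_conn(). */
static void dispatch(int i)
{
    if (svr_conn[i].cn_active != Idle) {
        /* the real code would call svr_conn[i].cn_func(i) here */
    } else {
        close_conn(i);
    }
}

/* Descriptor 3 is "readable" but its table entry is Idle. */
static int demo_idle_slot(void)
{
    fds_closed = 0;
    num_connections = 1;
    svr_conn[3].cn_active = Idle;
    dispatch(3);
    return fds_closed;
}
```

In this model demo_idle_slot() returns 0 and num_connections stays at 1: the
call made from the else branch returns at the guard, so the descriptor is not
closed. Calling close_conn() directly on an entry marked Primary does reach
fake_close() and decrements the counter.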


-- 
Garrick Staples, GNU/Linux HPCC SysAdmin
University of Southern California