[torquedev] pbm_mom segfault in TMomCheckJobChild

Garrick Staples garrick at usc.edu
Tue Dec 16 15:33:01 MST 2008


On Tue, Dec 16, 2008 at 12:06:30PM -0800, Joshua Bernstein alleged:
> assignment of errno may cause a problem. If you look at the spec for any 
> system call, in this case the previous read(), upon success, the value 
> of errno is undefined. Thus during a successful read() errno is *not* 
> reset back to 0, and thus errno may just hold a stale value left by an 
> earlier call. 
> There are two fixes for this situation; I simply chose the simplest. In 
> this case, the simple thing to do is just to set errno to 0 going into 
> the function, before both the select() and the read(). This is the path I 
> took as it led to the least amount of code. The other option is to 
> properly recode the loop that tests for the error. Line 6448 would become 
> something like:
> 
> if (i == -1)
>     if (errno == EINTR)
>         continue;
> 
> The ordering is important.  Otherwise the compiler sees if (a && b)
> and is allowed to look at 'b' first to handle short-circuit evaluation. 

If code can be reordered like this (really?  Is this allowed by C?), then we
have this problem all over torque.  There are lots of files with similar
constructs.
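
For reference on the ordering question: C99 section 6.5.13 specifies that
&& evaluates its left operand first, with a sequence point after it, and
does not evaluate the right operand at all when the left one compares equal
to 0. The sketch below has the same shape as the torque checks. The names
rhs_calls, rhs_is_eintr() and should_retry() are made up for illustration
and are not torque code.

```c
#include <errno.h>

/* Hypothetical counter: records how often the right-hand operand ran. */
static int rhs_calls = 0;

/* Stands in for the (errno == EINTR) half of the test. */
static int rhs_is_eintr(void)
  {
  rhs_calls++;

  return (errno == EINTR);
  }

/* Same shape as: if ((i == -1) && (errno == EINTR)) */
static int should_retry(int i)
  {
  /* rhs_is_eintr() is only called when (i == -1) is true */
  return ((i == -1) && rhs_is_eintr());
  }
```

With i set to anything other than -1, should_retry() returns 0 and
rhs_calls stays unchanged, so the errno half is never consulted.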

qsub.c:
          while ((i = read(d,&c,1)) != 1)
            {
            if ((i == -1) && (errno == EINTR))
              continue;
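
A minimal sketch of the nested form of this retry loop, where errno is read
only after read() has reported failure. read_one_byte() is a hypothetical
helper, not a function in qsub.c, and its handling of EOF and of errors
other than EINTR is an assumption, since the excerpt above does not show
what qsub.c does in those cases.

```c
#include <errno.h>
#include <unistd.h>

/*
 * Hypothetical helper, not part of qsub.c.
 * Returns 1 if a byte was read, 0 on end of file,
 * and -1 on any error other than EINTR.
 */
static int read_one_byte(int fd, char *c)
  {
  ssize_t i;

  for (;;)
    {
    i = read(fd, c, 1);

    if (i == -1)
      {
      /* errno is only meaningful once read() has returned -1 */
      if (errno == EINTR)
        continue;   /* interrupted by a signal, try again */

      return -1;    /* real error, errno left for the caller */
      }

    return (int)i;  /* 1 = got a byte, 0 = end of file */
    }
  }
```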


And things like this don't mean a thing!
     if (unlink(dest) != 0 && errno != ENOENT)
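
The same check in nested form, as a sketch: errno is consulted only once
unlink() has returned nonzero. remove_if_present() is a hypothetical name,
not something in the torque tree, and treating ENOENT as success is an
assumption about what the caller wants.

```c
#include <errno.h>
#include <unistd.h>

/*
 * Hypothetical helper, not part of torque.
 * Returns 0 if dest was removed or did not exist, -1 on any other error.
 */
static int remove_if_present(const char *dest)
  {
  if (unlink(dest) != 0)
    {
    /* only look at errno after unlink() has reported failure */
    if (errno != ENOENT)
      return -1;
    }

  return 0;
  }
```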


grep -n -r errno src | grep -v .svn | grep if | grep '&&' | grep -v pbs_errno

-- 
Garrick Staples, GNU/Linux HPCC SysAdmin
University of Southern California

See the Dishonor Roll at http://www.californiansagainsthate.com/
