[torquedev] more issues with recent security fix

Garrick Staples garrick at clusterresources.com
Tue Oct 24 03:02:26 MDT 2006


On Tue, Oct 24, 2006 at 02:33:04AM -0600, Garrick Staples alleged:
> On Tue, Oct 24, 2006 at 09:29:57AM +0200, Åke Sandgren alleged:
> > On Tue, 2006-10-24 at 01:10 -0600, Garrick Staples wrote:
> > > On Tue, Oct 24, 2006 at 08:08:14AM +0200, Åke Sandgren alleged:
> > > > On Mon, 2006-10-23 at 16:46 -0600, Garrick Staples wrote:
> > > > > Turns out we aren't "there" yet.
> > > > > 
> > > > > In 2.1.5, TM is broken with single node jobs, and jobs fail to rerun.
> > > > > 
> > > > > Also, I found some similar security problems with the spool handling
> > > > > with rerunning jobs.
> > > > > 
> > > > > Here is another patch that hopefully buttons everything up.  I'm going
> > > > > to wait a few days before the next release.
> > > > > 
> > > > > Comments?
> > > > 
> > > > 
> > > > Since no file is open by root in open_std_file we could change keeping=1
> > > > to keeping=0 for everything except /dev/null in std_file_name
> > > > and in open_std_file if keeping==1 then remove O_EXCL and O_CREAT (since
> > > > it is /dev/null) and then let that lstat... S_ISREG check handle the
> > > > mode bits like it does with this last patch of yours.
> > > > 
> > > > Like this (to be applied on top of your patch).
> > >  
> > > Don't we still need keeping=1 in the case of the explicit '-k' passed to
> > > qsub?
> > > 
> > > Though I agree with also removing O_CREAT.
> > 
> > No, I don't think we need keeping=1 in that case, not from std_file_name,
> > what -k does in std_file_name is pointing the OU/ER files to $HOME and
> > then the lstat sequence in open_std_file should take care of that
> > problem. (I haven't been able to test that part since my $HOME is in
> > AFS...)
> 
> Current trunk is tested to have working TM and qmsg with single and
> multinode jobs, as batch and interactive.  The batch jobs also rerun
> correctly.
> 
> I think we are safe, and I'm not inclined to change anything else at
> this stage.

Just to clarify: I'm grumpy about this whole thing now and I want to get
2.1 and 2.0 stable, working, and relatively safe.

After that, if we need to, we can refactor stuff in trunk.  But I think
any directory structure changes should be pushed out to 3.0 (2.2 will be
a nifty feature release that should be entirely compatible with 2.1).
