[torqueusers] Security Vulnerability in Torque

Garrick Staples garrick at clusterresources.com
Fri Oct 20 17:51:04 MDT 2006

On Sat, Oct 21, 2006 at 12:45:30AM +0100, David Golden alleged:
> On 2006-10-20 16:06:42 -0600, Garrick Staples wrote:
> > > Hmmm. Are --disable-spool torque builds still vulnerable ?
> > 
> > I wouldn't think so, but I haven't confirmed that.
> >
> okay, well, here's another eyeball on it, though it's 
> past midnight here, someone-else-again might want to confirm!:
> --disable-spool builds shouldn't be vulnerable to this problem,
> anyway:
> "keeping" is always forced to 1 in the --disable-spool case in
> start_exec.c/std_file_name(), and the system thus always set[eg]ids 
> to the user before opening the spool files in the user's home dir
> in start_exec.c/open_std_file() , and always forks to the
> user in requests.c/req_cpyfile() before copying back the spool files ?
> So, well, you could still sequence-predict, but 
> it shouldn't get you anywhere in the --disable-spool case.

I agree.  With --disable-spool, a user could only "hack" himself.
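
For anyone following along, the pattern described above (switch the
effective ids to the user, open the file, switch back) can be sketched
roughly as below. This is only an illustration, not the code in
start_exec.c; open_as_user() is a made-up name, and it ignores
supplementary groups, which a real fix would also need to think about.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper (not TORQUE code): open `path` with the effective
 * uid/gid of the job owner, then restore the caller's effective ids.
 * Returns a file descriptor, or -1 on failure.
 */
int open_as_user(const char *path, uid_t uid, gid_t gid)
{
    uid_t saved_uid = geteuid();
    gid_t saved_gid = getegid();
    int fd;

    assert(path != NULL);

    /* Group first: once the euid is dropped we may not be allowed
     * to change the egid any more. */
    if (setegid(gid) != 0)
        return -1;
    if (seteuid(uid) != 0) {
        setegid(saved_gid);
        return -1;
    }

    /* The open is now subject to the user's permissions, so a link
     * pointing at a file the user cannot write fails here. */
    fd = open(path, O_WRONLY | O_CREAT | O_APPEND, 0600);

    /* Restore in reverse order: uid first, then gid. */
    if (seteuid(saved_uid) != 0 || setegid(saved_gid) != 0) {
        if (fd >= 0)
            close(fd);
        return -1;
    }
    return fd;
}
```

If restoring the ids fails, a daemon should probably treat that as fatal
rather than carry on with the wrong privileges.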

> re style of fix:
> * I guess one part of the fix might be to sete[ug]id before open
> even in the keeping=0 case... I don't see why one wouldn't? (except of 
> course on systems that don't support sete[ug]id at all...). Does
> that in fact eliminate any problem?

This does exactly that:

> * sequence-prediction: I'm not sure about the whole randomising 
> the name thing.  - Yeah, it'll be less predictable, so in principle
> it's a good thing from a security perspective:  But  I know that skins
> have been saved here a couple of times now in the face of system failure 
> by the predictability of such names, for partial job output recovery.

Hrm, interesting idea; though we'd need to make sure sequence numbers are
unique over time.  Is there a computer scientist in the house willing to
come up with a good algo?
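
One possible starting point, offered as a sketch and not as a proposal
for the actual patch: keep the job id as a readable prefix and let
mkstemp() supply the unpredictable part. mkstemp() creates the file
atomically and never reuses an existing name, so uniqueness holds for
as long as the old files are still on disk. make_spool_file() and the
"<dir>/<jobid>.XXXXXX" layout are assumptions made up for this example.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/*
 * Hypothetical helper (not TORQUE code): create "<dir>/<jobid>.XXXXXX"
 * with the X's replaced by mkstemp(). On success the chosen name is left
 * in `out` and an open descriptor is returned; on failure returns -1.
 */
int make_spool_file(const char *dir, const char *jobid,
                    char *out, size_t outlen)
{
    int n;

    assert(dir != NULL && jobid != NULL && out != NULL);

    n = snprintf(out, outlen, "%s/%s.XXXXXX", dir, jobid);
    if (n < 0 || (size_t)n >= outlen)
        return -1;              /* name would not fit in the buffer */

    /* Creates the file exclusively with mode 0600, so a pre-planted
     * file or link with the same name is never opened. */
    return mkstemp(out);
}
```

An admin can still find partial output after a crash by globbing on the
job id, but the mom would have to record the full name somewhere if it
needs to reopen the file after a restart.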

> * while it's likely good to check if the file exists, I'm not so 
> sure about "simply removing" the file if it already exists (as 
> suggested somewhere) - might be much better to signal  an error 
> and abort - there's some funny business going  on in that case...
> or is there? - what does the mom do for rerun, suspended and
> checkpointed jobs, and after a mom polling restart is the goal also to
> reopen the same files? I suspect yes, so you might want a predictable name,
> or at least persistent tracking of an unpredictable one (as Luis 
> suggests) - but that's still going to complicate rescue of
> partial job outputs in the face of system failure, unless that
> persistent tracking table is easily human-readable (perhaps only
> by privileged users...)

I'm divided on removing the malicious link on the fly.  On the one hand,
we want to raise a giant red flag to alert the admin.  On the other
hand, we don't want a malicious user to break other users' jobs.
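
To make the "signal an error" option concrete, here is a rough sketch
of a create that refuses anything already sitting at the path and tells
the caller so, leaving the policy decision (log, abort, remove) to the
caller. With O_CREAT|O_EXCL, POSIX open() fails with EEXIST on a
symbolic link without following it. create_spool_file() and the
SPOOL_* codes are invented for this example and say nothing about
rerun or checkpointed jobs, which would need a separate "reopen" path.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical result codes for this sketch only. */
enum spool_result { SPOOL_OK = 0, SPOOL_EXISTS = -1, SPOOL_ERROR = -2 };

/*
 * Hypothetical helper (not TORQUE code): create `path` only if nothing
 * is there yet. A pre-existing file or symbolic link, dangling or not,
 * yields SPOOL_EXISTS and nothing is opened or removed.
 */
int create_spool_file(const char *path, int *fd_out)
{
    int fd;

    assert(path != NULL && fd_out != NULL);

    fd = open(path, O_WRONLY | O_CREAT | O_EXCL, 0600);
    if (fd < 0)
        return errno == EEXIST ? SPOOL_EXISTS : SPOOL_ERROR;

    *fd_out = fd;
    return SPOOL_OK;
}
```

On SPOOL_EXISTS the caller could log loudly for the admin and then
either fail the job or fall back to a fresh name, depending on which
side of the trade-off above wins.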
