[torqueusers] local scratch space allocation

Garrick Staples garrick at usc.edu
Thu Oct 27 20:24:48 MDT 2005

On Fri, Oct 28, 2005 at 12:07:13PM +1000, Chris Samuel alleged:
> On Sat, 22 Oct 2005 04:39 am, Garrick Staples wrote:
> > Instead, I'm having MS create the dir at the initial job commit, before
> > the sisterhood is joined.  MS will always create $tmpdir, while the
> > sisters will simply create it if it doesn't already exist.
> Hmm, is there any reason to check if it exists ?
> I'm just thinking that NFS can sometimes take a little time to catch up 
> leading to a possible race condition if you did check, whereas mkdir(2) is 
> (should be) atomic and will simply fail if the directory already exists.
> So you could create the directory anyway on all and if you get EEXIST 
> continue, otherwise deal with whatever else happened..
> What do you think Garrick ?

That's actually what I ended up doing.  Here's the logic that happens on
all nodes:

if mkdir $TMPDIR == success
   set job flag "we own it, must delete it"
else
   if stat $TMPDIR == success
      if $TMPDIR is directory
         if $TMPDIR owned by job user
            continue
         else
            abort job
      else
         abort job
   else
      abort job
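
For anyone who wants to play with the idea outside of pbs_mom, here is a
rough Python sketch of the same check. It is not the TORQUE code (that is
C); the names claim_tmpdir, TmpdirError and job_uid are made up for
illustration, and the 0700 mode is my own assumption. It assumes that a
directory which already exists and belongs to the job user is accepted, and
that every other failure aborts the job.

```python
import os
import stat


class TmpdirError(Exception):
    """Hypothetical error type: the caller should abort the job."""


def claim_tmpdir(path, job_uid):
    """Try to create the scratch dir; fall back to validating an existing one.

    Returns True if this node created the directory (it owns it and must
    delete it at job end), False if a usable directory was already there.
    Raises TmpdirError in every other case.
    """
    try:
        # No existence check first: mkdir either creates the directory or
        # fails, so there is no check-then-create race.
        os.mkdir(path, 0o700)  # mode is an assumption, not from the post
        return True
    except OSError:
        # Most likely EEXIST, but any failure is handled the same way:
        # inspect whatever is at the path now.
        pass

    try:
        info = os.stat(path)
    except OSError as err:
        raise TmpdirError(f"cannot stat {path}") from err

    if not stat.S_ISDIR(info.st_mode):
        raise TmpdirError(f"{path} exists but is not a directory")
    if info.st_uid != job_uid:
        raise TmpdirError(f"{path} is not owned by uid {job_uid}")
    return False
```

The return value plays the role of the "we own it, must delete it" job
flag: only the node whose mkdir succeeded cleans up afterwards.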

Garrick Staples, Linux/HPCC Administrator
University of Southern California