[torqueusers] local scratch space allocation
Garrick Staples
garrick at usc.edu
Thu Oct 27 20:24:48 MDT 2005
On Fri, Oct 28, 2005 at 12:07:13PM +1000, Chris Samuel alleged:
> On Sat, 22 Oct 2005 04:39 am, Garrick Staples wrote:
>
> > Instead, I'm having MS create the dir at the initial job commit, before
> > the sisterhood is joined.  MS will always create $tmpdir, while the
> > sisters will simply create it if it doesn't already exist.
>
> Hmm, is there any reason to check if it exists ?
>
> I'm just thinking that NFS can sometimes take a little time to catch up
> leading to a possible race condition if you did check, whereas mkdir(2) is
> (should be) atomic and will simply fail if the directory already exists.
>
> So you could create the directory anyway on all and if you get EEXIST
> continue, otherwise deal with whatever else happened..
>
> What do you think Garrick ?

That's actually what I ended up doing.  Here's the logic that happens on
all nodes:

  if mkdir $TMPDIR == success
    set job flag "we own it, must delete it"
    OK
  else
    if stat $TMPDIR == success
      if $TMPDIR is directory
        if $TMPDIR owned by job user
          OK
        else
          abort job
      else
        abort job
    else
      abort job

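
For readers who want the logic above in compilable form, here is a minimal C sketch. It is not the actual pbs_mom code: the function name ensure_tmpdir, the tmpdir_result enum, and the 0700 mode are all hypothetical choices made for illustration. It also assumes the caller either runs as the job user or fixes ownership of a newly created directory separately, which the pseudocode does not spell out.

```c
#include <assert.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical result codes for this sketch; not TORQUE's own. */
enum tmpdir_result {
    TMPDIR_CREATED,   /* we made it: "we own it, must delete it" */
    TMPDIR_EXISTS_OK, /* already present, a directory, owned by the job user */
    TMPDIR_ABORT      /* anything else: abort the job */
};

/*
 * Try mkdir first and rely on it being atomic, rather than checking
 * for existence beforehand. Only if mkdir fails do we inspect what
 * is already at that path.
 */
enum tmpdir_result ensure_tmpdir(const char *path, uid_t job_uid)
{
    struct stat sb;

    assert(path != NULL);

    /* Mode 0700 is an assumption; the original post does not give one. */
    if (mkdir(path, 0700) == 0)
        return TMPDIR_CREATED;

    /* mkdir failed (commonly EEXIST): see what is there. */
    if (stat(path, &sb) != 0)
        return TMPDIR_ABORT;      /* cannot stat it at all */

    if (!S_ISDIR(sb.st_mode))
        return TMPDIR_ABORT;      /* exists but is not a directory */

    if (sb.st_uid != job_uid)
        return TMPDIR_ABORT;      /* directory belongs to someone else */

    return TMPDIR_EXISTS_OK;
}
```

A caller would set the "must delete it" job flag only on TMPDIR_CREATED, so that a node which merely found an existing directory does not remove it at job end.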
--
Garrick Staples, Linux/HPCC Administrator
University of Southern California