[torquedev] proposed change in directory structure

Garrick Staples garrick at usc.edu
Fri Jul 11 14:43:53 MDT 2008


On Fri, Jul 11, 2008 at 04:28:43PM -0400, Glen Beane alleged:
> I've been working on some changes in trunk that transfer the .OU and .ER
> spool files from pbs_mom back to pbs_server. This is one of the steps we
> need to take so that a job in the COMPLETE state can be restarted from a
> checkpoint file.  (the files are only returned to the server if
> keep_completed is positive and the job has a checkpoint file)
> 
> There are problems when the spool directory is shared between pbs_server and
> the mother superior pbs_mom. What happens is that when the files are
> "returned", pbs_server takes ownership of the .ER and .OU files in the spool
> dir, and when pbs_mom forks to the user to copy the files back to the user's
> home directory, the copy fails with a permission denied error.  I
> feel that the cleanest solution is to just separate the pbs_server and
> pbs_mom spool directories.  In my current working copy of trunk I have
> changed pbs_server to use server_home/server_spool instead of
> server_home/spool.  pbs_mom continues to use server_home/spool.  This solves
> my problems because when the spool files are returned to pbs_server pbs_mom
> retains its copy in its own spool directory. It is then free to fork to the
> user to copy the files and then delete them.
> 
> Are there any objections to this change in trunk? (the change will be
> introduced with the release of TORQUE 2.4.0)

So we're doing a useless copy from server_home/spool to
server_home/server_spool? At my site, these files often take up a significant
percentage of the filesystem. If a file is more than 50% of the filesystem's
total capacity, then this is going to fail, because a second copy won't fit.
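
To make the space argument concrete, here is a rough sketch of the arithmetic,
assuming both spool directories live on the same filesystem. The function name
and parameters are hypothetical and are not part of TORQUE.

```python
def copy_fits(file_size: int, fs_total: int, fs_used_by_others: int = 0) -> bool:
    """Return True if a second copy of a spool file fits on the same filesystem.

    Hypothetical helper for illustration only. All sizes are in the same unit
    (for example, bytes).
    """
    # Space left once the original file and everything else are accounted for.
    free_after_original = fs_total - fs_used_by_others - file_size
    # The duplicate needs as much room as the original.
    return file_size <= free_after_original


# A file using 60% of an otherwise empty filesystem cannot be duplicated on it.
print(copy_fits(file_size=60, fs_total=100))   # False
print(copy_fits(file_size=40, fs_total=100))   # True
```

On a filesystem that already holds other data, the limit is lower than 50%.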

Why not just have the server check if it already has the file and not issue a
copy request?
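
One way such a check could look, as a minimal sketch in Python rather than in
TORQUE's own C code. The function name and the idea of comparing the two paths
directly are assumptions for illustration, not existing pbs_server behavior.

```python
import os


def needs_copy(mom_path: str, server_path: str) -> bool:
    """Return False when server_path already refers to the file pbs_mom holds.

    Hypothetical helper. os.path.samefile compares device and inode numbers,
    so it also recognizes the same file reached through a different path,
    as happens when the spool directory is shared.
    """
    try:
        return not os.path.samefile(mom_path, server_path)
    except FileNotFoundError:
        # The server has no such file yet, so a copy request is needed.
        return True
```

With a shared spool directory both paths resolve to the same file, so no copy
request would be issued. With separate directories the server-side file does
not exist yet and the copy proceeds as usual.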


-- 
Garrick Staples, GNU/Linux HPCC SysAdmin
University of Southern California

Please avoid sending me Word or PowerPoint attachments.
See http://www.gnu.org/philosophy/no-word-attachments.html