[torqueusers] setting a variable in prologue
jdsmit at sandia.gov
Thu Mar 23 10:19:06 MST 2006
> Hi Garrick,
> I see that in the pbs_mom man page the variable that you refer to...
> So, I went to my /var/spool/pbs/mom_priv directory and made my config
> look like this:
> $logevent 127
> $loglevel 6
> $pbsserver b08l02
> $tmpdir /scratch
> $usecp *:/home /home
> $restricted b08l02,b08l01,b07l01,b07l02,b06l01,b06l02,b05l01,b05l02
> It is the same as how it looked before except I added the "$tmpdir
> /scratch" to it. Scratch is set up like a /tmp - the sticky bit is
> set. I pushed the config to all the nodes and restarted all the
> pbs_moms and even the pbs_server.
> In my simple submit script I do:
> #!/bin/bash -l
> #PBS -S /bin/bash
> #PBS -m ae
> #PBS -M tippensjl at ornl.gov
> #PBS -N parallel-worlds-jen
> #PBS -q workq
> #PBS -l nodes=2:ppn=2
> #PBS -l walltime=00:00:30,mem=1mb
> echo "Current working directory is `pwd`"
> echo $PBS_JOBID
> echo Tmp dir is $TMPDIR
> mpiexec /home/2vt/jenstests/message-passing-parallel-worlds/a.out
> But the variable for $TMPDIR is empty. Am I missing something?
> Thanks for all the help you give on this list.
Have you checked the mom_logs on the allocated nodes for any info?
Do you see a line like the one below after a mom restart? (To simplify
this step, instead of restarting the mom you can use momctl -r
</path/to/config>, which has the running mom re-read its config.)
03/23/2006 09:52:24;0002; pbs_mom;Svr;settmpdir;/scratch3
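One rough way to look for that record on a node is sketched below. The log directory is an assumption based on the /var/spool/pbs prefix used earlier in this thread, and find_settmpdir is only a helper name made up for illustration.

```shell
#!/bin/bash
# Hypothetical helper: print the settmpdir records from one mom log file.
# Usage: find_settmpdir /path/to/mom_log
find_settmpdir() {
    local logfile="$1"
    # The record of interest looks like "...;Svr;settmpdir;/scratch3"
    grep ';settmpdir;' "$logfile"
}

# Assumption: mom logs live under mom_logs and are named by date (YYYYMMDD).
# Override MOM_LOG_DIR if your install differs.
MOM_LOG_DIR="${MOM_LOG_DIR:-/var/spool/pbs/mom_logs}"
today_log="$MOM_LOG_DIR/$(date +%Y%m%d)"
if [ -r "$today_log" ]; then
    find_settmpdir "$today_log" || echo "no settmpdir record in $today_log"
fi
```

If no record shows up after a restart or a momctl -r, the mom most likely did not pick up the $tmpdir line from its config.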
When the job exits you will also see something like:
03/23/2006 09:58:01;0080; pbs_mom;Job;366.tadmin2;removing transient
job directory /scratch3/366.tadmin2
While testing I hit an error of my own: instead of /scratch3 I had
first set tmpdir to /scratch (which had not been created), and the mom logged:
03/23/2006 09:56:09;0001; pbs_mom;Svr;pbs_mom;Permission denied (13)
in TMakeTmpDir, Unable to make job transient directory: /scratch/365.tadmin2
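A small pre-flight check along these lines can catch that kind of mistake before any job runs. This is only a sketch: check_tmpdir_base is a made-up helper, and the sticky-bit test simply mirrors the /tmp-like setup described earlier in the thread. Nothing here establishes that pbs_mom itself requires the sticky bit.

```shell
#!/bin/bash
# Hypothetical helper: check that a $tmpdir base directory looks usable.
# Returns 0 if it exists as a directory with the sticky bit set (like /tmp);
# otherwise prints a reason and returns 1.
check_tmpdir_base() {
    local base="$1"
    if [ ! -d "$base" ]; then
        echo "$base: missing or not a directory"
        return 1
    fi
    if [ ! -k "$base" ]; then
        echo "$base: sticky bit not set (expected a mode like 1777)"
        return 1
    fi
    echo "$base: ok"
}

# Assumption: TMPDIR_BASE is whatever you put after $tmpdir in the mom config.
TMPDIR_BASE="${TMPDIR_BASE:-/scratch}"
check_tmpdir_base "$TMPDIR_BASE" || true
```

Run it on every compute node, since the directory has to exist locally wherever a mom will create the per-job directory.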
My test script:
#PBS -l nodes=1:ppn=1
#PBS -l walltime=1:00:00
#PBS -m be
#PBS -j oe
echo "tmpdir is $TMPDIR"
Output is:
TOCC Torque Scheduling System
Job Id: 367.tadmin2
tmpdir is /scratch3/367.tadmin2
My config looks like:
$usecp *:/home /home
$usecp *:/projects /projects
$usecp *:/scratch1 /scratch1
$usecp *:/scratch2 /scratch2
$usecp *:/scratch3 /scratch3
$usecp *:/scratch4 /scratch4
In reference to your comment:
> I would appreciate help from you
> creative folks on how to do something so that I can change the
> underlying filesystem without hearing the screams of users.
Remember that this is a "temp" dir and will be deleted after the job
exits. Before job launch, MOM will append the jobid
to the tmpdir basename and create the directory. After
the job exit, MOM will recursively delete it.
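Because of that cleanup, anything worth keeping has to be copied out before the job script ends. A minimal sketch of one way to do it follows. save_results and the results.$PBS_JOBID destination are illustrative choices, not something Torque provides, and the echo line stands in for the real work.

```shell
#!/bin/bash
#PBS -l nodes=1:ppn=1
#PBS -l walltime=1:00:00

# Hypothetical helper: copy the contents of the per-job temp dir to a
# destination that outlives the job.
save_results() {
    local src="$1" dest="$2"
    mkdir -p "$dest" && cp -r "$src"/. "$dest"/
}

# Only run the job body inside a job, where TMPDIR and the PBS variables are set.
if [ -n "$TMPDIR" ] && [ -n "$PBS_O_WORKDIR" ] && [ -n "$PBS_JOBID" ]; then
    cd "$TMPDIR" || exit 1
    echo "scratch output" > result.txt    # stand-in for the real work
    # Copy out before the script ends; the temp dir is removed at job exit.
    save_results "$TMPDIR" "$PBS_O_WORKDIR/results.$PBS_JOBID"
fi
```

Since users only ever reference $TMPDIR, the base directory in the mom config can later be pointed at a different filesystem without touching their scripts.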
Jerry D. Smith
Sandia National Laboratories