[torqueusers] setting a variable in prologue
Jerry Smith
jdsmit at sandia.gov
Thu Mar 23 10:19:06 MST 2006
Aquarijen wrote:
> Hi Garrick,
>
> I see that in the pbs_mom man page the variable that you refer to...
> So, I went to my /var/spool/pbs/mom_priv directory and made my config
> look like this:
>
> $logevent 127
> $loglevel 6
> $pbsserver b08l02
> $tmpdir /scratch
> $usecp *:/home /home
> $restricted b08l02,b08l01,b07l01,b07l02,b06l01,b06l02,b05l01,b05l02
>
> It is the same as how it looked before except I added the "$tmpdir
> /scratch" to it. Scratch is set up like a /tmp - the sticky bit is
> set. I pushed the config to all the nodes and restarted all the
> pbs_moms and even the pbs_server.
>
> In my simple submit script I do:
> #!/bin/bash -l
> #PBS -S /bin/bash
> #PBS -m ae
> #PBS -M tippensjl at ornl.gov
> #PBS -N parallel-worlds-jen
> #PBS -q workq
> #PBS -l nodes=2:ppn=2
> #PBS -l walltime=00:00:30,mem=1mb
> echo "Current working directory is `pwd`"
> echo $PBS_JOBID
> echo Tmp dir is $TMPDIR
> mpiexec /home/2vt/jenstests/message-passing-parallel-worlds/a.out
>
> But the variable for $TMPDIR is empty. Am I missing something?
>
> Thanks for all the help you give on this list.
>
> -Jen
>
>
>
Jen,
Have you checked the mom_logs on the allocated nodes for any info?
Do you see a line like the one below after a mom restart? (To simplify
this step, instead of restarting the mom you can run momctl -r
</path/to/config>, which has the running mom re-read the config.)
03/23/2006 09:52:24;0002; pbs_mom;Svr;settmpdir;/scratch3
When the job exits you will also see something like:
03/23/2006 09:58:01;0080; pbs_mom;Job;366.tadmin2;removing transient
job directory /scratch3/366.tadmin2
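A quick way to pull just these lines out of a mom log is a small grep wrapper. This is only a sketch: the function name tmpdir_log_lines is made up for illustration, and the patterns are simply the strings that appear in the log lines quoted in this message.

```shell
#!/bin/bash
# Print the tmpdir-related lines from one pbs_mom log file.
# Usage: tmpdir_log_lines <logfile>
# (tmpdir_log_lines is a hypothetical helper, not part of TORQUE.)
tmpdir_log_lines() {
    local logfile="$1"
    # settmpdir               : seen after the mom reads $tmpdir from its config
    # TMakeTmpDir             : seen when creating the per-job directory fails
    # transient job directory : seen on creation failure and on cleanup
    grep -E 'settmpdir|TMakeTmpDir|transient job directory' "$logfile"
}
```

For example, assuming the logs sit in a mom_logs directory next to mom_priv and are named by date, something like tmpdir_log_lines /var/spool/pbs/mom_logs/20060323 on an allocated node should show whether the mom picked up the setting. Adjust the path to your install.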
While testing I hit an error because I had set tmpdir to /scratch
(which had not been created) instead of /scratch3:
03/23/2006 09:56:09;0001; pbs_mom;Svr;pbs_mom;Permission denied (13)
in TMakeTmpDir, Unable to make job transient directory: /scratch/365.tadmin2
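Since a missing base directory was enough to produce this error for me, it may be worth checking the directory on every node before digging further. A minimal sketch follows; check_tmpdir_base is a made-up name, and the test only covers "exists and is writable by whoever runs the check", so run it as the same user pbs_mom runs as.

```shell
#!/bin/bash
# Return 0 if the given directory looks usable as a $tmpdir base on this node.
# Usage: check_tmpdir_base <dir>
# (check_tmpdir_base is a hypothetical helper, not part of TORQUE.)
check_tmpdir_base() {
    local dir="$1"
    if [ ! -d "$dir" ]; then
        echo "$dir: does not exist or is not a directory" >&2
        return 1
    fi
    if [ ! -w "$dir" ]; then
        echo "$dir: not writable by $(id -un)" >&2
        return 2
    fi
    echo "$dir: ok"
}
```

Running check_tmpdir_base /scratch on each allocated node would have caught my mistake above right away.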
My pbs_script:
#!/bin/bash
#PBS -l nodes=1:ppn=1
#PBS -l walltime=1:00:00
#PBS -m be
#PBS -j oe
echo "tmpdir is $TMPDIR"
output is :
TOCC Torque Scheduling System
Job Id: 367.tadmin2
Username: jdsmit
tmpdir is /scratch3/367.tadmin2
My config looks like:
$logevent 0x1ff
$pbsserver tadmin2
$node_check_script /var/spool/pbs/mom_priv/node_health.sh
$node_check_interval 30
$down_on_error 1
$tmpdir /scratch3
$usecp *:/home /home
$usecp *:/projects /projects
$usecp *:/scratch1 /scratch1
$usecp *:/scratch2 /scratch2
$usecp *:/scratch3 /scratch3
$usecp *:/scratch4 /scratch4
In reference to your comment:
> I would appreciate help from you creative folks on how to do something
> so that I can change the underlying filesystem without hearing the
> screams of users.
Remember that this is a "temp" dir and will be deleted after the job
finishes.
From the pbs_mom man page:
Before job launch, MOM will append the jobid to the tmpdir basename and
create the directory. After the job exit, MOM will recursively delete it.
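Because the directory goes away when the job exits, anything worth keeping has to be copied out before the script ends. Here is one way a job script could do that, as a rough sketch only: run_in_job_tmpdir and result.dat are made-up names, and it assumes the program writes its output into the current directory.

```shell
#!/bin/bash
# Run a command inside the per-job scratch directory, then copy one result
# file to a permanent directory before the job ends.
# Usage: run_in_job_tmpdir <dest_dir> <result_file> <command> [args...]
# (run_in_job_tmpdir is a hypothetical helper, not part of TORQUE.)
run_in_job_tmpdir() {
    local dest="$1" result="$2"
    shift 2
    if [ -z "$TMPDIR" ] || [ ! -d "$TMPDIR" ]; then
        echo "TMPDIR is unset or missing; check \$tmpdir in the mom config" >&2
        return 1
    fi
    (
        # Subshell, so the caller's working directory is left alone.
        cd "$TMPDIR" || exit 1
        "$@" || exit 1
        cp -- "$result" "$dest"/
    )
}
```

In a job script this might be called as run_in_job_tmpdir "$PBS_O_WORKDIR" result.dat ./a.out, so the result lands back in the submit directory and users never need to know which filesystem sits under $TMPDIR.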
-Jerry
----------------------------------------
Jerry D. Smith
Sandia National Laboratories