[torqueusers] setting a variable in prologue

Jerry Smith jdsmit at sandia.gov
Thu Mar 23 10:19:06 MST 2006

Aquarijen wrote:
> Hi Garrick,
> I see that in the pbs_mom man page the variable that you refer to... 
> So, I went to my /var/spool/pbs/mom_priv directory and made my config
> look like this:
> $logevent 127
> $loglevel 6
> $pbsserver b08l02
> $tmpdir /scratch
> $usecp *:/home /home
> $restricted b08l02,b08l01,b07l01,b07l02,b06l01,b06l02,b05l01,b05l02
> It is the same as how it looked before except I added the "$tmpdir
> /scratch" to it.  Scratch is set up like a /tmp - the sticky bit is
> set.  I pushed the config to all the nodes and restarted all the
> pbs_moms and even the pbs_server.
> In my simple submit script I do:
> #!/bin/bash -l
> #PBS -S /bin/bash
> #PBS -m ae
> #PBS -M tippensjl at ornl.gov
> #PBS -N parallel-worlds-jen
> #PBS -q workq
> #PBS -l nodes=2:ppn=2
> #PBS -l walltime=00:00:30,mem=1mb
> echo "Current working directory is `pwd`"
> echo $PBS_JOBID
> echo Tmp dir is $TMPDIR
> mpiexec /home/2vt/jenstests/message-passing-parallel-worlds/a.out
> But the variable for $TMPDIR is empty.  Am I missing something?
> Thanks for all the help you give on this list.
> -Jen
    Have you checked the mom_logs on the allocated nodes for any info?

    After a mom restart, do you see any lines like the one below? (To
simplify this step, instead of restarting the mom you can use
"momctl -r </path/to/config>", which has the running mom re-read the
config.)

03/23/2006 09:52:24;0002;   pbs_mom;Svr;settmpdir;/scratch3

When the job exits you will also see something like:

03/23/2006 09:58:01;0080;   pbs_mom;Job;366.tadmin2;removing transient job directory /scratch3/366.tadmin2
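
To pull just these two kinds of lines out of a mom log, something like the sketch below may help. The function name tmpdir_log_lines is made up for illustration, and the log path in the usage comment is an assumption based on the /var/spool/pbs home mentioned earlier in the thread; adjust both to your install.

```shell
#!/bin/bash
# Hypothetical helper: print the tmpdir-related pbs_mom log lines from the
# given log file(s). The two patterns are the message fragments quoted above.
tmpdir_log_lines() {
    grep -E 'settmpdir|removing transient job directory' "$@"
}

# Usage (path and date are assumptions, not taken from a real system):
#   tmpdir_log_lines /var/spool/pbs/mom_logs/20060323
```

If nothing matches after a restart or a "momctl -r", the mom probably did not pick up the $tmpdir line from its config.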

While testing I hit an error because I had first set tmpdir to /scratch
(which had not been created on the node) instead of /scratch3:

03/23/2006 09:56:09;0001;   pbs_mom;Svr;pbs_mom;Permission denied (13) in TMakeTmpDir, Unable to make job transient directory: /scratch/365.tadmin2
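
A quick way to rule this out on each node is to check the tmpdir base directory before restarting the mom. This is only a sketch: check_tmpdir is a made-up name, and the sticky-bit test simply mirrors the "set up like /tmp" description earlier in the thread. Nothing here says pbs_mom itself requires the sticky bit.

```shell
#!/bin/bash
# Hypothetical helper: report whether a directory looks usable as the
# $tmpdir base. Prints one status line; the return code is non-zero on a
# problem.
check_tmpdir() {
    local dir="$1"
    if [ ! -d "$dir" ]; then
        echo "missing: $dir"
        return 1
    fi
    # -k is true when the sticky bit is set (as on /tmp).
    if [ ! -k "$dir" ]; then
        echo "no sticky bit: $dir"
        return 2
    fi
    echo "ok: $dir"
}

# Usage: run on every compute node, e.g.
#   check_tmpdir /scratch
```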

My pbs_script:


#PBS -l nodes=1:ppn=1
#PBS -l walltime=1:00:00
#PBS -m be
#PBS -j oe

echo "tmpdir is $TMPDIR"

The output is:

TOCC Torque Scheduling System
Job Id: 367.tadmin2
Username: jdsmit

tmpdir is /scratch3/367.tadmin2

my config looks like:

$logevent 0x1ff
$pbsserver tadmin2
$node_check_script /var/spool/pbs/mom_priv/node_health.sh
$node_check_interval 30
$down_on_error 1
$tmpdir /scratch3
$usecp *:/home /home
$usecp *:/projects /projects
$usecp *:/scratch1 /scratch1
$usecp *:/scratch2 /scratch2
$usecp *:/scratch3 /scratch3
$usecp *:/scratch4 /scratch4

In reference to your comment:

> I would appreciate help from you
> creative folks on how to do something so that I can change the
> underlying filesystem without hearing the screams of users.

Remember that this is a "temp" dir and will be deleted after the job
exits.

From the pbs_mom man page:

    Before job launch, MOM will append the jobid to the tmpdir
    basename and create the directory.  After the job exit, MOM
    will recursively delete it.
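
Since the directory goes away with the job, a job script that wants to keep anything has to copy it out before exiting. A minimal sketch, assuming a single-node job; the save_scratch name, the result.txt placeholder and the $HOME/results destination are all made up for illustration:

```shell
#!/bin/bash
#PBS -l nodes=1:ppn=1
#PBS -l walltime=1:00:00

# Hypothetical helper: copy everything under a scratch directory to a
# persistent location, creating the destination if needed.
save_scratch() {
    local src="$1" dest="$2"
    mkdir -p "$dest" && cp -r "$src"/. "$dest"/
}

# Only do the work when running inside a job that has a per-job $TMPDIR.
if [ -n "${PBS_JOBID:-}" ] && [ -d "${TMPDIR:-}" ]; then
    cd "$TMPDIR" || exit 1
    echo "intermediate data" > result.txt    # placeholder for the real work
    # Copy results out before the mom removes $TMPDIR at job exit.
    save_scratch "$TMPDIR" "$HOME/results/$PBS_JOBID"
fi
```

Scripts written against $TMPDIR rather than a hard-coded /scratch path should keep working if the admin later points $tmpdir at a different filesystem.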

Jerry D. Smith
Sandia National Laboratories

