[torqueusers] setting a variable in prologue

Jerry Smith jdsmit at sandia.gov
Thu Mar 23 10:19:06 MST 2006



Aquarijen wrote:
> Hi Garrick,
>
> I see that in the pbs_mom man page the variable that you refer to... 
> So, I went to my /var/spool/pbs/mom_priv directory and made my config
> look like this:
>
> $logevent 127
> $loglevel 6
> $pbsserver b08l02
> $tmpdir /scratch
> $usecp *:/home /home
> $restricted b08l02,b08l01,b07l01,b07l02,b06l01,b06l02,b05l01,b05l02
>
> It is the same as how it looked before except I added the "$tmpdir
> /scratch" to it.  Scratch is set up like a /tmp - the sticky bit is
> set.  I pushed the config to all the nodes and restarted all the
> pbs_moms and even the pbs_server.
>
> In my simple submit script I do:
> #!/bin/bash -l
> #PBS -S /bin/bash
> #PBS -m ae
> #PBS -M tippensjl at ornl.gov
> #PBS -N parallel-worlds-jen
> #PBS -q workq
> #PBS -l nodes=2:ppn=2
> #PBS -l walltime=00:00:30,mem=1mb
> echo "Current working directory is `pwd`"
> echo $PBS_JOBID
> echo Tmp dir is $TMPDIR
> mpiexec /home/2vt/jenstests/message-passing-parallel-worlds/a.out
>
> But the variable for $TMPDIR is empty.  Am I missing something?
>
> Thanks for all the help you give on this list.
>
> -Jen
>
>
>   
Jen,

    Have you checked the mom_logs on the allocated nodes for any info?

    After a mom restart, do you see a line like the one below? (To
simplify this step, instead of restarting the mom you can use
"momctl -r </path/to/config>", which has the running mom re-read its
config.)

03/23/2006 09:52:24;0002;   pbs_mom;Svr;settmpdir;/scratch3

When the job exits you will also see something like:

03/23/2006 09:58:01;0080;   pbs_mom;Job;366.tadmin2;removing transient 
job directory /scratch3/366.tadmin2
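
To pull only the tmpdir-related entries out of a mom log, a filter along these lines may help. It is a minimal sketch: the function name tmpdir_log_lines is made up for illustration, and the patterns are simply the strings that appear in the log lines quoted in this message (each of which is a single line in the real log, even where the mail wraps it).

```shell
# Hypothetical helper: keep only the mom log lines that concern $tmpdir.
# Reads log text on stdin. The patterns are the strings seen in the
# log entries quoted in this message, nothing more authoritative.
tmpdir_log_lines() {
    grep -E 'settmpdir|TMakeTmpDir|transient job directory'
}
```

Usage would be something like "tmpdir_log_lines < /var/spool/pbs/mom_logs/20060323", assuming the logs sit next to mom_priv and are named by date; adjust the path for your install.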

While testing I hit an error of my own: instead of /scratch3 I had
set tmpdir to /scratch (which had not been created), and the mom logged:

03/23/2006 09:56:09;0001;   pbs_mom;Svr;pbs_mom;Permission denied (13) 
in TMakeTmpDir, Unable to make job transient directory: /scratch/365.tadmin2
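
A quick check of the tmpdir base directory on each node can rule this kind of error out before restarting anything. The sketch below is only an illustration: check_tmpdir_base is a made-up name, and it tests existence and writability for whoever runs it, which is not necessarily the same check pbs_mom performs.

```shell
# Hypothetical helper: report whether a $tmpdir base directory looks usable.
# Prints "missing", "not writable", or "ok" for the directory given as $1.
check_tmpdir_base() {
    dir="$1"
    if [ ! -d "$dir" ]; then
        echo "missing"
    elif [ ! -w "$dir" ]; then
        echo "not writable"
    else
        echo "ok"
    fi
}
```

For example, "check_tmpdir_base /scratch" run on each compute node would show "missing" on any node where the directory was never created.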


My pbs_script:

#!/bin/bash

#PBS -l nodes=1:ppn=1
#PBS -l walltime=1:00:00
#PBS -m be
#PBS -j oe

echo "tmpdir is $TMPDIR"

Output is:

TOCC Torque Scheduling System
Job Id: 367.tadmin2
Username: jdsmit

tmpdir is /scratch3/367.tadmin2
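
If TMPDIR may be empty on some nodes, as in your case, the job script can say so explicitly instead of printing a blank value. This is a sketch only; describe_tmpdir is a hypothetical name, not anything TORQUE provides.

```shell
# Hypothetical helper: print TMPDIR, or state clearly that it is unset/empty.
# Returns non-zero when TMPDIR is missing so the job script can react.
describe_tmpdir() {
    if [ -n "${TMPDIR:-}" ]; then
        echo "tmpdir is $TMPDIR"
    else
        echo "TMPDIR is not set on $(uname -n)" >&2
        return 1
    fi
}
```

Calling "describe_tmpdir || exit 1" near the top of a job script would stop the job early on a node where the mom did not set the variable.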



My config looks like:

$logevent 0x1ff
$pbsserver tadmin2
$node_check_script /var/spool/pbs/mom_priv/node_health.sh
$node_check_interval 30
$down_on_error 1
$tmpdir /scratch3
$usecp *:/home /home
$usecp *:/projects /projects
$usecp *:/scratch1 /scratch1
$usecp *:/scratch2 /scratch2
$usecp *:/scratch3 /scratch3
$usecp *:/scratch4 /scratch4


In reference to your comment:

> I would appreciate help from you
> creative folks on how to do something so that I can change the
> underlying filesystem without hearing the screams of users.


Remember that this is a "temp" dir and will be deleted after the job 
finishes.

From the pbs_mom man page:

    Before job launch, MOM will append the jobid to the tmpdir
    basename and create the directory. After the job exit, MOM
    will recursively delete it.
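
Since the directory is removed when the job exits, anything worth keeping has to be copied out before the script ends. One way to do that is sketched below; save_results is a made-up helper, and the destination is whatever persistent location suits your site.

```shell
# Hypothetical helper: copy everything in the job's temp dir ($1)
# to a persistent destination ($2), creating the destination if needed.
save_results() {
    src="$1"
    dest="$2"
    mkdir -p "$dest" && cp -R "$src"/. "$dest"/
}
```

In a job script this could be hooked in with a trap, for example: trap 'save_results "$TMPDIR" "$PBS_O_WORKDIR/results.$PBS_JOBID"' EXIT. The results directory name there is just an example.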




-Jerry
----------------------------------------
Jerry D. Smith
Sandia National Laboratories




