[torqueusers] 2.5.10, does anyone have problems with pbs variables?

Sreedhar Manchu sm4082 at nyu.edu
Tue Mar 13 11:44:36 MDT 2012


Hi,

I installed Torque 2.5.10 on our systems, and since then we have had problems with PBS variables such as PBS_NODEFILE, PBS_JOBID, PBS_JOBNAME and PBS_O_WORKDIR: they are sometimes missing from the job environment. Surprisingly, on the same node they work for one user and not for another. As a workaround, our qsub wrapper adds a line to every PBS script that sources another script to define the variables. But that line comes only after the PBS directives, so if anyone uses these variables in the #PBS -e and #PBS -o directives, the jobs fail.

Is anyone else facing the same problem? I am not sure whether installing 2.5.11 would fix this, as I didn't see anything about it in the changelogs.

Here is one example. One user always has problems with the PBS_NODEFILE variable (she runs parallel jobs with tcsh scripts). For now it is manageable: I asked her to give absolute paths in the #PBS -e and -o lines, and PBS_NODEFILE is defined by the script that the qsub wrapper adds to the PBS script (the added script finds the node file in /opt/torque/aux/ and exports PBS_NODEFILE with that filename).
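
For illustration, the fallback described above could look roughly like the sketch below. This is not the actual script from our wrapper. It assumes the node file under /opt/torque/aux/ is named after the job ID, and the names restore_pbs_nodefile and TORQUE_AUX_DIR are made up for this example. It is written for sh/bash; a tcsh script would use setenv instead of export.

```shell
# Sketch of a PBS_NODEFILE fallback, meant to be sourced after the #PBS directives.
# Assumption: the node file lives at <aux dir>/<job id>, e.g. /opt/torque/aux/$PBS_JOBID.
# restore_pbs_nodefile and TORQUE_AUX_DIR are hypothetical names, not part of Torque.
restore_pbs_nodefile() {
    _aux_dir="${TORQUE_AUX_DIR:-/opt/torque/aux}"

    # Nothing to do if the variable already points at a readable file.
    if [ -n "${PBS_NODEFILE:-}" ] && [ -r "$PBS_NODEFILE" ]; then
        return 0
    fi

    # Without a job ID there is no way to work out the file name.
    if [ -z "${PBS_JOBID:-}" ]; then
        echo "PBS_JOBID is not set; cannot locate the node file" >&2
        return 1
    fi

    _candidate="$_aux_dir/$PBS_JOBID"
    if [ -r "$_candidate" ]; then
        PBS_NODEFILE="$_candidate"
        export PBS_NODEFILE
        return 0
    fi

    echo "no node file found at $_candidate on $(hostname)" >&2
    return 1
}
```

A job script would call restore_pbs_nodefile once before its first use of $PBS_NODEFILE and stop if it returns non-zero. This cannot help the #PBS -e and -o lines, since those are read before any of the script runs.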

The script emails me whenever it doesn't find this variable. This is an example:


on host compute-8-2.local for the parallel job.
"env | grep PBS" output
PBS_O_QUEUE=p12
PBS_O_HOST=login-0-1.local
PBS_O_HOME=/home/gs****
PBS_O_LANG=en_US.iso885915
PBS_O_LOGNAME=gs****
PBS_O_PATH=/home/gs****/bin:/bin:/share/apps/grace/5.1.22/intel/grace/bin:/share/apps/autodocksuite/4.2.1/intel/bin:/share/apps/grace/5.1.22/intel/grace/bin:/share/apps/apbs/1.1.0/intel/bin:/share/apps/gromacs/4.0.5/intel-mvapich/bin:/share/apps/amber11/intel-mvapich/amber11//bin:/share/apps/python/2.6.4/gnu/bin:/share/apps/gromacs/4.0.5/intel-mvapich/bin:/share/apps/amber11/intel-mvapich/amber11/exe:/usr/mpi/intel/mvapich-1.1.0/bin:/share/apps/matlab/R2009b/bin:/share/apps/mathematica/7.0/Executables/:/share/apps/vmd/1.8.7/:/share/apps/molden/4.7/gnu:/share/apps/mpiexec/0.84/gnu/bin:/share/apps/gaussian/G03-E01/intel/g03:/share/apps/intel/Compiler/11.1/046/bin/intel64:/usr/kerberos/bin:/usr/java/latest/bin:/usr/local/bin:/bin:/usr/bin:/opt/ganglia/bin:/opt/ganglia/sbin:/opt/maui/bin:/opt/torque/bin:/opt/torque/sbin:/opt/rocks/bin:/opt/rocks/sbin:/opt/dell/srvadmin/bin:/opt/torque/bin:/opt/torque/sbin:/home/gs****/jobscript/gromacs/Hpluplus2oplsaa/:/share/apps/gromacs/4.0.5/intel-mvapich/bin:/home/gs****/coarse-grain/ElNeDyn:/home/gs****/coarse-grain:/home/gs****/jobscript/:/home/gs****/jobscript/gromacs/coarse-grain/:/home/gs****/jobscript/gromacs/:/home/gs****/nedit/:/share/apps/mmtsb_toolset/intel/bin:/share/apps/mmtsb_toolset/intel/perl:/share/apps/dssp/intel/dsspcmbi:/share/apps/python/2.6.4/gnu/bin:/share/apps/time-scapes/1.2.2/intel/test:/share/apps/time-scapes/1.2.2/intel/bin:/share/apps/pymol/1.2r3pre/gnu/bin:/home/gs****/jobscript/gromacs/Hpluplus2amber99sb/:/home/gs****/jobscript/perl/:/share/apps/dssp/intel/:/home/gs****/jobscript/autodocktools/:/share/apps/mmtsb_toolset/perl/:/home/gs****/jobscript/docking:/share/apps/tinker/4.2/intel/bin/:/home/gs****/bin
PBS_O_MAIL=/var/spool/mail/gs****
PBS_O_SHELL=/bin/bash
PBS_SERVER=*****.its.nyu.edu
PBS_O_WORKDIR=/scratch/gs****/amber/ache.somSER/mdrunE199H/qm-mm-large3WAT/methylMIG-singleRC/qmmmMD/MD.0.24.80
PBS_JOBNAME=MD.0.24.80
PBS_JOBID=219825.crunch.local
PBS_QUEUE=p12
PBS_JOBCOOKIE=C4EA940FBC845949E6DB4D1BD7855EA0
PBS_NODENUM=0
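
Note that PBS_NODEFILE does not appear in the listing above, although PBS_JOBID, PBS_JOBNAME and PBS_O_WORKDIR are set. A check along these lines could be dropped into a job script to report which of the variables mentioned in this message are missing. It is only a sketch, and missing_pbs_vars is a made-up helper name.

```shell
# Print the names of the expected PBS variables that are unset or empty.
# The list is just the variables named in this message; extend it as needed.
# missing_pbs_vars is a hypothetical helper, not something Torque provides.
missing_pbs_vars() {
    _missing=""
    for _name in PBS_NODEFILE PBS_JOBID PBS_JOBNAME PBS_O_WORKDIR; do
        # Indirect lookup through eval keeps this usable from plain sh.
        eval "_value=\${$_name:-}"
        if [ -z "$_value" ]; then
            _missing="$_missing $_name"
        fi
    done
    # Drop the leading space; prints an empty line when nothing is missing.
    echo "${_missing# }"
}
```

Running missing_pbs_vars at the top of a failing job and again in a working one, on the same node, would show whether the two users differ only in PBS_NODEFILE or in other variables as well.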


I configured Torque like this:

./configure --prefix=/opt/torque --libdir=/opt/torque/lib64 --with-default-server=crunch.its.nyu.edu --with-server-home=/opt/torque --enable-docs --enable-syslog --disable-gui --enable-blcr --disable-spool --enable-cpuset --enable-geometry-requests --enable-server-xml --with-pam=/lib64/security

If it's happening to anyone else, please respond to this email.

Thanks,
Sreedhar.
