[torqueusers] files being written to wrong batch node.

Jack Wilkinson jwilkinson at stoneeagle.com
Mon Oct 22 15:48:11 MDT 2012


We've just configured a development batch farm so that our development folk don't trash the production environment.

It's one head box and two batch boxes, all running CentOS 6.3 and configured with Torque 2.57 and Maui 3.3-4.

Everything is running as expected except for one issue.  In listings four and five, notice that the "11111111" and "22222222" markers appear in the output file of the correct submit file, but the hostname in each output file shows that the job ran on the opposite node from the one requested.  Then, in listings six and seven, taken on the batch boxes, the files written to each box carry the name of the job that requested the other host, yet the content of each .host file matches the host it is listed on.
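
The comparison described above can be checked mechanically.  The following is only a minimal sketch: the function names (requested_node, actual_node, check_job) are hypothetical helpers, not part of Torque, and it assumes the submit file names the host in a "#PBS -l nodes=<name>" line and that the first line of the output file comes from the `hostname` command, as in the listings here.

```shell
#!/bin/bash
# Hypothetical helpers (not part of Torque): compare the node requested in a
# submit file with the host recorded on the first line of the job's output.

# Print the last host-style "#PBS -l nodes=<name>" value in a submit file.
# Count-style requests such as "nodes=1:ppn=1" are skipped because the value
# must start with a letter; commented-out "###PBS" lines do not match either.
requested_node() {
    sed -n 's/^#PBS[[:space:]]\{1,\}-l[[:space:]]\{1,\}nodes=\([A-Za-z][^[:space:]:]*\).*/\1/p' "$1" | tail -n 1
}

# Print the first line of an output file (the job script runs `hostname` first).
actual_node() {
    head -n 1 "$1"
}

# Compare the two names case-insensitively; return 0 on a match, 1 otherwise.
check_job() {
    local want got
    want=$(requested_node "$1" | tr '[:upper:]' '[:lower:]')
    got=$(actual_node "$2" | tr '[:upper:]' '[:lower:]')
    if [ "$want" = "$got" ]; then
        echo "OK: $1 ran on $got"
    else
        echo "MISMATCH: $1 requested $want but ran on $got"
        return 1
    fi
}
```

After sourcing the file, `check_job one-1.sbm one-1.out` run against the listings in this post would report a mismatch between srvdevbatch01 and srvdevbatch02.  The comparison is on short names only, so it would need adjusting if `hostname` returned a fully qualified name.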

This is utterly screwy!!

Anyone have any idea?

Kind regards,
jack
________________________________
one
$ cat go.sh
qsub one-1.sbm
qsub one-2.sbm
________________________________
two
$ cat one-1.sbm
#!/bin/bash

#PBS -N testone-1.1234

#PBS -l nodes=1:ppn=1
#PBS -l nodes=srvdevbatch01

###PBS -m e
###PBS -M jwilkinson

#PBS -o /home/jwilkinson/onetest/one-1.out
#PBS -e /home/jwilkinson/onetest/one-1.err

#PBS -l nice=19
#PBS -l walltime=00:01:00

hostname
hostname > one-1.host
echo "11111111111111111111111111111111"
date
ls -lRa /SRVFS/dev-bogner | wc
ls -lR /SRVFS/dev-bogner/PRINT > one-1.ls
sleep 15
date
exit 0
________________________________
three
$ cat one-2.sbm
#!/bin/bash

#PBS -N testone-2.1234

#PBS -l nodes=1:ppn=1
#PBS -l nodes=srvdevbatch02

###PBS -m e
###PBS -M jwilkinson

#PBS -o /home/jwilkinson/onetest/one-2.out
#PBS -e /home/jwilkinson/onetest/one-2.err

#PBS -l nice=19
#PBS -l walltime=00:01:00

hostname
hostname > one-2.host
echo "22222222222222222222222222222222"
date
ls -lRa /SRVFS/dev-bogner | wc
ls -lR /SRVFS/dev-bogner/PRINT > one-2.ls
sleep 15
date
exit 0
________________________________
four
$ cat one-1.out
srvDevBatch02
11111111111111111111111111111111
Mon Oct 22 14:43:02 CDT 2012
   3280   20241  174637
Mon Oct 22 14:43:18 CDT 2012
________________________________
five
$ cat one-2.out
srvDevBatch01
22222222222222222222222222222222
Mon Oct 22 14:43:02 CDT 2012
   3280   20241  174637
Mon Oct 22 14:43:18 CDT 2012
________________________________
six
On srvdevbatch01:
$ ls -l
-rw-rw-r--. 1 jwilkinson jwilkinson    14 Oct 22 14:43 one-2.host
-rw-rw-r--. 1 jwilkinson jwilkinson 39072 Oct 22 14:43 one-2.ls
$ cat one-2.host
srvDevBatch01
________________________________
seven
On srvdevbatch02:
$ ls -l
-rw-rw-r--. 1 jwilkinson jwilkinson    14 Oct 22 14:43 one-1.host
-rw-rw-r--. 1 jwilkinson jwilkinson 39072 Oct 22 14:43 one-1.ls
$ cat one-1.host
srvDevBatch02
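
To narrow down whether the job is being placed on the wrong node or only reported against the wrong name, it may help to have the job print what the batch system itself says about the allocation.  This is a rough sketch, not a known fix: job_diag is a hypothetical helper to paste into a submit script, and it relies only on the PBS_JOBID, PBS_JOBNAME, PBS_O_HOST, PBS_O_WORKDIR and PBS_NODEFILE variables that Torque exports to a running job.

```shell
#!/bin/bash
# Hypothetical diagnostic helper for a submit script: print what the batch
# system reports about this job next to what the executing host reports.
job_diag() {
    echo "job id:        ${PBS_JOBID:-unset}"
    echo "job name:      ${PBS_JOBNAME:-unset}"
    echo "submit host:   ${PBS_O_HOST:-unset}"
    echo "submit dir:    ${PBS_O_WORKDIR:-unset}"
    # uname -n prints the node name of the machine actually running the job.
    echo "exec host:     $(uname -n)"
    echo "current dir:   $(pwd)"
    # PBS_NODEFILE names a file listing the node(s) allocated to this job.
    if [ -n "${PBS_NODEFILE:-}" ] && [ -r "$PBS_NODEFILE" ]; then
        echo "allocated:     $(sort -u "$PBS_NODEFILE" | tr '\n' ' ')"
    else
        echo "allocated:     unknown (PBS_NODEFILE not set)"
    fi
}
```

Calling `job_diag` near the top of one-1.sbm and one-2.sbm would put the allocated node and the executing host side by side in each .out file.  If "allocated" and "exec host" agree with each other but not with the requested node, the placement itself is off; if they disagree, the name-to-host mapping is the more likely suspect.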

