[torqueusers] pbsdsh oddity

Chris Samuel csamuel at vpac.org
Tue Jul 15 22:23:22 MDT 2008


This caught Gareth from CSIRO and me by surprise today.

We realised that we could easily script up a replacement
for ssh/rsh like this:

------------------8< snip snip 8<-----------------

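#!/bin/bash

usage="usage: $0 <node name> <command>"

# Need at least a node name and a command to run on it.
if [ $# -lt 2 ]
then
        echo "$usage" >&2
        exit 1
fi

node=$1

shift

# "$@" keeps each remaining argument intact, unlike unquoted $*.
pbsdsh -h "$node" "$@"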

------------------8< snip snip 8<-----------------

The problem was that although I had a $PBS_NODEFILE
that said:

tango088
tango085

it wouldn't work (I'd get a "tango078 not found" error
for instance).  Running it with the -v option showed:

$ pbsdsh -v -h tango088 w
pbsdsh: rescinfo from 0: Linux tango088.vpac.org 2.6.25.10 #2 SMP Thu Jul 3 16:29:21 EST 2008 x86_64:nodes=2:ppn=1,pmem=4000mb,walltime=00:10:00
pbsdsh: rescinfo from 1: Linux tango085.vpac.org 2.6.25.10 #2 SMP Thu Jul 3 16:29:21 EST 2008 x86_64:nodes=2:ppn=1,pmem=4000mb,walltime=00:10:00
[...]

So it's using the uname value of the hostname rather than
what the PBS node name is, which was a bit of a surprise. :-)
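
A minimal sketch of one local workaround, assuming every node in the
job shares a single DNS domain and that pbsdsh will accept the fully
qualified name it reports in the rescinfo lines. The function name
qualify_node is hypothetical, not part of TORQUE:

```shell
#!/bin/bash

# Hypothetical helper: append a domain to an unqualified node name.
# Names that already contain a dot are passed through unchanged, as is
# any name when no domain is supplied.
qualify_node() {
    local node=$1 domain=$2
    if [[ $node == *.* || -z $domain ]]; then
        printf '%s\n' "$node"
    else
        printf '%s.%s\n' "$node" "$domain"
    fi
}

# Possible use inside the wrapper script (assumes "hostname -d" prints
# the DNS domain of the current node, as it does on Linux):
#   pbsdsh -h "$(qualify_node "$node" "$(hostname -d)")" "$@"
```

With this, a short name such as tango088 would be turned into
tango088.vpac.org before being handed to pbsdsh.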

Obviously we can script around that locally, but I was
wondering if pbsdsh should be a bit smarter about its
comparisons, say by checking whether the hostname given
by the user has no dot while the hostname in uname does,
then retokenising on "." and comparing the short form
hostnames.
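
The comparison being suggested can be sketched in shell. This is only
an illustration of the idea, not how pbsdsh works; it assumes the
intent is that an unqualified name should match the short form of a
qualified one, and the function names short_host and hosts_match are
made up for the example:

```shell
#!/bin/bash

# Hypothetical helper: keep only the part before the first dot.
short_host() {
    printf '%s\n' "${1%%.*}"
}

# Hypothetical helper: succeed if the name given by the user should be
# treated as matching the hostname reported by uname.
#   given    - name passed to -h
#   reported - hostname from the node's uname output
hosts_match() {
    local given=$1 reported=$2
    if [[ $given != *.* && $reported == *.* ]]; then
        # Unqualified vs. qualified: compare short forms only.
        [[ $given == "$(short_host "$reported")" ]]
    else
        # Otherwise fall back to an exact comparison.
        [[ $given == "$reported" ]]
    fi
}
```

Under this rule tango088 would match tango088.vpac.org, while two
different fully qualified names would still have to match exactly.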

Thoughts?

cheers,
Chris
-- 
Christopher Samuel - (03) 9925 4751 - Systems Manager
 The Victorian Partnership for Advanced Computing
 P.O. Box 201, Carlton South, VIC 3053, Australia
VPAC is a not-for-profit Registered Research Agency

