Workaround Re: [torqueusers] Non-cummulative pbsnodes -o command
James J Coyle
jjc at iastate.edu
Thu Feb 14 17:04:21 MST 2008
John,
I downloaded the source from Cluster Resources, then ran
tar -zxf ...
and
./configure; gmake; gmake install
I just offered the script as a workaround, and understood that
it would not scale to thousands of nodes. (I've got 144 on the
largest cluster I manage, but I run on clusters with thousands of nodes.)
Workarounds are expected in my environment. A workaround in no way
implies that a fix is not needed; it just takes the pressure off
while a proper fix is made.
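For a rough idea of when the node list would get too long for one
command line, a check along these lines could be used. This is only a
sketch: fits_on_command_line is a made-up helper name, and keeping to
half of ARG_MAX is an arbitrary safety margin I picked to leave room
for the environment and the command itself.

```shell
# Rough check of whether a list of node names fits on one command line.
# Usage: fits_on_command_line NODE...
# Returns 0 if the names take less than half of ARG_MAX, 1 otherwise.
fits_on_command_line() {
    # System limit on the combined size of arguments and environment
    limit=`getconf ARG_MAX`
    # Bytes needed for the names, one separator per name
    needed=`printf '%s ' "$@" | wc -c`
    # $needed is left unquoted on purpose: some wc versions pad with spaces
    [ $needed -lt `expr "$limit" / 2` ]
}
```

With 144 short names such as node143 the list should be far below the
limit; the check only starts to matter with much longer lists.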
-------------------------------------------------------------
Here is what I get; node143 is added and the nodes that were already
offline stay offline:
$ /usr/local/bin/pbsnodes -l
node140 offline
node141 offline
node144 offline
$ /usr/local/bin/pbsnodes -o node143
$ /usr/local/bin/pbsnodes -l
node140 offline
node141 offline
node143 offline
node144 offline
$ qmgr -c 'p s' | grep version
set server pbs_version = 2.1.2
$ ls -la /usr/local/bin/pbsnodes
-rwxr-xr-x 1 root root 50261 Oct 3 10:35 /usr/local/bin/pbsnodes
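
To check whether your own build keeps the earlier offline nodes,
something like the following might help. It is only a sketch: the
function names and the PBSNODES override are my own illustration, it
assumes pbsnodes -l prints one "name state" line per node as in the
listing above, and note that it really does mark the given node offline.

```shell
#!/bin/sh
# Sketch: report whether "pbsnodes -o NODE" preserves the nodes that
# were already offline. PBSNODES may be overridden, e.g. for a dry run
# against a stub.
PBSNODES=${PBSNODES:-/usr/local/bin/pbsnodes}

# Print the names of nodes currently listed as offline, one per line.
offline_nodes() {
    "$PBSNODES" -l | awk '/offline/ {print $1}' | sort
}

# Usage: check_cumulative NODE
# Returns 0 if every previously offline node is still offline after
# marking NODE offline, 1 if one was lost, 2 if pbsnodes -o failed.
check_cumulative() {
    before=`offline_nodes`
    "$PBSNODES" -o "$1" || return 2
    after=`offline_nodes`
    for n in $before; do
        # -F: fixed string, -x: whole line must match the node name
        echo "$after" | grep -Fx -- "$n" >/dev/null || {
            echo "lost offline state: $n"
            return 1
        }
    done
    echo "cumulative"
    return 0
}
```

Running check_cumulative node143 after sourcing this would print
"cumulative" if the earlier offline nodes are kept, or name the first
node that lost its offline state.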
- James Coyle
> Hello James
>
> Scripting wrappers is probably a suitable solution for the current cluster
> size but I suspect that it wouldn't take long to exceed the line limitations
> of a given command. The clusters at this site are only 128 nodes but at my
> previous job, we had clusters of about 6,000 nodes, quite a change to go
> from a large production cluster to a small development cluster. My
> previous employer embedded the scheduling functionality into their own
> application but my current employer wants a more generic HPC environment.
>
> Where did you get your Torque 2.1.2 installation? Binary rpm's? Source?
> Cluster Resources? Or elsewheres? I had compiled from source downloaded
> from Cluster Resources using the defaults from the ./configure script.
>
> Regards,
> John
>
>
> On 2/14/08 12:32 PM, "James J Coyle" <jjc at iastate.edu> wrote:
>
> > John,
> >
> > I don't get this behavior (version 2.1.2), but it would be quite annoying
> > if I did.
> >
> > If you'd like a fairly easy workaround, put the following script
> > in a directory that comes ahead of /usr/local/bin in your PATH and
> > name it pbsnodes.
> > E.g. call it /local/bin/pbsnodes
> > then issue chmod u+x /local/bin/pbsnodes
> > and then (if you're in csh or tcsh)
> > setenv PATH /local/bin:${PATH}
> > rehash
> >
> > Now pbsnodes -o
> > should work as you want it to, and pbsnodes without -o
> > is passed through unchanged to /usr/local/bin/pbsnodes
> >
> > An easy mod makes this work with -d
> > once that becomes available.
> >
> >
> >
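> > #!/bin/ksh
> >
> > # Directory holding the real pbsnodes
> > PBSDIR=/usr/local/bin
> >
> > # Match -o only as a whole argument, not inside a node name
> > OFLAG_PRESENT=`echo " $* " | grep -e ' -o '`
> > if [ -n "${OFLAG_PRESENT}" ] ; then
> >     # Re-list the nodes that are already offline so they stay offline
> >     ALREADY_OFFLINE="`${PBSDIR}/pbsnodes -l | awk '/offline/ {print $1}'`"
> >     ${PBSDIR}/pbsnodes "$@" ${ALREADY_OFFLINE}
> > else
> >     ${PBSDIR}/pbsnodes "$@"
> > fi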
> >
> >
> >
>
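
The csh/tcsh setup quoted above has a Bourne-style counterpart. A
minimal sketch for sh, ksh, or bash users, assuming the same example
location /local/bin for the wrapper (WRAPPER_DIR is just an
illustrative variable name):

```shell
# Bourne-style equivalent of the quoted setenv/rehash steps.
# /local/bin is the example wrapper location from the quoted message.
WRAPPER_DIR=/local/bin
PATH="${WRAPPER_DIR}:${PATH}"
export PATH
# Forget remembered command locations so the wrapper is found first
hash -r
```

Once the wrapper is installed, command -v pbsnodes should then report
/local/bin/pbsnodes rather than /usr/local/bin/pbsnodes.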