[torqueusers] 2 problems with torque-2.0.0p7

Martin Siegert siegert at sfu.ca
Mon Jan 30 18:22:39 MST 2006


I ran into two problems with torque-2.0.0p7:

1) The new code in configure to compile xpbsmon is almost perfect.
However, it fails on those (probably ancient) operating systems that
have libtclx.so instead of libtclx8.4.so, etc. The following little
patch should solve this:

--- configure.orig	Fri Jan 27 10:42:17 2006
+++ configure	Fri Jan 27 14:43:49 2006
@@ -1401,6 +1401,10 @@
     count=`/bin/ls $TCL_DIR/lib$libsuff/libtclx${TCLX_LIB_VER}.* 2> /dev/null | wc -l`
     if test "$count" -lt 1; then
+       TCLX_LIB_VER=''
+       count=`/bin/ls $TCL_DIR/lib$libsuff/libtclx.* 2> /dev/null | wc -l`
+    fi
+    if test "$count" -lt 1; then
         TCLX_LIB_VER=`echo $TCLX_LIB_VER | sed -e 's/\.//'`
         count=`/bin/ls $TCL_DIR/lib$libsuff/libtk${TCLX_LIB_VER}.* | wc -l`
         if test "$count" -lt 1; then
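For illustration, the lookup order the patch is after can be sketched as a
standalone shell function. This is only a sketch, not the configure code
itself: the function name find_tclx_suffix and its arguments are
hypothetical, and it tries the versioned name first, then the unversioned
one, then the version with the dot stripped. It keeps the original version
string in its own variable so that the dot-stripped form can still be
derived after the unversioned check.

```shell
# Hypothetical helper (not part of configure): print the first suffix
# for which a libtclx<suffix>.* file exists in directory $1.
# $2 is the dotted version to try first, e.g. 8.4.
find_tclx_suffix() {
    dir=$1
    ver=$2
    # Dot-stripped form of the version, e.g. 8.4 -> 84.
    nodot=`echo "$ver" | sed -e 's/\.//'`
    # Try libtclx8.4.*, then libtclx.*, then libtclx84.*
    for suffix in "$ver" "" "$nodot"; do
        count=`ls "$dir"/libtclx"$suffix".* 2> /dev/null | wc -l`
        if test "$count" -ge 1; then
            echo "$suffix"
            return 0
        fi
    done
    # No matching library was found.
    return 1
}
```

Called as find_tclx_suffix "$TCL_DIR/lib$libsuff" 8.4, it prints an empty
line on a system that only has libtclx.so and returns non-zero when no
candidate exists at all.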

2) This problem has to do with multi-homed hosts and is far more
serious, as it stops me dead in my tracks:

$PBS_HOME/server_name contains "b001"
$PBS_HOME/torque.cfg contains "SERVERHOST b001"

When I submit a job with qsub it returns jobids of the form 2345.<hostname>
instead of 2345.b001. This used to work in torque-2.0.0p3 (which is the
last version I used before switching to 2.0.0p7)! Thus, this broke
somewhere in versions 2.0.0p4 - 2.0.0p7. The effect is that, e.g.,

qdel 2345

does not work anymore: I always have to enter the full jobid
2345.<hostname>, which is rather annoying and, more importantly,
impossible to explain to users.
I suspect that the problem is with pbs_server since qsub reads the
server_name file correctly, e.g.,

(gdb) p pbs_server
$7 = 0x805edc0 "b001"
(gdb) where
#0  pbs_connect (server=0x805edc0 "b001") at ../Libifl/pbsD_connect.c:465
#1  0x0804e515 in cnt2server (server=0x8059e40 "") at cnt2server.c:111
#2  0x0804e237 in main (argc=2, argv=0xbfffddb4, envp=0xbfffddc0)
    at qsub.c:3345
#3  0x42017589 in __libc_start_main () from /lib/i686/libc.so.6
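
A quick way to check for the mismatch described above is to compare the
server part of a jobid returned by qsub with the contents of the
server_name file. The following is a minimal sketch; both function names
are hypothetical, and it assumes jobids of the form <number>.<server> as
shown above.

```shell
# Hypothetical helper: print the server part of a jobid such as
# "2345.b001", i.e. everything after the first dot.
jobid_server() {
    echo "$1" | sed -e 's/^[^.]*\.//'
}

# Hypothetical helper: succeed if the server part of jobid $1 equals
# the first line of the server_name file given as $2.
jobid_matches_server_name() {
    expected=`head -n 1 "$2"`
    actual=`jobid_server "$1"`
    test "$actual" = "$expected"
}
```

For example, jobid_matches_server_name "$jobid" "$PBS_HOME/server_name"
would fail for a jobid of the form 2345.<hostname> when server_name
contains "b001".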

Any help/suggestions on this second issue would be appreciated!


Martin Siegert
Head, HPC at SFU
WestGrid Site Manager
Academic Computing Services                        phone: (604) 291-4691
Simon Fraser University                            fax:   (604) 291-4242
Burnaby, British Columbia                          email: siegert at sfu.ca
Canada  V5A 1S6

