[torqueusers] pbsdsh implementation bug?
Marcin Mogielnicki
mar at pism.pl
Fri Jan 4 09:45:19 MST 2008
Hi all,
In short, aux/JOBID entries do not work when used with 'pbsdsh -h'.
I'm trying to use the pbsdsh -h option and I discovered that the node names
from aux/JOBID are not used. What pbsdsh does instead is grep the name out
of the 'uname -a' output for every node and compare it to the -h argument.
The implication is that the hostname must be exactly the same as the label
given in the torque nodes file, which is often not the case. If I want to
execute something on, for example, the second node assigned to the job, I
definitely expect the second entry from the aux/JOBID file to work...
It doesn't look like a big problem, but in fact it sometimes is. I have
encountered a number of commercial applications that require a command to
be defined for executing anything on remote nodes. In one specific case,
for example, 'rsh -l %U %H' should be replaced by 'pbsdsh -h %H'. The
application takes its node names from PBS, so a crash is guaranteed here.
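One possible workaround, sketched here under assumptions rather than tested
against a real cluster, is a small wrapper that looks the PBS-supplied name
up in the node file and passes its position to 'pbsdsh -n' instead of
relying on 'pbsdsh -h'. The function names node_index and run_on_node are
hypothetical. The sketch assumes that $PBS_NODEFILE points at the aux/JOBID
file and that 'pbsdsh -n' numbers nodes in the same order as that file,
starting at 0.

```shell
#!/bin/sh
# Hypothetical helper: print the zero-based position of the first line in a
# node file that equals the given name. Returns non-zero if the name is absent.
node_index() {
    # $1 = node name as listed by PBS, $2 = node file (default: $PBS_NODEFILE)
    awk -v name="$1" '
        $0 == name { print NR - 1; found = 1; exit }
        END { exit !found }
    ' "${2:-$PBS_NODEFILE}"
}

# Hypothetical wrapper to configure in place of "pbsdsh -h %H":
#   run_on_node NAME COMMAND [ARGS...]
run_on_node() {
    name=$1; shift
    idx=$(node_index "$name") || { echo "unknown node: $name" >&2; return 1; }
    # Assumption: "pbsdsh -n" counts nodes in node-file order, from 0.
    pbsdsh -n "$idx" "$@"
}
```

With this in place the application would call something like
'run_on_node %H command' and the lookup would no longer depend on what
hostname each node reports.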
It looks like a programmer's shortcut to me: the proper, lengthy operations
on a number of structures, needed to obtain the PBS-defined node name, were
replaced by just a few lines based on a mostly right assumption. Mostly,
but not always. What I'm interested in is confirming whether this is
considered buggy behaviour (i.e. a design flaw) at all. Are there any plans
to fix it?
Regards,
Marcin Mogielnicki