[torqueusers] pbsdsh implementation bug?

Marcin Mogielnicki mar at pism.pl
Fri Jan 4 09:45:19 MST 2008


Hi all,

The problem is that aux/JOBID entries do not work when used with 'pbsdsh -h'.

I'm trying to use the pbsdsh -h option and I discovered that the node names 
from aux/JOBID are not used. What pbsdsh does is grep the name out 
of the 'uname -a' output for every node and compare it to the -h argument. So 
the implication is that the hostname must be exactly the same as the label 
given in the torque nodes file, which is often not true. Well, if I want to 
execute something on the second assigned node, for example, I definitely 
expect the second entry from the aux/JOBID file to work...
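
To illustrate the mismatch, here is a minimal Python sketch. It is not taken 
from the pbsdsh source; it only models the behaviour described above. The node 
labels, the hostnames and both function names are made up for the example.

```python
# Hypothetical data: labels as they would appear in aux/JOBID (taken from the
# torque nodes file), and the hostname each node reports about itself.
aux_entries = ["n01", "n02"]
reported_hostnames = {
    "n01": "node01.cluster.example",
    "n02": "node02.cluster.example",
}


def match_by_reported_hostname(target, entries, reported):
    """Model of the described behaviour: compare the -h argument with the
    name each node reports, ignoring the label stored in aux/JOBID."""
    for index, label in enumerate(entries):
        if reported[label] == target:
            return index
    return None


def match_by_aux_label(target, entries):
    """Model of the expected behaviour: compare the -h argument with the
    aux/JOBID label itself."""
    for index, label in enumerate(entries):
        if label == target:
            return index
    return None


# The second aux/JOBID entry is not found when matching on reported hostnames,
# but it is found when matching on the label.
by_hostname = match_by_reported_hostname("n02", aux_entries, reported_hostnames)
by_label = match_by_aux_label("n02", aux_entries)
print(by_hostname, by_label)
```

With these made-up names, the lookup succeeds only when the label and the 
reported hostname happen to be identical, which is exactly the assumption 
that fails on my cluster.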

It doesn't look like a big problem, but in fact it sometimes is. I have 
encountered a number of commercial applications that require defining the 
command used to execute anything on remote nodes. For example, 'rsh -l %U %H' 
should be replaced by 'pbsdsh -h %H' in one specific case. The 
application takes its node names from pbs, so a crash is guaranteed whenever 
those names differ from the hostnames.

It looks like a programmer's shortcut to me: the proper, lengthy operations on 
a bunch of structures needed to obtain the pbs-defined node name were 
replaced by just a few lines based on a mostly right assumption. Mostly, 
but not always. What I'm interested in is confirming whether this is 
considered buggy behaviour (i.e. a design flaw) at all. Are there any plans 
to fix it?

Regards,

	Marcin Mogielnicki

