Bug 191 - pbsdsh doesn't work if nodename differs from hostname.
Status: RESOLVED FIXED
Product: TORQUE
Component: pbs_mom
Version: 4.0.*
Hardware: PC Linux
Importance: P5 normal
Assigned To: Ken Nielson
Reported: 2012-05-08 08:00 MDT by Roy Dragseth
Modified: 2012-05-09 10:45 MDT (History)

Description Roy Dragseth 2012-05-08 08:00:42 MDT
If pbs_mom is started with a name that differs from the hostname, pbsdsh stops
working across multiple nodes.

My compute nodes have hostnames like compute-X-Y.local, while the node names
in torque drop the .local domain, so the hostlist has entries like
compute-X-Y.
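
To make the naming mismatch concrete, here is a minimal Python sketch. The
helper names short_name and name_mismatch are hypothetical, and
compute-0-0.local is only an illustrative instance of the compute-X-Y.local
pattern; nothing here is part of torque itself.

```python
def short_name(hostname):
    """Drop everything after the first dot, e.g. the .local domain."""
    return hostname.split(".", 1)[0]


def name_mismatch(hostname, node_name):
    """True when the node name is not the literal hostname."""
    return hostname != node_name


hostname = "compute-0-0.local"    # illustrative instance of compute-X-Y.local
node_name = short_name(hostname)  # the form that appears in the hostlist
print(node_name, name_mismatch(hostname, node_name))  # compute-0-0 True
```

Any node for which name_mismatch returns True is in the situation described
in this report.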

This used to work fine in torque 2 and 3, but on torque 4.0.1 it makes pbsdsh
(and any MPI launcher using libtm) hang indefinitely.

Is it possible to have the old behaviour back?

Regards,
Roy.
Comment 1 Roy Dragseth 2012-05-08 16:21:39 MDT
Using the -A flag solved the problem (as proposed by dbeer).

r.
Comment 2 Ken Nielson 2012-05-09 10:22:52 MDT
Just to help understand David's question, the -A option was created to help
with multi-mom mode. If you use an alias name, the alias must be resolvable. In
our environments we add the alias to the /etc/hosts file.
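
One way to check the resolvability requirement before relying on an alias is
sketched below in Python. The helper names are hypothetical, and the sketch
assumes the standard socket.gethostbyname call, which goes through the system
resolver (typically /etc/hosts and DNS). It is not part of torque.

```python
import socket


def resolve_alias(alias, resolver=socket.gethostbyname):
    """Return the IPv4 address for alias, or None if it does not resolve."""
    try:
        return resolver(alias)
    except OSError:  # socket.gaierror and socket.herror are OSError subclasses
        return None


def unresolvable(aliases, resolver=socket.gethostbyname):
    """Return the aliases that the resolver could not look up."""
    return [a for a in aliases if resolve_alias(a, resolver) is None]
```

Calling unresolvable with the list of node aliases on a compute node returns
the names that still need an /etc/hosts or DNS entry. The resolver parameter
exists only so the lookup can be replaced in tests.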
Comment 3 Roy Dragseth 2012-05-09 10:45:00 MDT
Yes, on Rocks, compute-X-Y is resolvable in the cluster DNS, so this works fine
for my torque-roll setup.