[torquedev] [patch] bind to ip on multihomed pbs_servers

Toni L. Harbaugh-Blackford [Contr] harbaugh at ncifcrf.gov
Fri Feb 8 01:40:21 MST 2008


On Fri, 8 Feb 2008, Henning Glawe wrote:

  > On Fri, Feb 08, 2008 at 03:01:26AM -0500, Toni L. Harbaugh-Blackford [Contr] wrote:
  > > I also have a patch, but mine is more invasive, modifying the svr_connect() 
  > > and client_to_svr() functions by adding the ip address to bind to as a passed 
  > > argument.  Your patch is much simpler, so I hope it makes it in.
  > 
  > well, my patch is more a proof-of-concept, as it is an unclean solution
  > communicating the IP to the relevant functions by a global variable...
  > ultimately, it should be done the way you did it. but this would change the
  > api of libtorque, and i do not know how much software is out there which has
  > to be modified in this case... 
  > could you submit your patch, too?
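
For anyone following the thread who has not seen the technique itself: the
approach described above (passing the bind address as an argument rather than
through a global) comes down to calling bind() on the client socket before
connect().  The sketch below only illustrates that idea; it is neither of the
two patches.  The function names fill_ipv4_addr() and connect_from_local_ip()
are hypothetical, the code is IPv4-only, and it does not reproduce the real
svr_connect() or client_to_svr() signatures or any of their port handling.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Fill an IPv4 socket address from a dotted-quad string.
 * Returns 0 on success, -1 if the string is not a valid IPv4 address. */
int fill_ipv4_addr(const char *ip, unsigned short port,
                   struct sockaddr_in *out)
{
    memset(out, 0, sizeof(*out));
    out->sin_family = AF_INET;
    out->sin_port = htons(port);
    return inet_pton(AF_INET, ip, &out->sin_addr) == 1 ? 0 : -1;
}

/* Hypothetical helper, not part of libtorque: open a TCP connection to
 * server_ip:port, sourcing it from local_ip when local_ip is non-NULL.
 * Returns the connected descriptor, or -1 on any failure. */
int connect_from_local_ip(const char *local_ip, const char *server_ip,
                          unsigned short port)
{
    struct sockaddr_in local, remote;
    int fd;

    /* Validate both addresses before creating the socket. */
    if (fill_ipv4_addr(server_ip, port, &remote) != 0)
        return -1;
    /* Port 0 lets the kernel pick an ephemeral source port. */
    if (local_ip != NULL && fill_ipv4_addr(local_ip, 0, &local) != 0)
        return -1;

    fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd < 0)
        return -1;

    /* bind() before connect() pins the source address of the connection. */
    if (local_ip != NULL &&
        bind(fd, (struct sockaddr *)&local, sizeof(local)) != 0) {
        close(fd);
        return -1;
    }
    if (connect(fd, (struct sockaddr *)&remote, sizeof(remote)) != 0) {
        close(fd);
        return -1;
    }
    return fd;
}
```

Passing NULL as local_ip keeps the default behaviour (the kernel chooses the
source address), so existing callers would not have to care about the extra
argument on single-homed hosts.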

I would like to, but I don't think I can do it right now.  I have to clean it
up, and I am currently too swamped with work to get a chance to do it.
Also, I have not fully tested all the scenarios, which I hope to do on some
test systems soon.  The code is currently in production, so I have to be
careful with changes.

The whole reason I created my patch was that, in the case of a server alias,
the job obituaries were not getting returned to the 'failover' server if the
original server failed.  In fact, they don't get returned at all, and pbs_mom
goes into a crazy state trying to return the obits; the whole PBS system
gets bogged down and unresponsive.  If the HA changes fix this, I might want
to spend more time testing the new version of PBS on my test systems rather
than refining a patch for an old version.

Toni


  > 
  > -- 
  > c u
  > henning

-------------------------------------------------------------------
Toni Harbaugh-Blackford                       harbaugh at ncifcrf.gov
System Administrator
Advanced Biomedical Computing Center (ABCC)
National Cancer Institute
Contractor - SAIC/Frederick

