[torquedev] [patch] bind to ip on multihomed pbs_servers

Garrick Staples garrick at usc.edu
Fri Feb 8 12:52:36 MST 2008


On Fri, Feb 08, 2008 at 03:11:20AM -0500, Toni L. Harbaugh-Blackford [Contr] alleged:
> On Thu, 7 Feb 2008, Garrick Staples wrote:
> 
>   >  
>   > Have you seen the new HA support in trunk?  Multiple pbs_server processes on
>   > different hosts will use different IPs and pbs_mom won't care.
>   > 
>   > 
> 
> 
> But if a server goes down while a job is running and the job completes,
> will pbs_mom know to return the job obit to *another* server?
> 
> That is what the binding of the server to a port allows.  pbs_mom picks
> up the bound address, so if it is an alias the job obit will go where
> the alias goes.  If another server gets the alias, the obit goes there.
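
To make the alias point concrete: the idea is that the server listens on one
specific address rather than the wildcard, so whichever host currently holds
that address receives the traffic. A minimal Python sketch of the binding step
follows. It is not TORQUE code; `bind_listener` is a hypothetical name, and
127.0.0.1 stands in for the floating alias address.

```python
import socket


def bind_listener(address: str, port: int = 0) -> socket.socket:
    """Return a TCP listener bound to one specific local address.

    Passing "" or "0.0.0.0" instead would listen on every interface;
    passing a single address restricts the listener to that address.
    Port 0 asks the OS to pick any free port (assumption for the demo).
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind((address, port))
    sock.listen(5)
    return sock


if __name__ == "__main__":
    # 127.0.0.1 is a stand-in for the alias IP on a multihomed host.
    listener = bind_listener("127.0.0.1")
    print(listener.getsockname())
    listener.close()
```

A peer that replies to the address it was contacted on will then follow the
alias if it moves to another host.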

I haven't looked at the new code, but presumably the CRI peeps are making
pbs_mom report to the new server.

It is an active/standby arrangement.  The "active" server works as normal.  The
"standby" server idles until the active server dies, at which point clients and
MOMs should start talking to the standby instead.
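
Purely as an illustration of the active/standby idea, and not taken from the
TORQUE source, a client-side selection could look like the Python sketch below.
The function name `choose_server`, the host names, and the `is_alive` probe are
all hypothetical; the real liveness check would be whatever the client already
uses to contact pbs_server.

```python
from typing import Callable, Optional, Sequence


def choose_server(
    servers: Sequence[str],
    is_alive: Callable[[str], bool],
) -> Optional[str]:
    """Return the first server, in priority order, that responds.

    `servers` lists the active server first and standbys after it.
    `is_alive` is a caller-supplied probe (hypothetical here).
    Returns None when no server responds.
    """
    for host in servers:
        if is_alive(host):
            return host
    return None


if __name__ == "__main__":
    # Simulate the active server being down.
    down = {"active.example.org"}
    target = choose_server(
        ["active.example.org", "standby.example.org"],
        lambda host: host not in down,
    )
    print(target)
```

The probe is passed in as a function so the selection logic can be exercised
without any network access.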

Again, I'm presuming this because I haven't actually looked at the new code.
