[torqueusers] problem with: set queue
garrick at usc.edu
Thu Aug 11 12:51:01 MDT 2005
On Thu, Aug 11, 2005 at 01:34:00PM -0400, Stewart Samuels alleged:
> Are you saying that mom should support multiple pbs_servers?
"should" in the sense that I think it is a good idea, yes.
> Originally, PBS used to. But somewhere between the original PBS code
> and TORQUE (up to torque-1.2.0p1 anyway), mom ONLY supports 1
> pbs_server. I have found code in mom that supports my statement. In
> fact, this is why only the first "$clienthost hostname" entry that mom
> can contact is that to which it connects. All others are ignored. If
> mom notices that a pbs_server is running on the same node that she is
> running on, mom connects to that server and ignores all entries in its
> $PBS_HOME/config "$clienthost hostname" list.
> I suspect this changeover was made to allow torque to be much more
> scalable, but I am not certain of this and it certainly is an issue when
> you want to have dual master nodes in HA mode.
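To make the behaviour described in the quoted text concrete, here is a minimal sketch of that selection rule in Python. It is not taken from the mom source. The function names, the `can_contact` callback, and the `local_server_running` flag are all hypothetical and exist only for illustration. It assumes the config holds one "$clienthost hostname" entry per line, as described above.

```python
from typing import Callable, List, Optional


def parse_clienthosts(config_text: str) -> List[str]:
    """Collect hostnames from "$clienthost hostname" lines, in file order."""
    hosts = []
    for line in config_text.splitlines():
        fields = line.split()
        # Only lines of the form "$clienthost <hostname>" are of interest here.
        if len(fields) >= 2 and fields[0] == "$clienthost":
            hosts.append(fields[1])
    return hosts


def pick_server(
    clienthosts: List[str],
    local_host: str,
    local_server_running: bool,
    can_contact: Callable[[str], bool],
) -> Optional[str]:
    """Model the single-server choice described in the quoted message.

    Hypothetical helper: a pbs_server on the local node wins outright and
    the $clienthost list is ignored; otherwise the first entry that can be
    contacted is used and every later entry is ignored.
    """
    if local_server_running:
        return local_host
    for host in clienthosts:
        if can_contact(host):
            return host
    return None  # no server reachable
```

With two master nodes listed, only one is ever selected, which is the limitation for a dual-master HA setup: `pick_server(["master1", "master2"], "node7", False, lambda h: True)` picks `master1` and never looks at `master2`.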
I think it got broken when MOM started sending "status" updates to pbs_server
so that maui didn't have to probe every node. I think the multi-server support
has been degrading ever since.
I'm pretty sure re-adding this feature is high on CRI's wish list.
("I don't work for CRI" hearsay disclaimer here).
Garrick Staples, Linux/HPCC Administrator
University of Southern California