[torqueusers] torque moms keep adding client to okclients list every minute repeatedly

Rahul Nabar rpnabar at gmail.com
Mon Feb 8 09:59:25 MST 2010


On Mon, Feb 8, 2010 at 9:03 AM, Ken Nielson
<knielson at adaptivecomputing.com> wrote:

Thanks Ken for your explanation!
>
> Each time the MOM sends an IS_HELLO message to the server, the server replies
> with an IS_CLUSTER_ADDRS message. This is where the "added to okclients"
> message comes from. An IS_HELLO is generated when the MOM starts. It is also
> generated if the MOM wants to re-establish a connection with pbs_server.
>
> I just did a quick check of the code and those are the two main things I
> see. There are probably one or two more reasons. In general this is not an
> error.

Ok. I was worried because it logs 2 lines for each of 300 nodes every
other minute. That makes the mom_logs huge.
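
To put a rough number on "huge", here is a back-of-the-envelope sketch.
The 2 lines per node, 300 nodes and 2-minute interval are the figures
from this mail; the 90 bytes per line is only my assumption, based on the
length of one "added to okclients" record in the excerpts below, and the
function name is made up for illustration.

```python
def okclients_log_volume(nodes=300, lines_per_node=2,
                         interval_minutes=2, bytes_per_line=90):
    """Estimate okclients log lines and bytes written per day by one mom.

    bytes_per_line is an assumed average, not a measured value.
    """
    bursts_per_day = (24 * 60) // interval_minutes   # one burst per hello
    lines_per_day = nodes * lines_per_node * bursts_per_day
    return lines_per_day, lines_per_day * bytes_per_line


lines, size = okclients_log_volume()
print(f"{lines} lines/day, about {size / 1e6:.1f} MB/day per mom")
```

Under those assumptions each mom would write a few hundred thousand such
lines a day, which is why the log size caught my attention.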

> It is just MOM staying in sync with the cluster.

Sorry, I didn't understand this. The moms seem to be working
and running jobs. How do I check whether or not they are in sync?

>
> How many nodes are in your cluster?

We have ~300 nodes.

> What version of TORQUE are you running?

Torque 2.3.3.

It does this the first time:

02/08/2010 00:00:02;0002;   pbs_mom;Svr;Log;Log opened
02/08/2010 00:00:02;0002;   pbs_mom;n/a;mom_server_check_connection;sending hello to server euadmin
02/08/2010 00:00:02;0002;   pbs_mom;n/a;mom_server_update_stat;status update successfully sent to euadmin
02/08/2010 00:00:02;0008;   pbs_mom;Job;do_rpp;got an inter-server request
02/08/2010 00:00:02;0001;   pbs_mom;Job;is_request;stream 0 version 1
02/08/2010 00:00:02;0001;   pbs_mom;Job;is_request;command 2, "CLUSTER_ADDRS", received
02/08/2010 00:00:02;0001;   pbs_mom;Job;is_request;is_request: 10.0.0.1 added to okclients
02/08/2010 00:00:02;0001;   pbs_mom;Job;is_request;is_request: 10.0.0.2 added to okclients
02/08/2010 00:00:02;0001;   pbs_mom;Job;is_request;is_request: 10.0.0.3 added to okclients
02/08/2010 00:00:02;0001;   pbs_mom;Job;is_request;is_request: 10.0.0.4 added to okclients
02/08/2010 00:00:02;0001;   pbs_mom;Job;is_request;is_request: 10.0.0.5 added to okclients
02/08/2010 00:00:02;0001;   pbs_mom;Job;is_request;is_request: 10.0.0.6 added to okclients
02/08/2010 00:00:02;0001;   pbs_mom;Job;is_request;is_request: 10.0.0.7 added to okclients
02/08/2010 00:00:02;0001;   pbs_mom;Job;is_request;is_request: 10.0.0.8 added to okclients

And then repeated blocks of this every 2 minutes:

02/08/2010 00:09:47;0002;   pbs_mom;n/a;mom_server_update_stat;status update successfully sent to euadmin
02/08/2010 00:10:32;0002;   pbs_mom;n/a;mom_server_check_connection;sending hello to server euadmin
02/08/2010 00:10:32;0002;   pbs_mom;n/a;mom_server_update_stat;status update successfully sent to euadmin
02/08/2010 00:10:32;0008;   pbs_mom;Job;do_rpp;got an inter-server request
02/08/2010 00:10:32;0001;   pbs_mom;Job;is_request;stream 0 version 1
02/08/2010 00:10:32;0001;   pbs_mom;Job;is_request;command 2, "CLUSTER_ADDRS", received
02/08/2010 00:10:32;0001;   pbs_mom;Job;is_request;is_request: 10.0.0.1 added to okclients
02/08/2010 00:10:32;0001;   pbs_mom;Job;is_request;is_request: 10.0.0.2 added to okclients
02/08/2010 00:10:32;0001;   pbs_mom;Job;is_request;is_request: 10.0.0.3 added to okclients
02/08/2010 00:10:32;0001;   pbs_mom;Job;is_request;is_request: 10.0.0.4 added to okclients
02/08/2010 00:10:32;0001;   pbs_mom;Job;is_request;is_request: 10.0.0.5 added to okclients
02/08/2010 00:10:32;0001;   pbs_mom;Job;is_request;is_request: 10.0.0.6 added to okclients
02/08/2010 00:10:32;0001;   pbs_mom;Job;is_request;is_request: 10.0.0.7 added to okclients
02/08/2010 00:10:32;0001;   pbs_mom;Job;is_request;is_request: 10.0.0.8 added to okclients
02/08/2010 00:10:32;0001;   pbs_mom;Job;is_request;is_request: 10.0.0.9 added to okclients
02/08/2010 00:10:32;0001;   pbs_mom;Job;is_request;is_request: 10.0.0.10 added to okclients
02/08/2010 00:10:32;0001;   pbs_mom;Job;is_request;is_request: 10.0.0.11 added to okclients
02/08/2010 00:10:32;0001;   pbs_mom;Job;is_request;is_request: 10.0.0.12 added to okclients
02/08/2010 00:10:32;0001;   pbs_mom;Job;is_request;is_request: 10.0.0.13 added to okclients
02/08/2010 00:10:32;0001;   pbs_mom;Job;is_request;is_request: 10.0.0.14 added to okclients
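
In case it helps anyone measure this on their own nodes, here is a small
sketch for counting these bursts. It is not part of TORQUE; the function
name is hypothetical, and it only assumes the record layout visible in
the excerpts above (a MM/DD/YYYY HH:MM:SS timestamp followed by a
semicolon at the start of each record). Mail clients may wrap long
records, so it re-joins continuation lines before counting.

```python
import re
from collections import Counter
from datetime import datetime

# A record starts with a timestamp such as "02/08/2010 00:00:02;"
STAMP = re.compile(r"^(\d{2}/\d{2}/\d{4} \d{2}:\d{2}:\d{2});")


def summarize_mom_log(text):
    """Return (hello_times, okclients_count_per_timestamp)."""
    records = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if STAMP.match(line) or not records:
            records.append(line)          # start of a new record
        else:
            records[-1] += " " + line     # wrapped continuation line

    hellos, added = [], Counter()
    for rec in records:
        m = STAMP.match(rec)
        if not m:
            continue
        when = datetime.strptime(m.group(1), "%m/%d/%Y %H:%M:%S")
        if "sending hello to server" in rec:
            hellos.append(when)
        elif "added to okclients" in rec:
            added[when] += 1
    return hellos, added


# A few records copied from the excerpts above, wrapped as in the mail.
sample = """\
02/08/2010 00:00:02;0002;
pbs_mom;n/a;mom_server_check_connection;sending hello to server
euadmin
02/08/2010 00:00:02;0001;   pbs_mom;Job;is_request;is_request:
10.0.0.1 added to okclients
02/08/2010 00:00:02;0001;   pbs_mom;Job;is_request;is_request:
10.0.0.2 added to okclients
02/08/2010 00:10:32;0002;
pbs_mom;n/a;mom_server_check_connection;sending hello to server
euadmin
02/08/2010 00:10:32;0001;   pbs_mom;Job;is_request;is_request:
10.0.0.1 added to okclients
"""

hellos, added = summarize_mom_log(sample)
# Minutes between consecutive hello messages
gaps = [(b - a).total_seconds() / 60 for a, b in zip(hellos, hellos[1:])]
for when, count in sorted(added.items()):
    print(when, count, "okclients lines")
print("minutes between hellos:", gaps)
```

Feeding it a whole mom log file instead of the sample should show how
often the hello is really sent and how many okclients lines follow each
one.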

-- 
Rahul

