[torqueusers] torque moms keep adding client to okclients list every minute repeatedly

Ken Nielson knielson at adaptivecomputing.com
Mon Feb 8 08:03:12 MST 2010


Rahul Nabar wrote:
> On my compute nodes the mom_logs are getting full of lines like these:
>
> 02/06/2010 00:00:00;0001;   pbs_mom;Job;is_request;is_request:
> 10.0.0.49 added to okclients
> 02/06/2010 00:00:00;0001;   pbs_mom;Job;is_request;is_request:
> 10.0.0.50 added to okclients
> 02/06/2010 00:00:00;0001;   pbs_mom;Job;is_request;is_request:
> 10.0.0.51 added to okclients
> 02/06/2010 00:00:00;0001;   pbs_mom;Job;is_request;is_request:
> 10.0.0.52 added to okclients
> 02/06/2010 00:00:00;0001;   pbs_mom;Job;is_request;is_request:
> 10.0.0.53 added to okclients
>
>
> Can't figure out what exactly is going on. Jobs seem to be running but
> still doing this. In particular, each IP gets added to okclients multiple
> times. And the process repeats every minute.
>
> What could I be doing wrong? Any clues?
>
>   
Each time the MOM sends an IS_HELLO message to the server, the server
replies with an IS_CLUSTER_ADDRS message. This is where the "added to
okclients" message comes from. An IS_HELLO is generated when the MOM
starts. It is also generated if the MOM wants to re-establish a
connection with pbs_server.

I just did a quick check of the code and those are the two main triggers
I see. There are probably one or two more. In general this is not an
error. It is just the MOM staying in sync with the cluster.
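
If you want to see how often each address is actually being re-added, something like the following could summarize a mom log by minute. This is only a rough sketch, not anything that ships with TORQUE: it assumes each log record sits on a single line in the format you quoted (your mail client wrapped them), and the function name count_okclients is made up for illustration.

```python
import re
from collections import Counter

# Matches records of the form shown in the quoted mom_logs, e.g.
# 02/06/2010 00:00:00;0001;   pbs_mom;Job;is_request;is_request: 10.0.0.49 added to okclients
# Group 1 is the timestamp truncated to the minute, group 2 is the address.
OKCLIENTS_RE = re.compile(
    r"^(\d{2}/\d{2}/\d{4} \d{2}:\d{2}):\d{2};.*is_request:\s+(\S+) added to okclients"
)

def count_okclients(lines):
    """Count 'added to okclients' records per (minute, address) pair."""
    counts = Counter()
    for line in lines:
        match = OKCLIENTS_RE.match(line)
        if match:
            counts[(match.group(1), match.group(2))] += 1
    return counts

# Small inline sample; in practice, pass an open mom log file instead.
sample = [
    "02/06/2010 00:00:00;0001;   pbs_mom;Job;is_request;is_request: 10.0.0.49 added to okclients",
    "02/06/2010 00:00:00;0001;   pbs_mom;Job;is_request;is_request: 10.0.0.49 added to okclients",
    "02/06/2010 00:01:00;0001;   pbs_mom;Job;is_request;is_request: 10.0.0.50 added to okclients",
]

for (minute, address), n in sorted(count_okclients(sample).items()):
    print(minute, address, n)
```

If the same address shows up several times within one minute, that would suggest the MOM is sending IS_HELLO more often than just at startup, which is worth mentioning when you report back.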

How many nodes are in your cluster? What version of TORQUE are you running?

Ken Nielson
Adaptive Computing

