[torqueusers] (no subject)

Mike Poublon michael.poublon at hope.edu
Wed Dec 7 10:46:17 MST 2005


Subject: HUGE log file created

If a node isn't listed in the server_priv/nodes file, the server won't accept 
it. This leads to excessively large log files (2 GB) and to pbs_server 
crashing. I can reproduce the problem reliably. The large logs are created by 
mom on the node trying to check in with the server many times per second 
(1300+ on the machine where I ran into this).

Shouldn't there be a delay between connection attempts? I looked at the code in 
the src/resmom directory, but I am not familiar with how things work there.
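
A delay of the kind asked about here is often implemented as capped 
exponential backoff. The following is only a minimal sketch in C, not code 
from TORQUE: the function name next_retry_delay and its parameters are 
hypothetical, and how mom actually schedules its check-ins is not shown.

```c
#include <assert.h>

/*
 * Hypothetical helper, NOT taken from the TORQUE source: returns how many
 * seconds to wait before the next check-in attempt. The delay starts at
 * base_delay, doubles after each consecutive failure, and never exceeds
 * max_delay.
 */
unsigned int next_retry_delay(unsigned int failures,
                              unsigned int base_delay,
                              unsigned int max_delay)
{
    unsigned int delay = base_delay;

    /* Precondition for this sketch: a positive base no larger than the cap. */
    assert(base_delay > 0 && base_delay <= max_delay);

    while (failures > 0) {
        if (delay > max_delay / 2) {
            /* Doubling again would pass the cap (or overflow), so stop here. */
            return max_delay;
        }
        delay *= 2;
        failures--;
    }
    return delay;
}
```

A caller would sleep for next_retry_delay(failures, 1, 60) seconds after each 
rejected check-in and reset failures to zero once the server accepts the node. 
With those example values, a rejected node retries at most once a minute 
instead of many times per second.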

I know there is an easy solution to this (list all the nodes in the nodes 
file), but shouldn't mom be a little more robust?
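
For reference, the workaround amounts to one line per compute node in 
server_priv/nodes. The hostnames and processor counts below are made up for 
illustration:

```
node01 np=2
node02 np=2
```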

Thanks for any input on this.


