[torqueusers] Large cluster considerations
Ronny T. Lampert
telecaadmin at gmail.com
Thu Feb 21 04:38:35 MST 2008
> set server scheduler_iteration = 300
> set server node_ping_rate = 180
> set server node_check_rate = 300
> set server tcp_timeout = 30
> set server default_node = 1
> set server node_pack = False
> set server job_stat_rate = 120
And here are my settings for a cluster with few nodes but a high job rate
(sometimes 20k+ jobs queued and large submit bursts; 700k jobs through so far):
set server scheduler_iteration = 300
set server node_ping_rate = 180
set server node_check_rate = 300
set server tcp_timeout = 30
set server default_node = 1
set server node_pack = False
set server job_stat_rate = 120
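For anyone who wants to apply these in one go, here is a minimal sketch. It assumes qmgr is in your PATH on the pbs_server host and that you have manager rights; the function name apply_server_settings is just made up for illustration. qmgr reads directives from standard input when no -c option is given, so a here-document works:

```shell
# Hypothetical helper (name is an assumption): feed the server settings
# listed above to qmgr on standard input, one directive per line.
apply_server_settings() {
    qmgr <<'EOF'
set server scheduler_iteration = 300
set server node_ping_rate = 180
set server node_check_rate = 300
set server tcp_timeout = 30
set server default_node = 1
set server node_pack = False
set server job_stat_rate = 120
EOF
}

# Run on the pbs_server host, then review what the server now reports:
#   apply_server_settings && qmgr -c "print server"
```

Checking the output of "print server" afterwards is worthwhile, since a mistyped attribute is otherwise easy to miss.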
Also make sure your head node has a reasonably fast disk and/or hardware
RAID and hardware caches.
If you're running on Linux you might increase the dirty ratio (it's in
per cent), found in
/proc/sys/vm/dirty_ratio
The default nowadays is very low at 10; you can usually safely set it to 40.
This will help with large job bursts because disk writes can be
delayed and more memory is available for buffering.
Also set dirty_background_ratio to maybe 10 or so (nowadays at 5).
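A rough sketch of checking and changing the two values follows. The helper name show_dirty and its optional directory argument are only assumptions for illustration; the sysctl commands need root, and the values are the ones suggested above, not universal recommendations:

```shell
# Hypothetical helper (name is an assumption): print the current values
# of both knobs. $1 is the directory holding them and defaults to the
# real /proc location.
show_dirty() {
    dir=${1:-/proc/sys/vm}
    for knob in dirty_ratio dirty_background_ratio; do
        printf '%s = %s\n' "$knob" "$(cat "$dir/$knob")"
    done
}

# As root, on the head node:
#   show_dirty
#   sysctl -w vm.dirty_ratio=40
#   sysctl -w vm.dirty_background_ratio=10
# To keep the values across reboots, the same keys can go into
# /etc/sysctl.conf:
#   vm.dirty_ratio = 40
#   vm.dirty_background_ratio = 10
```

Keep in mind that a higher dirty ratio means more unwritten data sits in memory, so this fits best on a head node with a UPS or a battery-backed RAID cache.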
I also had to adjust the maui scheduling rate or the poor thing would
iterate to death.
Cheers,
Ronny