[torqueusers] TORQUE overhead on dual-core dual-opter cluster
Ronny T. Lampert
telecaadmin at uni.de
Tue Apr 11 03:55:32 MDT 2006
> I just benchmarked the performance of program VASP on our dual-core
> Total CPU time/Elapsed time
>                      4 cores       8 cores       16 cores
> interactive mode     1672s/1860s   791s/1096s    518s/1484s
> batch mode (TORQUE)  1366s/2482s   820s/2975s    548s/3178s
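A side note on the quoted numbers: dividing CPU time by elapsed time gives a rough idea of how much of the wall-clock time was actually spent computing. The following is only a minimal sketch in POSIX sh with awk; the helper name cpu_ratio and the run labels are made up for illustration, and the values are copied from the table above.

```shell
#!/bin/sh
# cpu_ratio CPU_SECONDS ELAPSED_SECONDS
# Prints CPU time divided by elapsed time, rounded to two decimals.
# (Hypothetical helper, not part of TORQUE or VASP.)
cpu_ratio() {
  awk -v cpu="$1" -v wall="$2" 'BEGIN { printf "%.2f\n", cpu / wall }'
}

# "label cpu elapsed" triples taken from the quoted benchmark table.
for run in \
  "interactive-4 1672 1860" "interactive-8 791 1096" "interactive-16 518 1484" \
  "batch-4 1366 2482" "batch-8 820 2975" "batch-16 548 3178"
do
  set -- $run   # split the triple into $1 (label), $2 (cpu), $3 (elapsed)
  printf '%-16s %s\n' "$1" "$(cpu_ratio "$2" "$3")"
done
```

With the quoted values the ratio falls from 0.90 (interactive, 4 cores) to 0.17 (batch, 16 cores). Assuming the CPU times are per process, a ratio that far below 1 suggests the processes spent most of the elapsed time waiting rather than computing.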
First, make sure you don't have strange background daemons running that
interfere with your measurements (like locatedb, big cron jobs, yum updates,
...).
Second, make sure you run a recent TORQUE version. Also make sure you build
with --disable-filesync and set
#> set server node_pack = False
via qmgr for your pbs_server to spread the jobs across the nodes (the more
jobs per node, the more they share the I/O system).
Make sure that each CPU on the nodes only gets one job.
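From a shell on the pbs_server host, the node_pack setting could be applied roughly as sketched below. This is an illustration, not a tested recipe: the wrapper apply_setting and the DRY_RUN variable are made up here, and the script only prints the qmgr command unless DRY_RUN=0 is set explicitly.

```shell
#!/bin/sh
# DRY_RUN=1 (the default in this sketch) only prints the qmgr command.
# Set DRY_RUN=0 on the pbs_server host, as a TORQUE manager, to run it.
DRY_RUN="${DRY_RUN:-1}"

# apply_setting DIRECTIVE
# Hypothetical wrapper: passes one directive to qmgr via its -c option.
apply_setting() {
  if [ "$DRY_RUN" = "1" ]; then
    printf 'qmgr -c "%s"\n' "$1"
  else
    qmgr -c "$1"
  fi
}

# Spread jobs across nodes instead of packing them onto as few as possible.
apply_setting "set server node_pack = False"
```

Afterwards, qmgr -c "print server" should list the new node_pack value along with the rest of the server configuration.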
Third, I am NOT aware of any boost for interactive vs. batch jobs, as in
nicing down batch jobs or similar.
Fourth, we really need more information to help. qmgr's "print server" output
is a good start.
TORQUE in itself has no "overhead"; it does NOT interact with your program.
All the pbs_moms do is communicate the state of your nodes to the
pbs_server. Then pbs_sched (or maui or whatever) decides which jobs
are to run. All of that is usually not noticeable, even with a fair number of
jobs queued, so *I* run pbs_server / pbs_sched on the first of my compute nodes.
To reduce the TORQUE status overhead, you might look at these parameters
(below are the values I chose for my small 12-CPU cluster):
set server scheduler_iteration = 330
set server node_ping_rate = 180
set server node_check_rate = 300
set server tcp_timeout = 30
set server poll_jobs = True
set server job_stat_rate = 120
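One way to apply the six settings above in a single step is to feed them to qmgr on standard input. The sketch below only prints the directives so they can be reviewed first; the function name tuning_directives is made up for illustration, and the values are the ones from this mail, chosen for a small cluster and not general recommendations.

```shell
#!/bin/sh
# tuning_directives
# Hypothetical helper: emits the server parameters from this mail as
# qmgr directives, one per line.
tuning_directives() {
  cat <<'EOF'
set server scheduler_iteration = 330
set server node_ping_rate = 180
set server node_check_rate = 300
set server tcp_timeout = 30
set server poll_jobs = True
set server job_stat_rate = 120
EOF
}

# Print the directives for review. To apply them, run the following on
# the pbs_server host as a TORQUE manager:
#   tuning_directives | qmgr
tuning_directives
```

Keeping the directives in one place also makes it easy to compare them against the output of qmgr -c "print server" later.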