[torqueusers] TORQUE overhead on dual-core dual-opteron cluster

Jyh-Shyong Ho c00jsh00 at nchc.org.tw
Sun Apr 9 19:20:01 MDT 2006


Hi,

I just benchmarked the VASP program on our dual-core dual-Opteron
cluster, which uses a gigabit Ethernet switch, and I found something
difficult to explain:

Using 1 node
                        Total CPU time/Elapsed time
                     1 core        2 cores        4 cores
interactive mode     5399s/5435s   3318s/3442s    1747s/2071s
batch mode (TORQUE)  4903s/4905s   3727s/3922s    1698s/1899s

Using 2 nodes
                        Total CPU time/Elapsed time
                     2 cores        4 cores        8 cores
interactive mode     3169s/3608s   1718s/2019s     904s/1384s
batch mode (TORQUE)  2704s/4248s   1617s/2271s     906s/1370s

Using 4 nodes
                        Total CPU time/Elapsed time
                     4 cores         8 cores        16 cores
interactive mode     1672s/1860s    791s/1096s      518s/1484s
batch mode (TORQUE)  1366s/2482s    820s/2975s      548s/3178s

The total CPU time scales well; however, the elapsed times for the test
cases run in batch mode behave strangely. The TORQUE overhead seems to
increase dramatically when two or more nodes are used, and the
performance gain from using more cores is outweighed by that overhead.
This does not make sense, and I have not seen this behavior on a
single-core dual-Opteron cluster.
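
To make the comparison concrete, here is a rough sketch that tabulates the
gap between elapsed time and CPU time for each run, using the numbers from
the tables above. It assumes the reported CPU time is per process (the
figures shrink as cores are added, which suggests so, but that is not
confirmed), so the gap is only a proxy for time spent not computing. It may
include MPI communication waits and I/O as well as any TORQUE overhead. The
names `timings`, `gap`, and `gap_table` are just illustrative.

```python
# (nodes, cores) -> {mode: (cpu_seconds, elapsed_seconds)}, from the tables.
timings = {
    (1, 1): {"interactive": (5399, 5435), "batch": (4903, 4905)},
    (1, 2): {"interactive": (3318, 3442), "batch": (3727, 3922)},
    (1, 4): {"interactive": (1747, 2071), "batch": (1698, 1899)},
    (2, 2): {"interactive": (3169, 3608), "batch": (2704, 4248)},
    (2, 4): {"interactive": (1718, 2019), "batch": (1617, 2271)},
    (2, 8): {"interactive": (904, 1384), "batch": (906, 1370)},
    (4, 4): {"interactive": (1672, 1860), "batch": (1366, 2482)},
    (4, 8): {"interactive": (791, 1096), "batch": (820, 2975)},
    (4, 16): {"interactive": (518, 1484), "batch": (548, 3178)},
}


def gap(cpu_s, elapsed_s):
    """Elapsed minus CPU time: a rough proxy for non-compute time."""
    return elapsed_s - cpu_s


def gap_table(data):
    """Return rows of (nodes, cores, interactive_gap, batch_gap)."""
    rows = []
    for (nodes, cores), modes in sorted(data.items()):
        rows.append((nodes, cores,
                     gap(*modes["interactive"]),
                     gap(*modes["batch"])))
    return rows


if __name__ == "__main__":
    print(f"{'nodes':>5} {'cores':>5} {'interactive':>12} {'batch':>8}")
    for nodes, cores, g_int, g_batch in gap_table(timings):
        print(f"{nodes:>5} {cores:>5} {g_int:>11}s {g_batch:>7}s")
```

On 4 nodes the batch gap is well above the interactive gap for the same
core count, while on 1 node the two are of similar size.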

Does anyone have any idea why this behavior occurs? On our cluster, each
node has two dual-core Opteron 275 CPUs and 4GB of RAM, and TORQUE treats
each node as a 4-way SMP node.


Jyh-Shyong Ho, Ph.D.
Research Scientist
National Center for High Performance Computing
Hsinchu, Taiwan, ROC


