[torqueusers] cpusets
Martin Siegert
siegert at sfu.ca
Wed Nov 30 12:14:23 MST 2011
Hi,
we recently started using cpusets and I do not have much experience
with them. However, I have now noticed several times that MPI jobs
(Open MPI with TM) slow down dramatically: apparently two processes
are sharing the same core (i.e., each gets only 50% CPU usage) even though
the number of cores in the cpuset equals the number of processes
of the MPI job on that node.
E.g.,
top - 11:05:24 up 42 days, 22:43, 2 users, load average: 6.99, 6.93, 6.68
Tasks: 468 total, 8 running, 460 sleeping, 0 stopped, 0 zombie
Cpu(s): 24.9%us, 0.2%sy, 0.0%ni, 74.9%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Mem: 24675188k total, 12099684k used, 12575504k free, 69968k buffers
Swap: 16777208k total, 29932k used, 16747276k free, 9946292k cached
  PID USER  PR NI VIRT  RES  SHR S  %CPU %MEM     TIME+ COMMAND
 3717 user1 25  0 183m  91m  14m R 100.0  0.4  15:43.62 Clark
 4526 user2 25  0 109m  36m 3088 R 100.0  0.2   2:02.43 mdrun
15863 user3 25  0 459m 163m  15m R 100.0  0.7 711:26.30 wrfm_arw.exe
15864 user3 25  0 452m 156m  15m R 100.0  0.6 688:28.80 wrfm_arw.exe
 4562 user2 25  0 109m  36m 3088 R  99.7  0.2   0:23.02 mdrun
15861 user3 25  0 462m 165m  15m R  50.2  0.7 510:02.12 wrfm_arw.exe
15862 user3 25  0 465m 169m  15m R  49.9  0.7 446:21.37 wrfm_arw.exe
root@b311:~> cat /proc/15861/cpuset
/torque/4913985.b0
root@b311:~> cat /proc/15862/cpuset
/torque/4913985.b0
(same for 15863, 15864) and
root@b311:~> ls /dev/cpuset//torque/4913985.b0
68 cpu_exclusive memory_pressure notify_on_release
69 cpus memory_spread_page sched_relax_domain_level
70 mem_exclusive memory_spread_slab tasks
71 memory_migrate mems
root@b311:~> cat /dev/cpuset/torque/4913985.b0/cpus
0-1,4,8
Do processes within a cpuset get bound to a particular CPU?
If yes, how do I find out which one?
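One way to look into this is sketched below in Python, assuming a Linux node with /proc mounted and Python 3.3 or later. As far as I understand it, a cpuset only limits which CPUs a task may run on; it does not pin a task to a single core unless something else sets a narrower affinity mask. The helper names (parse_cpu_list, last_cpu_from_stat, report) are made up for illustration and are not part of Torque or Open MPI.

```python
import os


def parse_cpu_list(text):
    """Expand a kernel CPU list such as '0-1,4,8' into [0, 1, 4, 8]."""
    cpus = []
    for part in text.strip().split(","):
        if not part:
            continue
        if "-" in part:
            lo, hi = part.split("-")
            cpus.extend(range(int(lo), int(hi) + 1))
        else:
            cpus.append(int(part))
    return cpus


def last_cpu_from_stat(stat_line):
    """Return the CPU a task last ran on (field 39 of /proc/<pid>/stat)."""
    # The command name (field 2) is parenthesised and may contain spaces,
    # so split after the last ')'. The remainder starts at field 3.
    rest = stat_line.rsplit(")", 1)[1].split()
    return int(rest[39 - 3])


def report(pid):
    """Print allowed CPUs and last-used CPU for one process (Linux only)."""
    # sched_getaffinity returns the set of CPUs the task may run on.
    allowed = sorted(os.sched_getaffinity(pid))
    with open(f"/proc/{pid}/stat") as f:
        last = last_cpu_from_stat(f.read())
    print(f"pid {pid}: allowed CPUs {allowed}, last ran on CPU {last}")
```

Calling report(15861) and report(15862) as root a few times in a row would show whether the two slow processes have a narrower affinity mask than the cpuset's 0-1,4,8 and whether they keep landing on the same core. parse_cpu_list can be used to compare the contents of the cpus file against the number of MPI processes on the node.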
Anyway, if you have an idea what could be causing this and how to
solve it, please let me know.
Thanks!
Cheers,
Martin
--
Martin Siegert
Simon Fraser University