[torqueusers] reading the "Time Use" in qstat
bill
cluster.bill at alinto.com
Wed Sep 13 03:47:19 MDT 2006
Garrick Staples wrote:
>>
>>At some point I think torque qstat started reporting CPU time used
>>in that column, rather than job wall time used. Maybe it always did, and
>>I'm (ahem) "misremembering" somehow. So maybe the latter job isn't
>>using much cpu time. (Some old versions of torque also had time reporting
>>bugs, but chances are you're not running one of 'em, especially since
>>one of the jobs has a real-looking figure...)
>
Which versions of torque are known to have bugs like this? I compiled
torque-2.1.0p0 from source.
> 'qstat' has always had CPU time. 'qstat -a' has always had walltime.
>
I think there is a problem. I saw it again (and this time, it's
not a stuck job):
$ qstat
Job id              Name             User             Time Use S Queue
------------------- ---------------- ---------------- -------- - -----
1569.ml             lr-lock          simu1            00:00:00 R batch
and 'qstat -a' shows me 1h30 of walltime. A lot of CPU is being used, as
shown by top:
top - 11:31:35 up 11 days, 19:06, 9 users, load average: 1.78, 1.82, 1.81
Tasks: 178 total, 3 running, 175 sleeping, 0 stopped, 0 zombie
Cpu0: 0.0% us, 0.0% sy, 0.0% ni, 97.7% id, 0.0% wa, 0.3% hi, 2.0% si
Cpu1: 0.0% us, 0.3% sy, 0.0% ni, 99.7% id, 0.0% wa, 0.0% hi, 0.0% si
Cpu2: 86.8% us, 1.7% sy, 0.0% ni, 10.6% id, 0.0% wa, 0.0% hi, 1.0% si
Cpu3: 90.7% us, 2.0% sy, 0.0% ni, 6.3% id, 0.0% wa, 0.0% hi, 1.0% si
Mem: 2060996k total, 2040368k used, 20628k free, 17656k buffers
Swap: 4194296k total, 2556k used, 4191740k free, 1335640k cached
The CPU consumption is stable, around 90% all the time. So why doesn't
cput increase? The job is running on two nodes.
Thanks for any help
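
P.S. For anyone who wants to compare the two figures side by side rather than switching between 'qstat' and 'qstat -a', here is a rough sketch. It assumes the full job listing from 'qstat -f' contains lines of the form "resources_used.cput = HH:MM:SS" and "resources_used.walltime = HH:MM:SS". The helper names and the sample text are made up for illustration; the sample only mirrors the figures reported above and is not real output.

```python
import re


def hms_to_seconds(value):
    """Convert an HH:MM:SS string to a number of seconds."""
    hours, minutes, seconds = (int(part) for part in value.split(":"))
    return hours * 3600 + minutes * 60 + seconds


def resources_used(qstat_f_text):
    """Return cput and walltime (in seconds) found in 'qstat -f' style text.

    Assumes lines shaped like 'resources_used.cput = 00:00:00'.
    """
    pattern = r"resources_used\.(cput|walltime)\s*=\s*(\d+:\d{2}:\d{2})"
    return {name: hms_to_seconds(value)
            for name, value in re.findall(pattern, qstat_f_text)}


# Hypothetical sample mirroring the numbers in this message:
# no CPU time recorded after 1h30 of walltime.
sample = """
    resources_used.cput = 00:00:00
    resources_used.walltime = 01:30:00
"""

used = resources_used(sample)
# Ratio of CPU time to walltime; None if walltime is missing or zero.
ratio = used["cput"] / used["walltime"] if used.get("walltime") else None
print(used, ratio)
```

A ratio that stays near zero while top shows the processes busy is the symptom described above. On a real system the text would come from the output of 'qstat -f' for the job in question instead of the sample string.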