[torqueusers] reading the "Time Use" in qstat

bill cluster.bill at alinto.com
Wed Sep 13 03:47:19 MDT 2006


Garrick Staples wrote:
>>
>>At some point I think torque qstat started reporting CPU time used
>>in that column, rather than job wall time used.  Maybe it always did, and
>>I'm (ahem) "misremembering" somehow. So maybe the latter job isn't
>>using much cpu time. (Some old versions of torque also had time reporting
>>bugs, but chances are you're not running one of 'em, especially since
>>one of the jobs has a real-looking figure...)
>
Which versions of TORQUE are known to have bugs like this? I compiled
torque-2.1.0p0 from source.

> 'qstat' has always had CPU time.  'qstat -a' has always had walltime.
>
I think there is a problem. I saw it again, and this time it is not a
stuck job:

$ qstat
Job id              Name             User             Time Use S Queue
------------------- ---------------- ---------------- -------- - -----
1569.ml             lr-lock          simu1            00:00:00 R batch

while qstat -a shows me 1h30 of walltime. Yet a lot of CPU is being used,
as shown by top:

top - 11:31:35 up 11 days, 19:06,  9 users,  load average: 1.78, 1.82, 1.81
Tasks: 178 total,   3 running, 175 sleeping,   0 stopped,   0 zombie
Cpu0:  0.0% us, 0.0% sy, 0.0% ni, 97.7% id, 0.0% wa, 0.3% hi,  2.0% si
Cpu1:  0.0% us, 0.3% sy, 0.0% ni, 99.7% id, 0.0% wa, 0.0% hi,  0.0% si
Cpu2: 86.8% us, 1.7% sy, 0.0% ni, 10.6% id, 0.0% wa, 0.0% hi,  1.0% si
Cpu3: 90.7% us, 2.0% sy, 0.0% ni,  6.3% id, 0.0% wa, 0.0% hi,  1.0% si
Mem:   2060996k total,  2040368k used,    20628k free,    17656k buffers
Swap:  4194296k total,     2556k used,  4191740k free,  1335640k cached

The CPU consumption is stable, around 90% the whole time. So why doesn't
cput increase? The job is running on two nodes.
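
For anyone who wants to compare the two counters side by side, here is a
minimal sketch in Python. It assumes the text comes from "qstat -f <jobid>"
and contains resources_used.cput and resources_used.walltime lines in
HH:MM:SS form. The helper names (hms_to_seconds, resources_used) are made up
for illustration, and the sample text is illustrative too: cput is the value
qstat showed above, walltime is rounded from the "1h30" reported by qstat -a.

```python
import re


def hms_to_seconds(hms):
    """Convert an HH:MM:SS string (hours may exceed 24) to seconds."""
    hours, minutes, seconds = (int(part) for part in hms.split(":"))
    return hours * 3600 + minutes * 60 + seconds


def resources_used(qstat_f_text):
    """Extract the cput and walltime counters from `qstat -f` style text."""
    pattern = re.compile(
        r"resources_used\.(cput|walltime)\s*=\s*(\d+:\d{2}:\d{2})"
    )
    return {
        name: hms_to_seconds(value)
        for name, value in pattern.findall(qstat_f_text)
    }


# Illustrative input only, not captured output from the cluster.
sample = """
    resources_used.cput = 00:00:00
    resources_used.walltime = 01:30:00
"""

used = resources_used(sample)
# Ratio of CPU time to wall time; guard against a zero walltime.
ratio = used["cput"] / used["walltime"] if used["walltime"] else 0.0
print(used, ratio)
```

A ratio near zero on a job that top shows as busy is the mismatch described
above; the sketch only makes it visible, it does not explain it.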

Thanks for any help

