[torqueusers] Re: why my job's resources_used.cput always 00:00:00
luo yi
luoyi829 at hotmail.com
Tue Dec 19 02:53:04 MST 2006
>
>Message: 1
>Date: Mon, 18 Dec 2006 15:22:08 +0000
>From: "luo yi" <luoyi829 at hotmail.com>
>Subject: [torqueusers] why my job's resources_used.cput always
> 00:00:00
>To: torqueusers at supercluster.org, torquedev at supercluster.org
>Message-ID: <BAY129-F180440C399F03DC30246D29AC90 at phx.gbl>
>Content-Type: text/plain; charset=gb2312; format=flowed
>
>Is there a bug in the resource accounting for a job? No matter how big
>the job is, its resources_used.cput is always 00:00:00, while the
>accounting of resources_used.walltime is correct. The problem appears
>when I run Torque on a cluster running Redhat AS 4.0 whose nodes are
>2-way dual-core Xeon machines. But when I run Torque on a cluster with
>Redhat 9.0 and 2-way single-core Xeon machines, it accounts
>resources_used.cput correctly. Why does this problem appear, and how can
>I resolve it?
>
>Message: 2
>Date: Mon, 18 Dec 2006 11:18:35 -0500
>From: Tim Miller <btmiller at helix.nih.gov>
>Subject: Re: [torqueusers] why my job's resources_used.cput always
> 00:00:00
>To: torqueusers at supercluster.org
>Message-ID: <4586BF5B.6080900 at helix.nih.gov>
>Content-Type: text/plain; charset=GB2312
>
>luo yi wrote:
> > Is there a bug in the resource accounting for a job? No matter how
> > big the job is, its resources_used.cput is always 00:00:00, while the
> > accounting of resources_used.walltime is correct. The problem appears
> > when I run Torque on a cluster running Redhat AS 4.0 whose nodes are
> > 2-way dual-core Xeon machines. But when I run Torque on a cluster with
> > Redhat 9.0 and 2-way single-core Xeon machines, it accounts
> > resources_used.cput correctly. Why does this problem appear, and how
> > can I resolve it?
>
>Are you starting the jobs with mpirun? Torque can't do CPU time
>accounting for mpirun started jobs. The solution is to use mpiexec or
>just use the walltime (which is what I do).
>
>Cheers,
>Tim
>
Yes, I use mpirun to start the jobs.
But then why does Torque account CPU time correctly on the cluster with
Redhat AS 4.0 and 2-way single-core Xeon machines, where I also use mpirun
to start the jobs? This is an accounting record from that cluster:
12/15/2006 19:04:29;E;232.console;user=lx group=lx jobname=dfdf queue=dpool
ctime=1166180645 qtime=1166180645 etime=1166180645 start=1166180647
exec_host=c1501/1+c1501/0+c1502/1+c1502/0+c1503/1+c1503/0+c1504/1+c1504/0
Resource_List.neednodes=4:ppn=2 Resource_List.nodect=4
Resource_List.nodes=4:ppn=2 session=0 end=1166180669 Exit_status=0
resources_used.cput=00:00:21 resources_used.mem=4544kb
resources_used.vmem=20780kb resources_used.walltime=00:00:22
You can see that the cput is not zero.
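To compare the two clusters, a record like this one can be checked with a small script. The following is only a sketch in Python. It assumes the record is on a single line (it is wrapped in this mail), that the fields after the job id are separated by spaces, and that no value contains a space. The names parse_accounting_record and hms_to_seconds are made up for illustration and are not part of Torque.

```python
def parse_accounting_record(line):
    """Split one accounting line into (timestamp, record_type, job_id, attrs).

    Assumes the layout seen in the record above:
    "<date> <time>;<type>;<job id>;key=value key=value ..."
    """
    timestamp, record_type, job_id, message = line.strip().split(";", 3)
    attrs = {}
    for token in message.split():
        # Split on the first "=" only, so "nodes=4:ppn=2" keeps its value intact.
        key, sep, value = token.partition("=")
        if sep:
            attrs[key] = value
    return timestamp, record_type, job_id, attrs


def hms_to_seconds(text):
    """Convert an HH:MM:SS string such as 00:00:21 to seconds."""
    hours, minutes, seconds = (int(part) for part in text.split(":"))
    return hours * 3600 + minutes * 60 + seconds


# The record quoted above, joined back into one line.
RECORD = (
    "12/15/2006 19:04:29;E;232.console;user=lx group=lx jobname=dfdf queue=dpool "
    "ctime=1166180645 qtime=1166180645 etime=1166180645 start=1166180647 "
    "exec_host=c1501/1+c1501/0+c1502/1+c1502/0+c1503/1+c1503/0+c1504/1+c1504/0 "
    "Resource_List.neednodes=4:ppn=2 Resource_List.nodect=4 "
    "Resource_List.nodes=4:ppn=2 session=0 end=1166180669 Exit_status=0 "
    "resources_used.cput=00:00:21 resources_used.mem=4544kb "
    "resources_used.vmem=20780kb resources_used.walltime=00:00:22"
)

if __name__ == "__main__":
    _, record_type, job_id, attrs = parse_accounting_record(RECORD)
    cput = hms_to_seconds(attrs["resources_used.cput"])
    walltime = hms_to_seconds(attrs["resources_used.walltime"])
    print(job_id, record_type, "cput seconds:", cput, "walltime seconds:", walltime)
    if cput == 0 and walltime > 0:
        print("cput is zero although the job ran; this is the problem described here")
```

Running the same check over the E records from the dual-core cluster would list the jobs whose cput stays at zero while walltime does not.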
But when I run Torque on the cluster with Redhat AS 4.0 and 2-way dual-core
Xeon machines, the problem appears.
This is the hardware and software environment.
Cluster with Redhat AS 4.0 and 2-way single-core CPU machines:
processor : 0
vendor_id : GenuineIntel
cpu family : 6
model : 8
model name : Pentium III (Coppermine)
stepping : 10
cpu MHz : 1000.208
cache size : 256 KB
fdiv_bug : no
hlt_bug : no
f00f_bug : no
coma_bug : no
fpu : yes
fpu_exception : yes
cpuid level : 2
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic mtrr pge mca cmov
pat pse36 mmx fxsr sse
bogomips : 1970.17
processor : 1
vendor_id : GenuineIntel
cpu family : 6
model : 8
model name : Pentium III (Coppermine)
stepping : 10
cpu MHz : 1000.208
cache size : 256 KB
fdiv_bug : no
hlt_bug : no
f00f_bug : no
coma_bug : no
fpu : yes
fpu_exception : yes
cpuid level : 2
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic mtrr pge mca cmov
pat pse36 mmx fxsr sse
bogomips : 1994.75
uname -a:Linux console 2.6.9-5.ELsmp #1 SMP Wed Jan 5 19:30:39 EST 2005
i686 i686 i386 GNU/Linux
Cluster with Redhat AS 4.0 and 2-way dual-core Xeon machines:
[root at ljrstest1 accounting]# cat /proc/cpuinfo
processor : 0
vendor_id : GenuineIntel
cpu family : 6
model : 15
model name : Intel(R) Xeon(R) CPU 5110 @ 1.60GHz
stepping : 6
cpu MHz : 1595.987
cache size : 4096 KB
physical id : 3
siblings : 2
core id : 6
cpu cores : 2
fpu : yes
fpu_exception : yes
cpuid level : 10
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca
cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm syscall nx lm
pni monitor ds_cpl tm2 cx16 xtpr
bogomips : 3195.75
clflush size : 64
cache_alignment : 64
address sizes : 36 bits physical, 48 bits virtual
power management:
processor : 1
vendor_id : GenuineIntel
cpu family : 6
model : 15
model name : Intel(R) Xeon(R) CPU 5110 @ 1.60GHz
stepping : 6
cpu MHz : 1595.987
cache size : 4096 KB
physical id : 0
siblings : 2
core id : 0
cpu cores : 2
fpu : yes
fpu_exception : yes
cpuid level : 10
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca
cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm syscall nx lm
pni monitor ds_cpl tm2 cx16 xtpr
bogomips : 3191.45
clflush size : 64
cache_alignment : 64
address sizes : 36 bits physical, 48 bits virtual
power management:
processor : 2
vendor_id : GenuineIntel
cpu family : 6
model : 15
model name : Intel(R) Xeon(R) CPU 5110 @ 1.60GHz
stepping : 6
cpu MHz : 1595.987
cache size : 4096 KB
physical id : 0
siblings : 2
core id : 1
cpu cores : 2
fpu : yes
fpu_exception : yes
cpuid level : 10
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca
cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm syscall nx lm
pni monitor ds_cpl tm2 cx16 xtpr
bogomips : 3191.46
clflush size : 64
cache_alignment : 64
address sizes : 36 bits physical, 48 bits virtual
power management:
processor : 3
vendor_id : GenuineIntel
cpu family : 6
model : 15
model name : Intel(R) Xeon(R) CPU 5110 @ 1.60GHz
stepping : 6
cpu MHz : 1595.987
cache size : 4096 KB
physical id : 3
siblings : 2
core id : 7
cpu cores : 2
fpu : yes
fpu_exception : yes
cpuid level : 10
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca
cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm syscall nx lm
pni monitor ds_cpl tm2 cx16 xtpr
bogomips : 3191.42
clflush size : 64
cache_alignment : 64
address sizes : 36 bits physical, 48 bits virtual
power management:
uname -a:Linux ljrstest1 2.6.9-42.ELsmp #1 SMP Wed Jul 12 23:32:02 EDT 2006
x86_64 x86_64 x86_64 GNU/Linux
torque version is 2.0.0.p8
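
To confirm that a node really is 2-way dual core, the physical id and core id fields of /proc/cpuinfo can be grouped per package. This is a rough Python sketch that assumes the field names shown in the listing above. The sample text is shortened from that listing, and the function names parse_cpuinfo and cores_per_package are hypothetical. The Pentium III listing has no physical id or core id fields, so for that node the result would be empty.

```python
def parse_cpuinfo(text):
    """Return one dict per logical processor from /proc/cpuinfo-style text."""
    entries = []
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if not sep:
            continue  # blank line or a line without "key : value"
        key, value = key.strip(), value.strip()
        if key == "processor":
            entries.append({})  # a "processor" line starts a new entry
        if entries:
            entries[-1][key] = value
    return entries


def cores_per_package(entries):
    """Map each physical id to the set of core ids seen under it."""
    packages = {}
    for entry in entries:
        if "physical id" in entry and "core id" in entry:
            packages.setdefault(entry["physical id"], set()).add(entry["core id"])
    return packages


# Shortened from the dual-core listing above; only the relevant fields are kept.
SAMPLE = """\
processor : 0
physical id : 3
core id : 6
processor : 1
physical id : 0
core id : 0
processor : 2
physical id : 0
core id : 1
processor : 3
physical id : 3
core id : 7
"""

if __name__ == "__main__":
    packages = cores_per_package(parse_cpuinfo(SAMPLE))
    for physical_id, core_ids in sorted(packages.items()):
        print("package", physical_id, "has", len(core_ids), "cores")
```

On a live node the input would be the contents of /proc/cpuinfo instead of SAMPLE.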