[torqueusers] Re: pbs_mom logging loads of Success(0) get_proc_stat

Michael Meier Michael.Meier at rrze.uni-erlangen.de
Fri Dec 7 10:28:25 MST 2007


> Mom logs stuff like that:
>> 12/07/2007 00:04:10;0001;   pbs_mom;Svr;pbs_mom;Success (0) in cput_sum, 7058: get_proc_stat
> the mom tries to parse that line in the following way (from 
> torque-2.3.0-snap.200712061242/src/resmom/linux/mom_mach.c):
> fscanf(fd,"%d (%[^)]) %c %d %d %d
> That will probably break when parsing '(ib_fmr(mthca0))', because it 
> assumes the first ')' is the closing bracket, which is not true here.
> 'man 5 proc' suggests using '%s', but that would be even worse than the 
> current '%[^)]', breaking on every executable name that contains a 
> space. And what if someone wants to run a monster like the following:
> 6849 (te (s)( ))t)) S 25614 6849 25614 34838 6849 4194304 161 0 0 0 0 0 
> 0 0 20 0 1 0 36168980 2564096 77 18446744073709551615 4194304 4195956 
> 140736421683184 18446744073709551615 47252866936498 0 0 0 0 0 0 0 17 0 0 
> 0 0
> The only proper fix would probably be to look for the last ')' in the 
> whole string.

And here's my suggestion for a patch. Patchfile is against torque 2.2.1.
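
For readers of the archive, where the attachment is not shown inline, here
is a rough sketch of the "look for the last ')'" idea. It is not the
attached patch. The function name parse_stat_line and its signature are
made up for illustration and do not appear in mom_mach.c. It assumes the
whole stat line is already in a buffer and that every field after the
command name is a single state character or a number, so no ')' can occur
after the name.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not from the TORQUE source): split one line of
 * /proc/<pid>/stat into pid, command name and state. The command name is
 * taken to be everything between the first '(' and the LAST ')'. */
static int parse_stat_line(const char *line, int *pid,
                           char *comm, size_t commsz, char *state)
{
    const char *open = strchr(line, '(');    /* start of the name */
    const char *close = strrchr(line, ')');  /* last ')' ends the name */
    size_t len;

    if (open == NULL || close == NULL || close < open || commsz == 0)
        return -1;

    /* the pid is the only field in front of the name */
    if (sscanf(line, "%d", pid) != 1)
        return -1;

    len = (size_t)(close - open - 1);
    if (len >= commsz)
        len = commsz - 1;                    /* truncate, never overflow */
    memcpy(comm, open + 1, len);
    comm[len] = '\0';

    /* the remaining fields start right after the last ')' */
    if (sscanf(close + 1, " %c", state) != 1)
        return -1;

    return 0;
}
```

Fed the "monster" line quoted above, this yields the name 'te (s)( ))t)'
and state 'S'. The numeric fields that follow could then be read with an
ordinary sscanf() starting at close + 1.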
-- 
Michael Meier, HPC Services
Friedrich-Alexander-Universitaet Erlangen-Nuernberg
Regionales Rechenzentrum Erlangen
Martensstrasse 1, 91058 Erlangen, Germany
Tel.: +49 9131 85-28973, Fax: +49 9131 302941
michael.meier at rrze.uni-erlangen.de
www.rrze.uni-erlangen.de/hpc/
-------------- next part --------------
A non-text attachment was scrubbed...
Name: mom_mach_fix.patch
Type: text/x-diff
Size: 2798 bytes
Desc: not available
Url : http://www.supercluster.org/pipermail/torqueusers/attachments/20071207/a4f947b9/mom_mach_fix-0001.bin

