[torqueusers] Re: pbs_mom logging loads of Success(0) get_proc_stat
Garrick Staples
garrick at usc.edu
Mon Dec 10 14:51:47 MST 2007
On Mon, Dec 10, 2007 at 09:25:52PM +0100, Michael Meier alleged:
> >>>Mom logs stuff like that:
> >>>>12/07/2007 00:04:10;0001; pbs_mom;Svr;pbs_mom;Success (0) in
> >>>>cput_sum, 7058: get_proc_stat
> >>>the mom tries to parse that line in the following way (from
> >>>torque-2.3.0-snap.200712061242/src/resmom/linux/mom_mach.c):
> >>>fscanf(fd,"%d (%[^)]) %c %d %d %d
> >>>That will probably break on parsing the '(ib_fmr(mthca0))'
> >>>The only proper fix would probably be to look for the last ')' in the
> >>>whole string.
> >>And here's my suggestion for a patch. Patchfile is against torque 2.2.1.
> >Egads! An entirely non-backwards compatible problem. That's another
> >reason why IB sucks!
>
> In what way is that non-backwards-compatible? Were there ever linux
> versions where a ')' appears in any place after the process name string?
> Unless there were, my patch in no way alters torque's behaviour - except
> it no longer breaks when special characters appear in a process name.
> And although I don't think it's really a good idea to use brackets in
> the name, it's still valid, you can't blame IB. It's not like the
> driver is doing something only a kernel driver could do. Every cluster
> user could just name his binary 'hi (there)', run it and confuse torque
> with it. Linux does in no way prohibit or filter spaces, '(' or ')'.
Ah, I see now. I didn't really understand what was going on at first.
I'll test and check in your patch.
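
For anyone following along, here is a minimal sketch of the idea in the
quoted text: take the process name as everything between the first '('
and the last ')' in the stat line, instead of stopping at the first ')'
the way "%[^)]" does. This is not the patch from this thread and not
the code in mom_mach.c; the struct and function names (stat_head,
parse_stat_head) and the 64-byte name buffer are made up for
illustration, and only the first three fields are parsed.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical container for the first three fields of a
 * /proc/<pid>/stat line. Not a TORQUE structure. */
struct stat_head {
    int  pid;
    char comm[64];   /* buffer size is an arbitrary choice for this sketch */
    char state;
};

/* Hypothetical helper: returns 0 on success, -1 if the line does not
 * look like "<pid> (<name>) <state> ...". */
static int parse_stat_head(const char *line, struct stat_head *out)
{
    const char *open  = strchr(line, '(');   /* first '(' starts the name */
    const char *close = strrchr(line, ')');  /* last ')' ends the name    */
    size_t len;

    if (open == NULL || close == NULL || close < open)
        return -1;

    /* The pid is the leading integer, before the name. */
    if (sscanf(line, "%d", &out->pid) != 1)
        return -1;

    /* Copy the name verbatim, including any spaces or brackets in it,
     * truncating if it does not fit the buffer. */
    len = (size_t)(close - open - 1);
    if (len >= sizeof(out->comm))
        len = sizeof(out->comm) - 1;
    memcpy(out->comm, open + 1, len);
    out->comm[len] = '\0';

    /* The fields after the name are a state letter and numbers, so
     * parsing can resume right after the last ')'. */
    if (sscanf(close + 1, " %c", &out->state) != 1)
        return -1;

    return 0;
}
```

With a line starting "1234 (ib_fmr(mthca0)) S", this should yield the
name "ib_fmr(mthca0)" and state 'S'. A "%[^)]" conversion would stop at
the first ')' and then read the second ')' as the state character.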