[torqueusers] Re: pbs_mom logging loads of Success(0) get_proc_stat

Garrick Staples garrick at usc.edu
Mon Dec 10 14:51:47 MST 2007


On Mon, Dec 10, 2007 at 09:25:52PM +0100, Michael Meier alleged:
> >>>Mom logs stuff like that:
> >>>>12/07/2007 00:04:10;0001;   pbs_mom;Svr;pbs_mom;Success (0) in 
> >>>>cput_sum, 7058: get_proc_stat
> >>>the mom tries to parse that line in the following way (from 
> >>>torque-2.3.0-snap.200712061242/src/resmom/linux/mom_mach.c):
> >>>fscanf(fd,"%d (%[^)]) %c %d %d %d
> >>>That will probably break on parsing the '(ib_fmr(mthca0))'
> >>>The only proper fix would probably be to look for the last ')' in the 
> >>>whole string.
> >>And here's my suggestion for a patch. Patchfile is against torque 2.2.1.
> >Egads!  An entirely non-backwards compatible problem.  That's another reason
> >why IB sucks!
> 
> In what way is that non-backwards-compatible? Were there ever Linux 
> versions where a ')' appears in any place after the process name string? 
> Unless there were, my patch in no way alters torque's behaviour - except 
> it no longer breaks when special characters appear in a process name.
> And although I don't think it's really a good idea to use brackets in 
> the name, it's still valid, you can't blame IB. It's not like the 
> driver is doing something only a kernel driver could do. Every cluster 
> user could just name his binary 'hi (there)', run it and confuse torque 
> with it. Linux in no way prohibits or filters spaces, '(' or ')'.

Ah, I see now.  I didn't really understand what was going on at first.

I'll test and check in your patch.
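
For readers of the archive, the approach discussed above (take everything
between the first '(' and the last ')' as the process name) can be sketched
roughly as follows. This is not the patch from this thread and not TORQUE's
actual code: the struct stat_head, the function parse_stat_line and the
64-byte name buffer are hypothetical names and sizes chosen for illustration,
and only the first four fields of a /proc/<pid>/stat line are parsed.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical container for the first four fields of /proc/<pid>/stat. */
struct stat_head {
    int  pid;
    char comm[64];   /* assumed buffer size; longer names are truncated */
    char state;
    int  ppid;
};

/*
 * Parse the start of a stat line. The process name is taken to be
 * everything between the first '(' and the LAST ')', so names that
 * themselves contain ')' do not end the field early.
 * Returns 0 on success, -1 if the line does not have the expected shape.
 */
static int parse_stat_line(const char *line, struct stat_head *out)
{
    const char *open  = strchr(line, '(');
    const char *close = strrchr(line, ')');
    size_t len;

    if (open == NULL || close == NULL || close < open)
        return -1;

    /* The pid is the leading integer before the opening parenthesis. */
    if (sscanf(line, "%d", &out->pid) != 1)
        return -1;

    /* Copy the name without the enclosing parentheses. */
    len = (size_t)(close - open - 1);
    if (len >= sizeof(out->comm))
        len = sizeof(out->comm) - 1;
    memcpy(out->comm, open + 1, len);
    out->comm[len] = '\0';

    /* The remaining fields are parsed from just past the last ')'. */
    if (sscanf(close + 1, " %c %d", &out->state, &out->ppid) != 2)
        return -1;

    return 0;
}
```

With a made-up line such as "1234 (ib_fmr(mthca0)) S 2 0 0", this yields the
name "ib_fmr(mthca0)", whereas a "%[^)]" conversion stops at the first ')'
and the literal and "%c" that follow no longer line up with the state field.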
