[torqueusers] pbs_mom process owned by non-root user
Rudge, Chris M. (Dr.)
cmr9 at leicester.ac.uk
Wed Oct 6 04:16:13 MDT 2010
We're seeing a strange problem on our cluster: nodes are being marked offline and, on further investigation, we find that the pbs_mom process has become owned by a normal user. The user has run a job on the node which, as far as we can see, does nothing unusual: it copies files to $TMPDIR, runs a Perl script and copies the output back from $TMPDIR to the user's home directory. Nothing in the Perl script itself looks odd either.
This is not a new pbs_mom process started by the user (an attempt to do that fails, as it should) but rather a process started by init whose ownership has subsequently changed. It happens only for one specific user, though on multiple nodes, and we can see no obvious cause.
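For anyone wanting to check their own nodes for the same symptom, here is a rough sketch of how the ownership could be inspected. It is not something from our setup: it assumes a Linux node with /proc mounted, and the helper names parse_status and find_processes are made up for illustration. It reads the Name, PPid and Uid lines from /proc/<pid>/status, so it shows the real, effective, saved and filesystem UIDs separately, together with the parent PID, which should help tell a daemon started by init (PPid 1) from anything forked later.

```python
import os


def parse_status(text):
    """Extract Name, PPid and the four Uid fields from /proc/<pid>/status text."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    # The Uid line holds four numbers: real, effective, saved set, filesystem.
    real, effective, saved, fs = (int(x) for x in fields["Uid"].split())
    return {
        "name": fields.get("Name", ""),
        "ppid": int(fields.get("PPid", "0")),
        "uid": {"real": real, "effective": effective, "saved": saved, "fs": fs},
    }


def find_processes(name="pbs_mom", proc_root="/proc"):
    """Return (pid, info) pairs for every process whose Name matches `name`."""
    results = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue
        try:
            with open(os.path.join(proc_root, entry, "status")) as fh:
                info = parse_status(fh.read())
        except (OSError, KeyError, ValueError):
            continue  # process exited meanwhile, or status was unreadable
        if info["name"] == name:
            results.append((int(entry), info))
    return results


if __name__ == "__main__":
    for pid, info in find_processes():
        all_root = all(uid == 0 for uid in info["uid"].values())
        note = "" if all_root else "  <-- not fully root"
        print(pid, "ppid=%d" % info["ppid"], info["uid"], note)
```

Run as root on an affected node, this prints one line per pbs_mom process and flags any whose UIDs are not all 0.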
The version of Torque is 2.5.2.
We'd appreciate any suggestions on the likely cause, especially if anyone else has seen similar behaviour.
Dr Chris Rudge - Research Computing Services Manager
IT Services, University of Leicester, LE1 7RH
Tel: +44 (0)116 2522223
email: chris.rudge at leicester.ac.uk