[torqueusers] Multiple moms

David Singleton David.Singleton at anu.edu.au
Thu May 22 15:09:32 MDT 2008


Glen Beane wrote:
> On Thu, May 22, 2008 at 10:17 AM, Charles Johnson <
> charles.johnson at accre.vanderbilt.edu> wrote:
> 
>> We use nagios to monitor a range of conditions on our cluster, and an
>> oddity has shown up. We monitor the number of pbs_mom processes running on
>> each node, and nagios was set to report more than one mom on a node. We
>> have occasionally seen as many as three. Moreover, a few of the moms have
>> user uids rather than root, even though only root can start a mom. We have
>> altered nagios to ignore mom counts below 5.
>>
>> Does anyone have an explanation or, better yet, can anyone point me to
>> appropriate documentation?
> 
> 
> I can't point you to any documentation, but this is normal behavior.  In
> several cases the mom will fork a child process to do some task that may
> take a while to complete, so the parent mom can remain responsive.  The moms
> that run under a user's uid are usually copying output files back to that
> user's home directory.
> 

And I think you will see a couple of extra moms for each interactive
(qsub -I) job but, in this case, they are owned by root.
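
Since the extra moms for interactive jobs are owned by root, filtering on
uid alone would not separate them from the real daemon. One possible
approach for a monitoring check, sketched below, is to count only the
pbs_mom processes whose parent is not itself a pbs_mom. This is a rough
sketch, not something taken from TORQUE: the function names are made up
for illustration, and it assumes that the forked helpers stay direct
children of the parent mom and that `ps -eo pid,ppid,user,comm` is
available (as with procps on Linux).

```python
import subprocess


def top_level_moms(ps_text, name="pbs_mom"):
    """Return sorted PIDs of `name` processes whose parent is not also `name`.

    `ps_text` is expected to look like the output of
    `ps -eo pid,ppid,user,comm` (an assumption of this sketch).
    """
    procs = []
    for line in ps_text.strip().splitlines():
        fields = line.split(None, 3)
        if len(fields) < 4 or not fields[0].isdigit():
            continue  # skip the header line and anything malformed
        pid, ppid, user, comm = fields
        procs.append((int(pid), int(ppid), user, comm.strip()))

    # PIDs of every process with the daemon's name, parents and children alike
    mom_pids = {pid for pid, _, _, comm in procs if comm == name}

    # Keep only moms whose parent is some other program (typically init)
    return sorted(
        pid for pid, ppid, _, comm in procs
        if comm == name and ppid not in mom_pids
    )


def read_ps():
    """Capture the live process table; hypothetical helper for a nagios check."""
    result = subprocess.run(
        ["ps", "-eo", "pid,ppid,user,comm"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout
```

A check could then alert when len(top_level_moms(read_ps())) is not
exactly 1, rather than ignoring any count below 5. If a helper ever
detaches from the parent mom, it would be counted as top-level here, so
the result is worth verifying on a real node first.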


