[torqueusers] momctl -d 0 output

Charles Johnson charles.johnson at accre.vanderbilt.edu
Wed Mar 5 14:59:28 MST 2014


Most of our jobs are single-node, single-core jobs. From time to time I 
find nodes whose momctl output looks similar to this:

root@vmp1010:~# momctl -d 0

Host: vmp1010/vmp1010   Version: 4.2.4   PID: 7664
Server[0]: vmpsched (10.0.0.3:15001)
   Last Msg From Server:   21 seconds (Commit)
   Last Msg To Server:     10528 seconds
HomeDirectory:          /usr/spool/PBS/mom_priv
MOM active:             10499 seconds
LogLevel:               7 (use SIGUSR1/SIGUSR2 to adjust)

Sometimes the message times between the mom and the server get very 
large, and no jobs appear on the node.

Is this unusual? Is it "normal" for the mom to communicate with the 
server so infrequently? The communication intervals for Torque are all 
at their defaults of either 300 seconds or 600 seconds.
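
To spot affected nodes, the two "Last Msg" lines can be pulled out of 
saved momctl -d 0 output. The following is only a minimal Python sketch, 
not part of Torque: the function names and the 1200-second threshold 
(twice the larger default mentioned above) are assumptions chosen for 
illustration, and the sample text is copied from the output shown earlier.

```python
import re

# Hypothetical threshold: twice the 600-second default mentioned above.
STALE_AFTER_SECONDS = 1200

# Matches lines such as "Last Msg To Server:     10528 seconds".
MSG_LINE = re.compile(r"Last Msg (From|To) Server:\s+(\d+) seconds")


def parse_msg_ages(momctl_output):
    """Return {'From': seconds, 'To': seconds} parsed from momctl -d 0 text."""
    ages = {}
    for match in MSG_LINE.finditer(momctl_output):
        ages[match.group(1)] = int(match.group(2))
    return ages


def stale_directions(momctl_output, limit=STALE_AFTER_SECONDS):
    """List the directions whose last message is older than `limit` seconds."""
    ages = parse_msg_ages(momctl_output)
    return sorted(d for d, age in ages.items() if age > limit)


# Sample copied from the momctl output in this message.
SAMPLE = """\
Host: vmp1010/vmp1010   Version: 4.2.4   PID: 7664
Server[0]: vmpsched (10.0.0.3:15001)
   Last Msg From Server:   21 seconds (Commit)
   Last Msg To Server:     10528 seconds
"""

if __name__ == "__main__":
    print(parse_msg_ages(SAMPLE))
    print(stale_directions(SAMPLE))
```

On the sample above, only the mom-to-server direction exceeds the 
assumed threshold; the server-to-mom message is recent.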

Any insight would be appreciated.

~Charles~

-- 
Charles N. Johnson
Advanced Computing Center for Research and Education
Vanderbilt University
Office: 615-343-4134
Cell: 615-478-7788


