[torqueusers] momctl -d 0 output
charles.johnson at accre.vanderbilt.edu
Wed Mar 5 14:59:28 MST 2014
Most of our jobs are single-node, single-core jobs. From time to time I
find nodes whose output looks similar to this:
root@vmp1010:~# momctl -d 0
Host: vmp1010/vmp1010 Version: 4.2.4 PID: 7664
Server: vmpsched (10.0.0.3:15001)
Last Msg From Server: 21 seconds (Commit)
Last Msg To Server: 10528 seconds
MOM active: 10499 seconds
LogLevel: 7 (use SIGUSR1/SIGUSR2 to adjust)
Sometimes the time since the last message between the MOM and the server
gets very large, and no jobs appear on the node.
Is this unusual? Is it "normal" for the MOM to have such infrequent
communication with the server? The communication intervals for Torque are
all at their defaults of either 300 or 600 seconds.
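
For anyone who wants to check nodes for this condition automatically, here
is a rough Python sketch that pulls the timing fields out of `momctl -d 0`
output and flags a node when "Last Msg To Server" exceeds a limit. The
function names (parse_momctl, is_stale) and the 600-second limit are my own
assumptions for illustration, not anything provided by Torque; the sample
text is the output shown above.

```python
import re

# Sample text copied from the momctl -d 0 output on vmp1010.
SAMPLE = """\
Host: vmp1010/vmp1010 Version: 4.2.4 PID: 7664
Server: vmpsched (10.0.0.3:15001)
Last Msg From Server: 21 seconds (Commit)
Last Msg To Server: 10528 seconds
MOM active: 10499 seconds
LogLevel: 7 (use SIGUSR1/SIGUSR2 to adjust)
"""

# Match only the three "<label>: <N> seconds" lines we care about.
_FIELD = re.compile(
    r"^\s*(Last Msg From Server|Last Msg To Server|MOM active):\s+(\d+) seconds",
    re.MULTILINE,
)


def parse_momctl(text):
    """Return a dict mapping each timing label to its value in seconds."""
    return {label: int(secs) for label, secs in _FIELD.findall(text)}


def is_stale(info, limit=600):
    """True if the MOM has not messaged the server within `limit` seconds.

    The default of 600 is an assumed threshold (the larger of the two
    default intervals mentioned above), not a value defined by Torque.
    """
    return info.get("Last Msg To Server", 0) > limit


if __name__ == "__main__":
    info = parse_momctl(SAMPLE)
    print(info)
    print("stale:", is_stale(info))
```

In practice the text would come from running `momctl -d 0` on each node
(for example over ssh) rather than from a hard-coded string.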
Any insight would be appreciated.
Charles N. Johnson
Advanced Computing Center for Research and Education