[torqueusers] Nodes not showing correct state

Moye,Roger V RVMoye at mdanderson.org
Wed Feb 13 11:55:28 MST 2013


We are running Torque 4.1.2 and Maui 3.3.1.

When I run "diagnose -n" I see that all of my nodes report:
WARNING:  node 'nodename' has not been updated in 00:21:52.

I believe this is because Maui is not getting updated node status from Torque, but I'm not sure whether this is a Torque problem or a Maui problem, so I'm posting to both lists. All of the nodes are online and responsive, and in many cases they are idle.
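
For anyone who wants to track how stale the nodes get over time, here is a minimal sketch in Python (not something from our setup) that pulls node names and ages out of warning lines like the one quoted above. It assumes the output format is exactly as shown, with an HH:MM:SS age; the function name stale_nodes and the 5-minute threshold are made up for illustration.

```python
import re
from datetime import timedelta

# Pattern for the warning line quoted above. The exact spacing and the
# HH:MM:SS age format are assumptions based on that single sample line.
WARNING_RE = re.compile(
    r"WARNING:\s+node '(?P<node>[^']+)' has not been updated in "
    r"(?P<h>\d+):(?P<m>\d{2}):(?P<s>\d{2})"
)


def stale_nodes(diagnose_output, threshold=timedelta(minutes=5)):
    """Return {node_name: age} for nodes whose reported age >= threshold.

    diagnose_output is the captured text of "diagnose -n".
    The threshold default is an arbitrary illustrative value.
    """
    stale = {}
    for line in diagnose_output.splitlines():
        match = WARNING_RE.search(line)
        if match is None:
            continue  # not a "has not been updated" warning
        age = timedelta(
            hours=int(match.group("h")),
            minutes=int(match.group("m")),
            seconds=int(match.group("s")),
        )
        if age >= threshold:
            stale[match.group("node")] = age
    return stale


if __name__ == "__main__":
    sample = "WARNING:  node 'nodename' has not been updated in 00:21:52.\n"
    for node, age in stale_nodes(sample).items():
        print(node, age)
```

Saving the output of "diagnose -n" to a file at intervals and feeding it to this function would show whether the ages keep growing or reset periodically.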

If I wait long enough (maybe hours) the problem resolves itself, but it reappears a short while later. If I reset Maui, the problem is also resolved, but again it reappears a short while later.

As a result, Maui often thinks that nodes are busy when they are actually idle, so jobs wait in the queue even though nodes are available.

Has anyone seen this problem before?

Thanks!
-Roger Moye

-----------------------------------------------------------
Roger V. Moye
Systems Analyst III
University of Texas MD Anderson Cancer Center
Division of Quantitative Sciences
FCT4.6109
Houston, Texas
-----------------------------------------------------------
