[torqueusers] Nodes not showing correct state

Winfried Lorenzen winfried.lorenzen at uni-rostock.de
Fri Feb 15 08:56:42 MST 2013

I have seen the same behavior and have not found a solution yet. Our Maui 
log files show:

ERROR:    cannot get node info: End of File
ALERT:    cannot load cluster resources on RM (RM '...' failed in function 
WARNING:  no resources detected

(You may have to increase the Maui debug level to see these messages.)
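
To check how often these messages occur, something like the following could be used. It is only a minimal sketch: the patterns are copied from the excerpt above and may be worded differently in other Maui versions, and the function name count_rm_failures is made up for illustration.

```python
import re
from collections import Counter

# Patterns copied from the log excerpt in this message. The exact wording
# in other Maui versions may differ (assumption).
PATTERNS = {
    "node_info_eof": re.compile(r"ERROR:\s+cannot get node info: End of File"),
    "rm_load_failed": re.compile(r"ALERT:\s+cannot load cluster resources on RM"),
    "no_resources": re.compile(r"WARNING:\s+no resources detected"),
}


def count_rm_failures(lines):
    """Count how often each resource-manager failure message appears."""
    counts = Counter()
    for line in lines:
        for name, pattern in PATTERNS.items():
            if pattern.search(line):
                counts[name] += 1
    return counts


if __name__ == "__main__":
    # Sample input built from the excerpt above, plus one unrelated line.
    sample = [
        "ERROR:    cannot get node info: End of File",
        "ALERT:    cannot load cluster resources on RM (RM '...' failed in function",
        "WARNING:  no resources detected",
        "INFO:     some unrelated line",
    ]
    for name, n in sorted(count_rm_failures(sample).items()):
        print(name, n)
```

For a real log, pass an open file object instead of the sample list. The location of the log file depends on your installation.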

We have tried Torque 4.1.2 through 4.1.4 with the same result.

W. Lorenzen

On Wednesday, 13 February 2013, 18:55:28, Moye, Roger V wrote:

We are running torque 4.1.2 and maui 3.3.1.
When I run "diagnose -n" I see that all of my nodes report:
WARNING:  node 'nodename' has not been updated in 00:21:52.
I believe this is because Maui is not getting updated node status from 
torque.  But I’m not sure whether this is a torque problem or a maui problem, 
so I’m posting to both lists.  All of the nodes are online and responsive.  
In many cases they are idle.
If I wait long enough (maybe hours) the problem resolves itself but 
reappears a short while later.  If I reset maui the problem is resolved 
but reappears a short while later.
As a result, maui often thinks that nodes are busy while they are in fact 
idle, so jobs wait in the queue even though nodes are free.
Has anyone seen this problem before?
-Roger Moye
Roger V. Moye
Systems Analyst III
University of Texas MD Anderson Cancer Center
Division of Quantitative Sciences
Houston, Texas
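
The "has not been updated" warnings quoted in the original message could be collected per node with something like the following. It is a rough sketch only: it assumes the warning lines look exactly like the one quoted above and that the age is given as HH:MM:SS (or with a leading days field), and the function names are made up for illustration.

```python
import re

# Matches the warning quoted in the original message. The node name may be
# wrapped in plain or typographic quotes, depending on how it was pasted.
STALE_RE = re.compile(
    r"WARNING:\s+node\s+['‘’]?(?P<node>[^'‘’\s]+)['‘’]?\s+"
    r"has not been updated in\s+(?P<age>\d+(?::\d+)*)"
)


def age_to_seconds(age):
    """Convert [DD:]HH:MM:SS to seconds (field layout is an assumption)."""
    parts = [int(p) for p in age.split(":")][-4:]
    parts = [0] * (4 - len(parts)) + parts
    days, hours, minutes, seconds = parts
    return ((days * 24 + hours) * 60 + minutes) * 60 + seconds


def stale_nodes(lines):
    """Return {node name: seconds since last update} for warned nodes."""
    result = {}
    for line in lines:
        match = STALE_RE.search(line)
        if match:
            result[match.group("node")] = age_to_seconds(match.group("age"))
    return result


if __name__ == "__main__":
    # node01 is a placeholder name, not taken from the original message.
    sample = ["WARNING:  node 'node01' has not been updated in 00:21:52."]
    print(stale_nodes(sample))
```

Comparing this list with what torque itself reports for the same nodes may help to tell whether the stale state is on the torque side or the maui side.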

