[Mauiusers] MAUI not responding - "lost connection to server"

Ben Shepler bcshepl at emory.edu
Wed Mar 11 13:57:07 MDT 2009


Dear Adrian,
I found your description of this problem in the archives. We have just
experienced the same problem and were wondering whether you found a
solution.

best regards,
Ben Shepler

Gianfranco Sciacca wrote:
 > Adrian Sevcenco wrote:
 >> Greenseid, Joseph M. wrote:
 >>
 >>> it says ok for when it is starting up.  does it not actually start?  is
 >>> there a maui process running after you do this?
 >>>
 >> yes, it has a process but when i try to do any command related to maui i
 >> have :
 >> [r... at grid01 log]# checkjob 2
 >> ERROR:    lost connection to server
 >> ERROR:    cannot request service (status)
 >> I attached the log(9) of starting maui.
 >> Can somebody see the problem there?
 >> Thank you,
 >> Adrian
 >>
 > Adrian, are you running nscd by any chance? We have noticed on many of our
 > clients and servers that the nscd process tends to go haywire from time
 > to time and cause all sorts of problems, including the one you mention.
 > The tell-tale would be nscd using 100% CPU on your grid01 machine.
 > Perhaps not your case, but worth checking.
Hi and thanks for the tip but we don't have nscd on this machine.
Adrian


 > cheers,
 > Gianfranco
 >>
 >>>
 >>> --Joe
 >>>
 >>> ------------------------------------------------------------------------
 >>> *From:* mauiusers-boun... at supercluster.org on behalf of Adrian Sevcenco
 >>> *Sent:* Mon 12/15/2008 12:56 PM
 >>> *To:* mauiusers at supercluster.org
 >>> *Subject:* [Mauiusers] MAUI not responding - "lost connection to server"
 >>>
 >>> Hi,
 >>> I have a strange situation :
 >>> when i try to restart the maui server i have :
 >>> [r... at grid01 /]# service maui restart
 >>> Shutting down MAUI Scheduler: ERROR:    lost connection to server
 >>> ERROR:    cannot request service (status)
 >>>                                                            [FAILED]
 >>> Starting MAUI Scheduler:                                   [  OK  ]
 >>>
 >>> The same with firewall down.
 >>> as configuration i have this :
 >>>
 >>> [r... at grid01 maui]# cat maui.cfg
 >>> # MAUI configuration example
 >>>
 >>> SERVERHOST              grid01.spacescience.ro
 >>> ADMIN1                  root
 >>> ADMIN3                  edginfo rgma edguser
 >>> ADMINHOSTS              grid01.spacescience.ro
 >>> RMCFG[base]             TYPE=PBS
 >>> SERVERPORT              40559
 >>> SERVERMODE              NORMAL
 >>>
 >>> # Set PBS server polling interval. If you have short queues or/and
 >>> # jobs it is worth to set a short interval. (10 seconds)
 >>>
 >>> RMPOLLINTERVAL        00:00:10
 >>>
 >>> # a max. 10 MByte log file in a logical location
 >>>
 >>> LOGFILE               /var/log/maui.log
 >>> LOGFILEMAXSIZE        10000000
 >>> LOGLEVEL              1
 >>>
 >>> # Set the delay to 1 minute before Maui tries to run a job again,
 >>> # in case it failed to run the first time.
 >>> # The default value is 1 hour.
 >>>
 >>> DEFERTIME       00:01:00
 >>>
 >>> # Necessary for MPI grid jobs
 >>> ENABLEMULTIREQJOBS TRUE
 >>>
 >>> Any ideas why it is not working? how can i debug this further?
 >>> is there a requirement of something to be in /etc/hosts ?
 >>> Thank you,
 >>> Adrian
 >>>
 >>>
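
On the "how can i debug this further" and /etc/hosts questions above: as far
as I know, Maui client commands such as checkjob reach the scheduler over TCP
at SERVERHOST:SERVERPORT, so one thing worth ruling out is that the name in
maui.cfg fails to resolve, or that nothing accepts connections on that port.
The following is only a rough sketch of such a check, not a Maui tool. The
helper names parse_maui_cfg and check_server are made up for illustration, and
the script only tests name resolution and a plain TCP connect. It cannot tell
whether the scheduler behind the port is healthy.

```python
import socket


def parse_maui_cfg(text):
    """Return (host, port) taken from SERVERHOST / SERVERPORT lines.

    Hypothetical helper: it only understands simple "KEY value" lines and
    ignores everything after a '#'. Missing keys come back as None.
    """
    host, port = None, None
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments
        parts = line.split()
        if len(parts) < 2:
            continue
        key = parts[0].upper()
        if key == "SERVERHOST":
            host = parts[1]
        elif key == "SERVERPORT":
            port = int(parts[1])
    return host, port


def check_server(host, port, timeout=3.0):
    """Resolve host and attempt a TCP connect; return a dict of findings."""
    result = {
        "local_name": socket.gethostname(),  # compare with SERVERHOST by eye
        "resolved": None,
        "connect_ok": False,
        "error": None,
    }
    if host is None or port is None:
        result["error"] = "SERVERHOST or SERVERPORT not found in config"
        return result
    try:
        # Uses the system resolver, so /etc/hosts entries are honoured.
        result["resolved"] = socket.gethostbyname(host)
    except OSError as exc:
        result["error"] = f"cannot resolve {host}: {exc}"
        return result
    try:
        with socket.create_connection((host, port), timeout=timeout):
            result["connect_ok"] = True
    except OSError as exc:
        result["error"] = f"cannot connect to {host}:{port}: {exc}"
    return result
```

To use it, read your maui.cfg into a string, pass it to parse_maui_cfg, and
hand the resulting host and port to check_server. If "resolved" stays None, an
/etc/hosts or DNS entry for the SERVERHOST name is the first thing to look at.
If the connect step fails while a maui process is running, the daemon may be
listening on a different port or not listening at all.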