[Mauiusers] [patch] Work around Maui freezes due to the slow responses of Torque server

Eygene Ryabinkin rea+maui at grid.kiae.ru
Mon Jun 23 08:55:22 MDT 2008


Mon, Jun 23, 2008 at 02:00:54PM +0100, Craig Macdonald wrote:
> > 15 minutes one where Maui blocked on read()?
> >   
> Yes, absolutely. See 
> http://www.clusterresources.com/pipermail/torquedev/2007-February/000495.html
> IIRC Maui says it's doing a non-blocking read, but that's not the case in 
> pbs_disconnect.

Ah, OK: the same situation as in my case.
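
For illustration only, here is a minimal sketch of how a read() can be
bounded with poll() so that a silent peer cannot stall the caller
indefinitely.  This is not the patch itself and not code from Maui or
Torque: read_with_timeout and its return codes are hypothetical names
chosen for the example.

```c
#include <errno.h>
#include <poll.h>
#include <unistd.h>

/*
 * Hypothetical helper, not part of Maui or Torque: read up to len bytes
 * from fd, but wait at most timeout_ms milliseconds for data to arrive.
 * Returns the byte count from read() (0 means EOF), -1 on error,
 * or -2 if the timeout expired with nothing to read.
 */
ssize_t read_with_timeout(int fd, void *buf, size_t len, int timeout_ms)
{
    struct pollfd pfd;
    int rc;

    pfd.fd = fd;
    pfd.events = POLLIN;
    pfd.revents = 0;

    /* Retry if a signal interrupts poll(); note that each retry
     * restarts the full timeout, which is acceptable for a sketch. */
    do {
        rc = poll(&pfd, 1, timeout_ms);
    } while (rc == -1 && errno == EINTR);

    if (rc == -1)
        return -1;      /* poll() itself failed */
    if (rc == 0)
        return -2;      /* peer stayed silent for timeout_ms */

    /* Data, EOF or an error condition is pending, so read() returns
     * immediately instead of blocking. */
    return read(fd, buf, len);
}
```

A caller that gets -2 back can give up on the connection instead of
waiting for the server to answer.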

> We use NIS for authentication. I didn't manage to strace pbs_server. I 
> just presumed pbs_server was doing some lookup. I could easily trigger 
> it by submitting about 50 jobs at once.

Fair enough.

> > Maybe my case is not related to yours.  Will you be able to test
> > the patches?
> >   
> I'm sorry, I'm unable to test such a patch, as I don't have root access 
> on our cluster machines.

OK.  But if you or your local administrator are able to test the
patches, that would be very helpful.

Thank you!
Eygene Ryabinkin, Russian Research Centre "Kurchatov Institute"
