[Mauiusers] [patch] Work around Maui freezes due to the slow responses of Torque server

Eygene Ryabinkin rea+maui at grid.kiae.ru
Mon Jun 23 08:55:22 MDT 2008


Craig,

Mon, Jun 23, 2008 at 02:00:54PM +0100, Craig Macdonald wrote:
> > 15 minutes one where Maui blocked on read()?
> >   
> Yes, absolutely. See 
> http://www.clusterresources.com/pipermail/torquedev/2007-February/000495.html
> IIRC Maui says it's doing a non-blocking read, but that's not the case in 
> pbs_disconnect.

Ah, OK: the same situation as in my case.
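
For illustration only, the general idea of bounding a read with a
timeout, so that a slow peer cannot stall the caller indefinitely, could
be sketched as below.  This is a minimal Python sketch, not code from
Maui, Torque, or the patch itself; the name read_with_timeout and its
parameters are hypothetical.

```python
import select
import socket


def read_with_timeout(sock, nbytes, timeout_s):
    """Return up to nbytes from sock, b"" on EOF, or None on timeout.

    Hypothetical helper for illustration; not part of Maui or Torque.
    """
    # Wait until the socket becomes readable, but no longer than timeout_s.
    readable, _, _ = select.select([sock], [], [], timeout_s)
    if not readable:
        # The peer is too slow: report a timeout instead of blocking in recv().
        return None
    # Data (or EOF) is available, so this recv() returns without waiting.
    return sock.recv(nbytes)
```

A caller would treat None as "the server did not answer in time" and
retry or give up, rather than sitting in a blocking read().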

> We use NIS for authentication. I didn't manage to strace pbs_server. I 
> just presumed pbs_server was doing some lookup. I could easily trigger 
> it by submitting about 50 jobs at once.

Fair enough.

> > Maybe my case is not related to yours.  Will you be able to test
> > the patches?
> >   
> I'm sorry, I'm unable to test such a patch, as I don't have root access 
> on our cluster machines.

OK.  But if you or your local administrator are able to test them,
that would be very helpful.

Thank you!
-- 
Eygene Ryabinkin, Russian Research Centre "Kurchatov Institute"

