[Mauiusers] Most "stable" version of Maui

Bas van der Vlies basv at sara.nl
Fri Jan 11 10:26:56 MST 2008


Michael Barnes wrote:
> On Fri, Jan 11, 2008 at 04:57:16PM +0100, Bas van der Vlies wrote:
>> Michael Barnes wrote:
>>> Maui users,
>>>
>> Michael,
>>
>>  Try the latest snapshot of maui (maui-3.2.6p20-snap.1182974819). If I
>> remember correctly, there is a bug in maui-3.2.6p19: a patch was not applied
>> correctly, and therefore you get a segv.
>>
>> I am also running the latest snapshot without any problems.
> 
> Maybe this is a Fedora Core 7 thing.  I just compiled and installed this
> snapshot.  This is how I ran configure:
> 
> # these are the same flags that all of the FC7 RPMs use
> 
> export CFLAGS="-D__M64 -O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector --param=ssp-buffer-size=4 -m64 -mtune=generic"
> 
> cd maui-3.2.6p20/
> 
> ./configure --prefix=/usr/local
> 
> 
> And it ran 2 jobs, and now it's acting funny.
> 
> Sometimes jobs will run, sometimes not.
> 
> I also get this:
> 
> checkjob 169101.pbsold
> ERROR:    lost connection to server
> ERROR:    cannot request service (status)
> 
> 
> Same with:
> 
> showq
> ERROR:    lost connection to server
> ERROR:    cannot request service (status)
> 
> 
> I do an strace on the running maui process and I see:
> 
> select(0, NULL, NULL, NULL, {0, 100000}) = 0 (Timeout)
> select(1024, [8], NULL, NULL, {0, 10000}) = 0 (Timeout)
> accept(5, 0x7fff83ee6510, [9506649594159693840]) = -1 EAGAIN (Resource temporarily unavailable)
> select(0, NULL, NULL, NULL, {0, 100000}) = 0 (Timeout)
> select(1024, [8], NULL, NULL, {0, 10000}) = 0 (Timeout)
> accept(5, 0x7fff83ee6510, [9506649594159693840]) = -1 EAGAIN (Resource temporarily unavailable)
> select(0, NULL, NULL, NULL, {0, 100000}) = 0 (Timeout)
> select(1024, [8], NULL, NULL, {0, 10000}) = 0 (Timeout)
> accept(5, 0x7fff83ee6510, [9506649594159693840]) = -1 EAGAIN (Resource temporarily unavailable)
> select(0, NULL, NULL, NULL, {0, 100000}) = 0 (Timeout)
> select(1024, [8], NULL, NULL, {0, 10000}) = 0 (Timeout)
> accept(5, 0x7fff83ee6510, [9506649594159693840]) = -1 EAGAIN (Resource temporarily unavailable)
> select(0, NULL, NULL, NULL, {0, 100000}) = 0 (Timeout)
> select(1024, [8], NULL, NULL, {0, 10000}) = 0 (Timeout)
> accept(5, 0x7fff83ee6510, [9506649594159693840]) = -1 EAGAIN (Resource temporarily unavailable)
> 
> over and over again.
> 
> An strace on the client command says this many times (as root and me):
> 
> bind(6, {sa_family=AF_INET, sin_port=htons(831), sin_addr=inet_addr("0.0.0.0")}, 16) = -1 EACCES (Permission denied)
> 
> 
> I see nothing similar to the working version (meaning there is no bind()
> call).
> 
> 
> 
> I don't know what else to try besides reinstalling the OS in 32bit mode,
> which is not a big deal.  But if anybody has any suggestions, I'm open
> to them.
> 
> Another piece of information is that I am running the pbs_server in
> debug mode, but AFAIK, this only keeps it from forking and it dumps out
> some stuff on the terminal.
> 
> 
> 
> I don't know what more to try.
> 
> 
> 
> -mb
> 
> 
> 
> --
> +-----------------------------------------------
> | Michael Barnes
> |
> | Thomas Jefferson National Accelerator Facility
> | 12000 Jefferson Ave.
> | Newport News, VA 23606
> | (757) 269-7634
> +-----------------------------------------------

Did you try setting the loglevel to 9 and checking maui.log for error 
messages? All tools, including the client tools (diagnose, checkjob, ...), 
communicate with the server. So if the server (maui) crashes, nothing 
works anymore.
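
If the log gets large at loglevel 9, a small filter can help pull out the 
interesting lines. This is only a rough sketch: the severity keywords 
(ERROR, ALERT, WARNING) are my assumption about what shows up in maui.log, 
and the function names are made up for illustration, so adjust both to 
what you actually see.

```python
import re
from typing import Iterable, List

# Assumed severity markers; change these to match your own maui.log.
SEVERITY = re.compile(r"\b(ERROR|ALERT|WARNING)\b")


def find_problem_lines(lines: Iterable[str]) -> List[str]:
    """Return the lines that contain one of the assumed severity markers."""
    return [line.rstrip("\n") for line in lines if SEVERITY.search(line)]


def scan_log(path: str) -> List[str]:
    """Read a log file (path is whatever your install uses) and filter it."""
    with open(path, errors="replace") as fh:
        return find_problem_lines(fh)
```

Calling scan_log() with the path to your maui.log returns the matching 
lines in file order, so the last few entries show what maui complained 
about just before the clients lost the connection.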


It seems like maui closes the socket; maybe it is IPv6-related or something. 
Just a guess.
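
To separate "maui is not listening at all" from "maui accepts and then 
drops the connection", a plain TCP probe against the scheduler port might 
help. A minimal sketch follows; the function name is hypothetical, and the 
port in the usage note is an assumption (take the real value from 
SERVERPORT in your maui.cfg).

```python
import socket


def server_accepts(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port can be established."""
    try:
        # create_connection tries each address the host name resolves to.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Refused, timed out, unreachable, or name lookup failed.
        return False
```

For example, server_accepts("localhost", 42559) would probe what I believe 
is the default port. Note that True only means the kernel completed the 
handshake; it does not prove that maui ever answers the request.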

Regards


-- 
********************************************************************
*                                                                  *
*  Bas van der Vlies                     e-mail: basv at sara.nl      *
*  SARA - Academic Computing Services    phone:  +31 20 592 8012   *
*  Kruislaan 415                         fax:    +31 20 6683167    *
*  1098 SJ Amsterdam                                               *
*                                                                  *
********************************************************************

