[torqueusers] Problem with mlockall in resmom on AIX: moms die with an out-of-memory condition.

Josh Butikofer josh at clusterresources.com
Tue Mar 31 09:57:19 MDT 2009


We'll take a closer look at this on our AIX machines as well to see if this fix 
should be universally adopted.
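
For reference, the guard proposed in the quoted message can be sketched as a 
small standalone C function. This is only an illustration, not the actual code 
from src/resmom/mom_main.c: the helper name lock_process_memory, its return 
values, and the MCL_CURRENT | MCL_FUTURE flags are assumptions made for the 
example. The only part taken from the report is the preprocessor condition.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Hypothetical helper (not a TORQUE function): try to pin the process in
 * memory, but compile the call out on AIX, as suggested in the report.
 *
 * Returns  1 if memory was locked,
 *          0 if locking was skipped at compile time,
 *         -1 if mlockall() was attempted and failed.
 */
static int lock_process_memory(void)
{
#if defined(_POSIX_MEMLOCK) && !defined(_AIX)
    /* Flags are an assumption; the report does not show the real call. */
    if (mlockall(MCL_CURRENT | MCL_FUTURE) == -1) {
        fprintf(stderr, "mlockall failed: %s\n", strerror(errno));
        return -1;
    }
    return 1;
#else
    /* AIX, or no POSIX memory locking: leave the process swappable. */
    return 0;
#endif
}
```

A caller would typically treat -1 as a warning rather than a fatal error, 
since mlockall can fail for unprivileged processes or under a low locked-memory 
limit.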

Josh Butikofer
Cluster Resources, Inc.
#############################


Michael Marti wrote:
> Dear All
> 
> pbs_mom of torque-2.3.6 breaks on aix:
> 
> In the file src/resmom/mom_main.c on line 7395 mlockall is called to 
> keep the OS from swapping resmom. It turns out that this call 
> of mlockall causes the memory consumption of the pbs_mom process to jump 
> from about 1 MB to more than 250 MB.
> 
> A quick fix is to replace line 7386 which reads
>   #ifdef _POSIX_MEMLOCK
> with
>   #if defined(_POSIX_MEMLOCK) && !defined(_AIX)
> This solves the problem for us.
> 
> It would be nice if this were fixed in a future version of torque.
> 
> The question remains why mlockall behaves that way on AIX.
> 
> uname -a on a node: AIX r1blade066 3 5 00003222D100
> 
> 
> Best regards,
> Michael Marti
> 
> -- 
> ----------------------------------------------------------------------------
> Michael Marti
> Instituto Superior Técnico
> Instituto de Plasmas e Fusão Nuclear
> Complexo Interdisciplinar
> Av. Rovisco Pais
> 1049-001 Lisboa
> Portugal
> 
> 
> Tel:       +351 218 419 379
> Fax:      +351 218 464 455
> Mobile:  +351 968 434 327
> ----------------------------------------------------------------------------
> 
> 

