[torquedev] [Bug 144] Possible memory leak in pbs_server

bugzilla-daemon at supercluster.org
Fri Oct 21 14:21:28 MDT 2011


Lukasz Flis <l.flis at cyf-kr.edu.pl> changed:

           What    |Removed                     |Added
                 CC|                            |l.flis at cyf-kr.edu.pl

--- Comment #2 from Lukasz Flis <l.flis at cyf-kr.edu.pl> 2011-10-21 14:21:28 MDT ---

Any progress on this?

We have the same problem with Torque 2.5.8 and Moab 6.1.
As a workaround, we use a cron script that restarts the server every hour.

Arnau, what scheduler software are you using? What is the size of your cluster
(nodes/cores/average number of jobs)?

We currently have 1k nodes and around 11k cores, with 5k jobs on average; core
utilization is around 95%.

I am not able to debug the issue myself in production with valgrind because it
slows things down too much. The bigger the cluster, the faster the problem occurs.

Best Regards
Lukasz Flis

