[torquedev] [Bug 144] Possible memory leak in pbs_server

bugzilla-daemon at supercluster.org
Fri Oct 21 14:21:28 MDT 2011


http://www.clusterresources.com/bugzilla/show_bug.cgi?id=144

Lukasz Flis <l.flis at cyf-kr.edu.pl> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |l.flis at cyf-kr.edu.pl

--- Comment #2 from Lukasz Flis <l.flis at cyf-kr.edu.pl> 2011-10-21 14:21:28 MDT ---
Hi,

Any progress on this?

We have the same problem with Torque 2.5.8 and Moab 6.1.
As a workaround we use a cron script which restarts the server every hour.

Arnau, what scheduler software are you using? What is the size of your cluster
(nodes/cores/average number of jobs)?

We currently have 1k nodes and around 11k cores, with 5k jobs on average and
core utilization around 95%.

I am not able to debug the issue myself in production with valgrind because it
slows things down too much. The bigger the cluster, the sooner the problem occurs.
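
Where valgrind is too slow for production, a lighter option is to sample the
resident memory of the pbs_server process over time and estimate its growth
rate. The following is only a minimal sketch, assuming a Linux host with /proc;
the function names are hypothetical and are not part of Torque or Moab.

```python
import re
from typing import List, Optional, Tuple


def parse_vmrss_kb(status_text: str) -> Optional[int]:
    """Extract the VmRSS value (in kB) from the text of /proc/<pid>/status."""
    match = re.search(r"^VmRSS:\s+(\d+)\s+kB", status_text, re.MULTILINE)
    return int(match.group(1)) if match else None


def read_rss_kb(pid: int) -> Optional[int]:
    """Read the current resident set size of a process (Linux only)."""
    with open(f"/proc/{pid}/status") as status_file:
        return parse_vmrss_kb(status_file.read())


def growth_kb_per_hour(samples: List[Tuple[float, int]]) -> float:
    """Average RSS growth between the first and last (unix_time, rss_kb) sample."""
    if len(samples) < 2:
        return 0.0
    (t_first, rss_first), (t_last, rss_last) = samples[0], samples[-1]
    if t_last == t_first:
        return 0.0
    return (rss_last - rss_first) * 3600.0 / (t_last - t_first)
```

Calling read_rss_kb() with the pbs_server PID from the same cron job, just
before each restart, and logging the result would show whether growth tracks
job throughput. It does not locate the leak; it only quantifies it.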

Best Regards
--
Lukasz Flis

-- 
Configure bugmail: http://www.clusterresources.com/bugzilla/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are on the CC list for the bug.

