[torquedev] [Bug 144] Possible memory leak in pbs_server
bugzilla-daemon at supercluster.org
Fri Oct 21 14:21:28 MDT 2011
http://www.clusterresources.com/bugzilla/show_bug.cgi?id=144
Lukasz Flis <l.flis at cyf-kr.edu.pl> changed:
What |Removed |Added
----------------------------------------------------------------------------
CC   |        |l.flis at cyf-kr.edu.pl
--- Comment #2 from Lukasz Flis <l.flis at cyf-kr.edu.pl> 2011-10-21 14:21:28 MDT ---
Hi,
Any progress on this?
We have the same problem with Torque 2.5.8 and Moab 6.1.
As a workaround we use a cron script which restarts the server every hour.
Arnau, what scheduler software are you using? What is the size of your cluster
(nodes/cores/average number of jobs)?
We currently have 1k nodes and around 11k cores, 5k jobs on average, and core
utilization is around 95%.
I am not able to debug the issue myself in production with valgrind because it
slows things down too much. The bigger the cluster, the faster the problem occurs.
Best Regards
--
Lukasz Flis
--
Configure bugmail: http://www.clusterresources.com/bugzilla/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are on the CC list for the bug.