[torqueusers] stability observations
garrick at usc.edu
Thu Apr 6 16:02:33 MDT 2006
On Thu, Apr 06, 2006 at 12:57:47PM -0700, Alexander Saydakov alleged:
> After a few months of running 2.0.0p7 and 2.0.0p8 on FreeBSD 4.10 I observed
> the following:
> 1. pbs_sched has a memory leak. Its footprint keeps growing every day,
> so after a fresh start it reaches 300M in a few days
Can you capture this in valgrind? (or whatever freebsd has)
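If valgrind is not an option on that platform, a rough first step is to log the scheduler's resident size at intervals and see whether the growth is steady. The sketch below is illustrative only and is not part of TORQUE: the helper names (rss_kb, growth_kb_per_hour, watch) are made up for this example, and it assumes the local ps accepts `-o rss= -p PID` and reports kilobytes, which should be checked against the system's ps man page.

```python
import subprocess
import sys
import time


def rss_kb(pid):
    """Return the resident set size of `pid` as reported by ps.

    Assumes `ps -o rss= -p PID` prints a single number (in KB) with no header.
    """
    out = subprocess.check_output(["ps", "-o", "rss=", "-p", str(pid)])
    return int(out.decode().strip())


def growth_kb_per_hour(samples):
    """Least-squares slope of (seconds, rss_kb) samples, scaled to KB per hour."""
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples")
    mean_t = sum(t for t, _ in samples) / n
    mean_r = sum(r for _, r in samples) / n
    num = sum((t - mean_t) * (r - mean_r) for t, r in samples)
    den = sum((t - mean_t) ** 2 for t, _ in samples)
    if den == 0:
        raise ValueError("samples must span more than one point in time")
    return num * 3600.0 / den


def watch(pid, interval=600, count=6):
    """Sample the RSS of `pid` every `interval` seconds, `count` times."""
    start = time.time()
    samples = []
    for i in range(count):
        samples.append((time.time() - start, rss_kb(pid)))
        if i < count - 1:
            time.sleep(interval)
    return samples


if __name__ == "__main__":
    # Usage: python rss_watch.py <pid of pbs_sched>
    data = watch(int(sys.argv[1]))
    print("approx. growth: %.1f KB/hour" % growth_kb_per_hour(data))
```

A steady positive slope over several hours supports the leak report, but it does not say where the memory goes. That still needs valgrind or a similar tool.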
> 2. pbs_sched has some bug in the algorithm. Quite often it picks up
> some random jobs from lower priority queues despite a lot of jobs in
> higher priority queues.
I don't know how much support you are going to get for this. No one is
> 3. pbs_server is unstable when some configuration changes are made.
> Strangely, it can crash a few minutes after a change. Not all
> changes are bad. Adding nodes and queues, or adjusting their parameters is
> fine. After deleting nodes (with patch! With no patch it died immediately),
> for instance, it died within a few hours. If you don't touch it, it runs
Can you capture this in gdb?
Garrick Staples, Linux/HPCC Administrator
University of Southern California