[gold-users] Gold performance issues
Brock Palen
brockp at umich.edu
Tue Aug 15 08:19:54 MDT 2006
We have gold managing stats for 4 clusters. Three are running maui
and one is running moab. We currently have a large default account
and we build our statistics from gold. We plan in the future to use
gold for enforcing actual allocations.
For the last few weeks we have been seeing postgres pegging the
CPU on the system it is running on. The postgres install is just for
gold. The CPU is not pegged all the time, but postmaster racks up a
good few seconds of CPU time for each thread. I am no postgres
master, but the database is held on a RAID and I/O wait is almost
nonexistent. Moab likes to wait on gold and we get messages like:
08/15 09:45:30 ERROR: cannot receive response from allocation-manager server 'cac-admin02.engin.umich.edu':7112
08/15 09:45:30 ALERT: cannot reserve allocation for job 8990 - cannot read message header
These appear many times in the moab logs. Is this because we have a
single large default account, so that gold is asking postgres to go
through all the transactions ever done on that default account (all
jobs for the last year)?
Any insight or postgres tuning pointers would be appreciated. We hope
to add many more nodes (and jobs) to the same gold install in the
future and were disappointed to see it slow down so much.
Side note: we have vacuumed the DB a few times over its life.
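
To line the moab errors up against the postmaster CPU spikes, something
like the following Python sketch could count the messages per minute.
It is only an illustration: it assumes each message sits on a single
line of the moab log with the timestamp and ERROR/ALERT prefix shown
above, and the function name and the "allocation" substring filter are
hypothetical choices, not anything provided by moab or gold.

```python
import re
from collections import Counter

# Timestamp and level prefix as seen in the quoted messages, e.g.
# "08/15 09:45:30 ERROR: cannot receive response from ..."
# The first group keeps the timestamp down to the minute.
LINE_RE = re.compile(r"^(\d{2}/\d{2} \d{2}:\d{2}):\d{2} (ERROR|ALERT):\s+(.*)$")


def count_gold_failures(lines):
    """Count allocation-related ERROR/ALERT lines per minute.

    The "allocation" substring test is an assumption based only on the
    two messages quoted in this post.
    """
    per_minute = Counter()
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match is None:
            continue  # not an ERROR/ALERT line in the assumed format
        minute, _level, message = match.groups()
        if "allocation" in message:
            per_minute[minute] += 1
    return per_minute
```

Passing the open moab log file to count_gold_failures() and looking at
the busiest minutes (for example with most_common()) would show whether
the failures come in bursts that match the postgres CPU usage.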
Brock Palen
Center for Advanced Computing
brockp at umich.edu
(734)936-1985