[gold-users] Gold system getting bogged down
Kevin Van Workum
vanw at sabalcore.com
Mon Dec 7 10:48:24 MST 2009
On Thu, Dec 3, 2009 at 5:37 PM, Hazlewood, Victor Gene <vhazlewo at utk.edu> wrote:
> Got a gold issue… We have a Cray XT5 system which is starting to run about
> 7000 jobs a day. I post process the Torque accounting log each night and
> create gcharge records for posting to gold. With 7,000 gcharge records and
> each taking approximately 5-8 seconds, it is taking 15 or more hours to
> process the accounting records. (bad!). The gold database and gold daemon
> are on an infrastructure server and the gcharge is done on the Cray.
> What can be done to the post processing process, gold daemon or the gold
> database to speed up the processing of these gcharge records? I wish these
> gcharge commands would take less than a second. Maybe they have always
> processed in more than 1 second, maybe even more, but when we had 1000-2000
> or fewer jobs it wasn’t really noticed.
> These records in Gold are being used to copy information to another
> postgres database using gold commands to collect the info and then
> subsequently these records are posted to an off site database. With these
> three things going on (charge posting, processing charges into the other
> database, and sending the charges offsite) gold is doing quite a bit and
> seems to get bogged down with these requests. Also gold is getting queried
> at the job submission to make sure there is time available in the account.
> All this is putting pressure on gold and it seems to be slowing
> significantly (or noticeably I guess) under the pressure.
> Thanks for any help you can provide.
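As a quick sanity check on the figures quoted above, here is a minimal Python sketch of the arithmetic. It assumes the gcharge calls run strictly one after another with no overlap; the function name batch_hours is made up for illustration and is not part of Gold.

```python
def batch_hours(records: int, seconds_per_record: float) -> float:
    """Wall-clock hours to post `records` charges sequentially."""
    return records * seconds_per_record / 3600.0


if __name__ == "__main__":
    # 5 s and 8 s are the per-charge times quoted above; 1 s is the
    # target mentioned in the same message.
    for secs in (5.0, 8.0, 1.0):
        print(f"{secs:.0f} s/record: {batch_hours(7000, secs):.1f} h")
```

At 5-8 seconds per record, 7,000 records come to roughly 9.7 to 15.6 hours, which matches the "15 or more hours" reported at the slow end. At 1 second per record the same batch would take a little under 2 hours.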
I used to have this problem, but upgrading to Gold-220.127.116.11 and Pg-8.1.11
seemed to fix it. I haven't seen any gcharges take more than 1 second since.
You might also want to adjust the preemption model used in your kernel on
your gold server.
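To see which preemption model a kernel was built with, something like the following Python sketch can help. It assumes the distribution ships the kernel build config at /boot/config-<release>, which is common but not universal, and it only knows the three classic Kconfig choices. The helper name preemption_model is made up for illustration.

```python
import platform
from pathlib import Path
from typing import Optional

# The three classic Kconfig preemption choices and their menu labels.
MODELS = {
    "CONFIG_PREEMPT_NONE": "none (server)",
    "CONFIG_PREEMPT_VOLUNTARY": "voluntary (desktop)",
    "CONFIG_PREEMPT": "full (low-latency desktop)",
}


def preemption_model(config_text: str) -> Optional[str]:
    """Return the first preemption model enabled in a kernel config dump."""
    for line in config_text.splitlines():
        line = line.strip()
        # Disabled options appear as comments: "# CONFIG_X is not set".
        if line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        if value == "y" and key in MODELS:
            return MODELS[key]
    return None


if __name__ == "__main__":
    # Assumed location; some systems expose /proc/config.gz instead.
    path = Path("/boot") / f"config-{platform.release()}"
    if path.exists():
        print(preemption_model(path.read_text()))
    else:
        print(f"no kernel config found at {path}")
```

Newer kernels can also switch the model at boot or at run time, so on those the build config alone may not tell the whole story.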
> Victor Hazlewood, CISSP
> Senior HPC Systems Analyst
> National Institute for Computational Science
> University of Tennessee
> http://www.nics.tennessee.edu/ <http://www.nics.utk.edu/>
Kevin Van Workum, PhD
Sabalcore Computing Inc.
Run your code on 500 processors.
Sign up for a free trial account.
877-492-8027 ext. 11