[gold-users] Gold system getting bogged down

Kevin Van Workum vanw at sabalcore.com
Mon Dec 7 10:48:24 MST 2009


On Thu, Dec 3, 2009 at 5:37 PM, Hazlewood, Victor Gene <vhazlewo at utk.edu> wrote:

> Hi!
>
> Got a gold issue. We have a Cray XT5 system which is starting to run about
> 7000 jobs a day. I post-process the Torque accounting log each night and
> create gcharge records for posting to gold. With 7,000 gcharge records,
> each taking approximately 5-8 seconds, it is taking 15 or more hours to
> process the accounting records (bad!). The gold database and gold daemon
> are on an infrastructure server and the gcharge is done on the Cray
> directly.
>
> What can be done to the post-processing, the gold daemon or the gold
> database to speed up the processing of these gcharge records? I wish these
> gcharge commands would take less than a second. Maybe they have always
> taken more than 1 second, maybe even longer, but when we had 1000-2000
> or fewer jobs it wasn’t really noticed.
>
> These records in Gold are also copied to another postgres database, using
> gold commands to collect the info, and are subsequently posted to an
> off-site database. With these three things going on (charge posting,
> processing charges into the other database, and sending the charges
> off site) gold is doing quite a bit and seems to get bogged down with
> these requests. Gold is also queried at job submission to make sure there
> is time available in the account. All this is putting pressure on gold and
> it seems to be slowing significantly (or at least noticeably) under the
> pressure.
>
> Thanks for any help you can provide.
>

I used to have this problem, but upgrading to Gold-2.1.9.0 and Pg-8.1.11
seemed to fix it. I haven't seen any gcharges take more than 1 second since
the upgrade.
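
For scale, the numbers quoted above can be checked with a quick
back-of-the-envelope calculation. This is only an illustrative sketch: it
assumes the charges are posted strictly one after another, and the helper
name serial_hours is made up for this example, not part of Gold.

```python
SECONDS_PER_HOUR = 3600


def serial_hours(records, seconds_per_record):
    """Hours needed to post `records` charges one after another."""
    return records * seconds_per_record / SECONDS_PER_HOUR


# 7000 jobs a day, at the reported 5-8 s per gcharge and at a 1 s target.
records = 7000
for secs in (5, 8, 1):
    hours = serial_hours(records, secs)
    print(f"{records} records at {secs} s each: {hours:.1f} hours")
```

At 5-8 seconds per charge that works out to roughly 10 to 16 hours for 7000
records, consistent with the 15 or more hours reported; at 1 second per
charge the same batch would take about 2 hours.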

You might also want to adjust the preemption model used in your kernel on
your gold server.

-Kevin

>
> -Victor
>
> Victor Hazlewood, CISSP
> Senior HPC Systems Analyst
> National Institute for Computational Science
> University of Tennessee
> http://www.nics.tennessee.edu/ <http://www.nics.utk.edu/>
>


-- 
Kevin Van Workum, PhD
Sabalcore Computing Inc.
Run your code on 500 processors.
Sign up for a free trial account.
www.sabalcore.com
877-492-8027 ext. 11