[gold-users] Improved gold performance

Brock Palen brockp at umich.edu
Thu Nov 6 15:43:19 MST 2008


Those who don't want to read the why, add the following index:

create index  g_reservation_acct_where_idx ON  
g_reservation_allocation (g_account) WHERE g_deleted!='True';

We found gold to be a bottle neck as our cluster grew.  Also when we  
added preemption, many users started running large numbers of very  
small jobs,  marking off all of htese jobs in gold quickly became a  
problem causing moab to slow drastically.

So, I jumped back into the gold database.  Note we are still using a  
very old version of gold. I turned on:
log_min_duration_statement=50     # logs queries that take longer  
than 50 ms.

The queries that took the most time were of the form:
SELECT  
g_reservation_allocation.g_id,g_reservation_allocation.g_amount FROM  
g_reservation, g_reservation_allocation WHERE  
( g_reservation.g_id=g_reservation_allocation.g_reservation AND  
g_reservation.g_start_time<='1226009628' AND  
g_reservation.g_end_time>'1226009628' AND  
g_reservation_allocation.g_account='2' ) AND g_reservation.g_deleted! 
='True' AND g_reservation_allocation.g_deleted!='True';

This query took 1660 to 1900ms to run,
No regular index would help, turns out though that postgre lets  
indexes apply with a condition, the condition that mattered, was an  
index that had all the values of g_account _but knew_ that g_deleted! 
=True,

The following index turned the above query into a 30ms query,

create index  g_reservation_acct_where_idx ON  
g_reservation_allocation (g_account) WHERE g_deleted!='True';

EXPLAIN ANALYZE
Showed the query to now use the index. On top of that the index is  
very small, because the number of =! deleted is very small vs deleted  
which postgres used to look though,

select count(*) from g_reservation_allocation where g_deleted='True';
1,634,572

While !='True' (the data we want) is:

1,857

So the index is very small and has the data gold needs.

This index had the side effect of speeding up other queries,
The only things in my logs now that take longer than 50ms are commits  
(don't think I can do much without upgrading hardware for those), and  
some big stats queries we run on all the data.



Brock Palen
www.umich.edu/~brockp
Center for Advanced Computing
brockp at umich.edu
(734)936-1985





More information about the gold-users mailing list