[gold-users] speeding up gold reservations

Scott Jackson scottmo at clusterresources.com
Thu Apr 16 13:39:05 MDT 2009


Stijn De Weirdt wrote:
>> Hi Stijn,
>>
>> Sorry to take so long to reply .... you know the story.
>>
>>     
> no problem. glad you never delete your spam ;)
>
>   
>>> i'm running a maui 3.2.6p21/torque 2.3.6 with GOLD as AM and we have
>>> some issues with users submitting large (as in 10+k) amounts of short
>>> (5-10 minutes) jobs to the queue and this has been choking up the system
>>> somewhat.
>>>
>>>   
>>>       
>> Wow. That's a lot :)
>>     
> the scary part is that users think it isn't ;)
>
>   
>>> one factor in this whole process is that gold slows things down a lot. i
>>> see gold reservation requests (Successfully reserved X credits for job
>>> Y) when each job enters the maui queue (if i phrase this correctly), and
>>> one when the job is done.
>>> the first request is the most limiting one, as all new jobs in the queue
>>> are processed on entering (although i have MAXIJOB set rather low, so
>>> almost all of these jobs enter as blocked jobs anyway). each request
>>> takes approx 600-700ms, and because the jobs finish more quickly than
>>> the time needed for maui to add newly submitted jobs (not all of them,
>>> but still a lot), cluster usage is spiky.
>>>
>>>   
>>>       
>> I would expect the gold calls to happen when the jobs are started, not 
>> when they are submitted. Some sites use a submit filter to check for a 
>> reasonable balance when the job is submitted to prevent it being held if 
>> it is later found to be out of credits, but this is entirely optional.
>>
>>     
> what i see from the logs (when using torque/maui) is that after you
> submit a job to torque, maui polls torque for jobs it does not know yet.
> when these jobs are seen by maui (and enter what i call the maui queue),
> they are checked with gold to see if enough credits exist (irrespective
> of whether the job is considered blocked or not).
> when maui starts the job, it checks gold again (the real reservation i
> assume) and then once more at the end of the job. 
>   

Stijn,

I have not worked much with Maui. What is the Gold call that Maui is 
making at this stage (when the job enters the "Maui queue")? Can you send 
me the maui.log and goldd.log? It is quite possible that Moab handles 
this differently.

>   
>>> it is annoying that maui doesn't make these requests in parallel, or
>>> doesn't skip them altogether, since these will be blocked jobs anyway.
>>> if the gold requests were made when unblocking the jobs, at least the
>>> usage would be more optimal.
>>>
>>>   
>>>       
>> I am surprised by this behavior. I assume you mean that the 
>> jobs run shortly after being unblocked and that you would prefer Maui 
>> reserve the jobs in Gold at that time instead of at submit time. I too 
>> believe that is what it should be doing.
>>     
> i could check it in the code, but guess that maui first checks gold and
> only then determines if the job should be blocked or not.
> (the most annoying situation is when you restart maui, it rediscovers
> all jobs in torque (easily taking up 1-2 hours of maui only interacting
> with gold, so no scheduling activity) and people submitting lots of
> small jobs that could already start)
>
>   
Wow -- 1-2 hours?
> do you know if moab does this too? (ie verifying gold for jobs that
> enter but will be blocked)
>
>   
I do not believe Moab, by default, interacts with Gold at all when it 
first sees a new job. Moab does make a few initial calls to synchronize 
reservations and default accounts, etc with Gold when you first bring it 
up, but I believe these are aggregate calls and it does not take much 
time at all. I am quite certain there is not an independent Gold call 
made for each job.

OK, I just looked in the goldd.log. When Moab is restarted, it first 
checks to see if Gold is alive with a System Query (asking for the 
version number).
Then it asks (in a single query) for an update of the allocation 
balances (so it can properly reflect project balances if you want to 
track it through Moab).
Then it asks for a list of default projects (in a single query) so it 
can use this information properly when charging. That's it. Three calls 
-- no matter how many jobs are there.

Of course, for each job that is started, there will be a reservation 
call, and for each job that finishes there will be a charge call.
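To put rough numbers on the difference at restart, here is a back-of-the-envelope sketch in Python. It assumes a backlog of 10,000 queued jobs (the "10+k" mentioned earlier) and 650 ms per Gold request (the midpoint of the 600-700 ms quoted); the function name and both figures are illustrative, not measurements.

```python
def startup_gold_seconds(queued_jobs, ms_per_call, calls_per_job, fixed_calls):
    """Seconds of serialized Gold traffic before scheduling can begin."""
    total_calls = fixed_calls + queued_jobs * calls_per_job
    return total_calls * ms_per_call / 1000

# Behaviour described for Maui: one credit check per rediscovered job.
per_job_check = startup_gold_seconds(10_000, 650, calls_per_job=1, fixed_calls=0)

# Behaviour seen for Moab in goldd.log: three aggregate calls, none per job.
aggregate_only = startup_gold_seconds(10_000, 650, calls_per_job=0, fixed_calls=3)

print(f"per-job checks:  {per_job_check / 3600:.1f} hours")
print(f"aggregate calls: {aggregate_only:.2f} seconds")
```

Under these assumptions the per-job pattern costs about 1.8 hours, which is in line with the 1-2 hours reported, while the three aggregate calls cost about 1.95 seconds.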

>>  Perhaps you can present the 
>> evidence of this behavior in the maui logs. Or perhaps I am 
>> misunderstanding your statement. 
>>     
> should be very easy to do. i'll collect the necessary logs when i have
> some more time.
>
>   
>> It is true that neither Maui nor Moab 
>> currently batch the gold requests (probably primarily due to the fact 
>> that there is no current support in Gold for batched requests). [That 
>> might not be entirely true -- Gold may support it if you were to use the 
>> perl API, I'd have to look.]
>>     
> batch requests should work for jobs that were submitted by the same user.
> processing all jobs in batches might result in a speedup, but grouping
> requests from the same user (and maybe even same walltime/number of nodes
> or other parameters) should do it (it's not that the connection itself to
> gold slows things down)
>
>   
Another option you might want to consider is that it is not necessary to 
use the Maui/Moab integration with Gold. You can turn all of this off 
and just do it through your resource manager by means of prolog and 
epilog scripts (that call greserve and gcharge respectively). This 
gives you tight control over the exact interactions that occur between 
your SRM system and Gold.
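As a minimal sketch of that approach, the Python below builds the greserve and gcharge command lines that a prolog and epilog might run. The option letters (-J job id, -u user, -p project, -m machine, -P processors, -t duration) are written from memory of the Gold user guide and should be treated as assumptions to verify against your installed version; the job id, user, project and machine names are made up, and the helper names are hypothetical.

```python
import subprocess

def gold_command(tool, job_id, user, project, machine, procs, seconds):
    """Build the argument list for greserve or gcharge (flags are assumptions)."""
    return [tool, "-J", job_id, "-u", user, "-p", project,
            "-m", machine, "-P", str(procs), "-t", str(seconds)]

def run_gold(argv, dry_run=True):
    """Print the command in dry-run mode, otherwise execute it."""
    if dry_run:
        print(" ".join(argv))
        return 0
    return subprocess.run(argv, check=False).returncode

# Prolog: reserve credits for the requested walltime (3600 s here).
reserve = gold_command("greserve", "PBS.1234.0", "amy", "chemistry", "colony", 2, 3600)

# Epilog: charge for the walltime actually used (1234 s here).
charge = gold_command("gcharge", "PBS.1234.0", "amy", "chemistry", "colony", 2, 1234)

run_gold(reserve)
run_gold(charge)
```

A real prolog would take these values from the arguments the resource manager passes in, and would reject the job if the reservation returns a non-zero exit status.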

>>> but does anyone have any tips to speedup individual gold queries?
>>>
>>>   
>>>       
>> Yes. Are you already using the new indexes? We've recently introduced 
>> indexes into the Gold tables which speed things up by roughly 10x. 
>> Also, if your database has been in use for a while (weeks or months), you 
>> will need to VACUUM it periodically to keep the queries quick (this also 
>> can make a very large difference).
>>     
> we are using 2.1.7.1, i assume that these indexes are in there.
>
>   
Yes, these were added in 2.1.5.0


>>> i have a tip myself: there is a certain SQL query (see bottom of mail)
>>> that is executed rather slowly with the default schema (it's not cached
>>> by the DB, unlike almost all other SELECT SQL queries from gold).
>>> (it is actually the slowest of them all, taking approx 400-500ms of the
>>> total 600-700ms).
>>> we first had MySQL as DB, but we switched to postgres 8.3.6 (this
>>> gave a 10-15% speedup), and i found that adding another 2 partial indexes
>>> improved this query to approx 150-200ms.
>>>
>>> CREATE INDEX g_reservation_not_deleted_start_idx ON g_reservation
>>> (g_start_time) WHERE g_deleted!='True';
>>> CREATE INDEX g_resallo_not_deleted_id_idx ON g_reservation_allocation
>>> (g_id) WHERE g_deleted!='True';
>>>
>>> ANALYZE g_reservation;
>>> ANALYZE g_reservation_allocation;
>>>
>>>   
>>>       
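For anyone who wants to experiment with the partial-index idea without touching a live Gold database, here is a small self-contained sketch. It uses SQLite (via Python's sqlite3 module) purely as a stand-in for PostgreSQL, with a drastically simplified, hypothetical version of the two tables: only the columns named in the query are present, and the sample rows are invented. Query plans and timings will differ from PostgreSQL, so this only shows the shape of the indexes and the query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE g_reservation (
    g_id INTEGER, g_start_time INTEGER, g_end_time INTEGER, g_deleted TEXT);
CREATE TABLE g_reservation_allocation (
    g_reservation INTEGER, g_id INTEGER, g_amount INTEGER, g_deleted TEXT);

-- Partial indexes: only rows that are not marked deleted are indexed.
CREATE INDEX g_reservation_not_deleted_start_idx
    ON g_reservation (g_start_time) WHERE g_deleted != 'True';
CREATE INDEX g_resallo_not_deleted_id_idx
    ON g_reservation_allocation (g_id) WHERE g_deleted != 'True';

INSERT INTO g_reservation VALUES (1, 100, 200, 'False'), (2, 100, 200, 'True');
INSERT INTO g_reservation_allocation VALUES
    (1, 35, 500, 'False'), (2, 20, 300, 'False');
""")

query = """
SELECT a.g_id, a.g_amount
FROM g_reservation r, g_reservation_allocation a
WHERE r.g_id = a.g_reservation
  AND r.g_start_time <= 150 AND r.g_end_time > 150
  AND (a.g_id = 35 OR a.g_id = 20)
  AND r.g_deleted != 'True' AND a.g_deleted != 'True'
"""
rows = conn.execute(query).fetchall()
print(rows)

# Show how SQLite chooses to execute the query (output varies by version).
for step in conn.execute("EXPLAIN QUERY PLAN " + query):
    print(step)
```

Only the allocation attached to the live reservation comes back; the one attached to the deleted reservation is filtered out. On PostgreSQL the equivalent check would be EXPLAIN ANALYZE on the real query.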
>> Let me know if you can pinpoint anything else that can be improved and 
>> we can either address it or put it in as a feature request.
>>     
> the current situation is as follows: we can process a typical request in
> 300-500ms. this gives 2-3 requests per second (but with the 3 steps
> described above means 1-1.5 second gold process time per completed job,
> ie max 3600 jobs/hour, which is low).
>
>   
Yep, that sounds about accurate. I would have thought the indexes and 
vacuuming would bring that down a tad, but I would not think you would 
get much better than 200ms per call with Gold (partly to do with startup 
times in the perl scripts).
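The arithmetic behind those figures can be checked with a few lines of Python. This is only a sketch: it assumes the Gold calls are strictly serialized and that every job costs three calls (the check on entering the queue, the reservation at start, and the charge at the end), as described for the Maui setup above. The function name is illustrative.

```python
def max_jobs_per_hour(ms_per_call, calls_per_job=3):
    """Upper bound on completed jobs per hour with serialized Gold calls."""
    ms_per_hour = 3_600_000
    return ms_per_hour // (ms_per_call * calls_per_job)

for ms in (500, 300, 200):
    print(f"{ms} ms per call -> {max_jobs_per_hour(ms)} jobs/hour")
```

At 300-500 ms per call this gives 2400-4000 jobs per hour, bracketing the 3600 quoted, and the 200 ms floor would raise the ceiling to about 6000. Dropping the per-job check at queue entry (calls_per_job=2) would help more than shaving the call time.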
> when i start gold with loglevel TRACE and i check the time spent, half
> of it is in DB access with the longest query taking at max 100ms (all
> others seem cached). 
> this also means that the other half is spent in running "perl". 
>
> speeding up the DB (more indexes, further tuning) seems unlikely to
> help. for now the only thing that can help i think is faster hardware. 
>
>   
Yeah ... I agree. It would be possible to do some form of offline 
processing -- allow all jobs with allocations to start, then perform the 
batch debits at the end of the day in a post-processing step which would 
disable any accounts that have gone negative. But, of course, you lose 
a lot of the touted benefits of Gold as an online bank.
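A rough sketch of what such a post-processing step could look like, in plain Python with in-memory data. Nothing here talks to Gold: the account names, amounts and function names are hypothetical, and a real version would read the day's usage from the resource manager's accounting records and apply the debits through Gold's own tools.

```python
def settle_day(balances, charges):
    """Apply one day's job charges in a single pass.

    balances: dict mapping account -> credits at start of day
    charges:  iterable of (account, amount) for jobs that ran
    Returns (new_balances, disabled_accounts).
    """
    new_balances = dict(balances)
    for account, amount in charges:
        new_balances[account] = new_balances.get(account, 0) - amount
    # Accounts that went negative are disabled until topped up.
    disabled = sorted(a for a, b in new_balances.items() if b < 0)
    return new_balances, disabled

def may_start(account, balances, disabled):
    """During the day, only check for a positive balance and not disabled."""
    return account not in disabled and balances.get(account, 0) > 0

balances = {"chem": 1000, "bio": 200}
charges = [("chem", 400), ("bio", 150), ("bio", 120)]
balances, disabled = settle_day(balances, charges)
print(balances, disabled)
```

In this example "bio" overdraws by 70 credits during the day and is only stopped at settlement time, which is exactly the trade-off: scheduling is no longer slowed by per-job calls, but overspending is detected up to a day late.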

Admittedly, Gold would run a lot faster if it had been written in C. 
However, it just plain would not have reached nearly the level of 
capability of its current form without the fast and easy development, 
parsing, data structures, database-independent modules, etc. 
found in Perl. I did originally write it in Java but had to rewrite it 
entirely in Perl because of serious limitations in the performance and 
memory handling in Java.

> we'll see how it goes.
>
>   

Best of luck,

Scott
> stijn
>
>   
>> Thanks,
>>
>> Scott
>>     
>>> hope this helps,
>>>
>>> stijn
>>>
>>> SQL Query:
>>> SELECT g_reservation_allocation.g_id, g_reservation_allocation.g_amount
>>> FROM g_reservation, g_reservation_allocation
>>> WHERE ( g_reservation.g_id=g_reservation_allocation.g_reservation
>>>   AND g_reservation.g_start_time<='1236070726'
>>>   AND g_reservation.g_end_time>'1236070726'
>>>   AND ( g_reservation_allocation.g_id='35' OR
>>>         g_reservation_allocation.g_id='20' ) )
>>>   AND g_reservation.g_deleted!='True'
>>>   AND g_reservation_allocation.g_deleted!='True'
>>>
>>>
>>>
>>>
>>>
>>> _______________________________________________
>>> gold-users mailing list
>>> gold-users at supercluster.org
>>> http://www.supercluster.org/mailman/listinfo/gold-users
>>>   
>>>       


