[gold-users] moab-gold binding performance issue
scottmo at clusterresources.com
Thu Jun 4 17:19:42 MDT 2009
Hu, Zongjun wrote:
> Hi, Scott,
> We are running Moab 5.2 with Gold 2.1.6. I did notice the 'Replace=True' when Moab talked back to Gold. I will keep watching the Gold balance report for the next few weeks and see how it works. I probably saw the inconsistency because the old invalid reservations are still within their proposed wallclock duration and have not been inactivated yet.
> Thanks for your advice on the database issue. No, I have not run 'vacuumdb' in the months since Gold was installed. I just did it. I will double check whether it is working faster. Maybe I should benchmark Gold job transactions to find out the exact time spent on each request.
> I have submitted a request to clusterresources to find out whether the moab-gold binding settings can be reconfigured. If it is not possible from the Moab configuration, I will try your suggestion to use an epilog.
> Thanks so much for your help.
You are most welcome.
Let me know if you see evidence that Gold is creating multiple job
records for the same job. If so, please provide me with a goldd.log and
the jobid of the problem job. Also the output from glsjob so I can see
the duplicate job ids. It is possible I am wrong about how this all
works. If I get this from you I can look into it in more depth.
I can see how you might run into a problem if you have your system
flooded with 1 second jobs. This is just a bit outside the design specs
for Gold. If it becomes a problem, a post-processing solution using Gold
might be in order. Instead of charging instantaneously at end of job,
you could do all the debits in a batch like once a day at midnight or
something. The epilogue approach should work as well but it will cause
each job to effectively run a fraction of a second longer while it
charges. But we're probably straining at gnats here.
I would recommend the following:
Make sure your indexes are there:
pg_dump -s gold|grep INDEX
If not, create them.
Then check your timings (time gcharge, time greserve). If these are less
than half a second then you are probably OK. I would hope they would be
close to .2 seconds.
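A minimal sketch for collecting those timings, in Python rather than with the shell's time builtin. It runs any command several times and reports the mean and worst wallclock time. The helper name time_command and the run count are made up for illustration, and the gcharge or greserve arguments are not shown because they depend on your site, so pass in whatever command line you already use. Keep in mind that timing a real charge or reservation presumably records a real transaction, so a test job or test account is the safer target.

```python
import subprocess
import sys
import time


def time_command(argv, runs=5):
    """Run argv `runs` times; return (mean_seconds, worst_seconds)."""
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        # Output is discarded; only the wallclock time matters here.
        subprocess.run(argv, stdout=subprocess.DEVNULL,
                       stderr=subprocess.DEVNULL, check=False)
        samples.append(time.perf_counter() - start)
    return sum(samples) / len(samples), max(samples)


if __name__ == "__main__" and len(sys.argv) > 1:
    # Everything after the script name is treated as the command to time.
    mean, worst = time_command(sys.argv[1:])
    print(f"mean {mean:.3f}s  worst {worst:.3f}s")
```

Looking at the worst case as well as the mean helps separate a consistently slow database from occasional stalls.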
If they are low, and you just have a truly large number of short jobs,
you should probably experiment with the epilog. You may want to still
have the prolog to do the greserve as well. This will extend the job run
by a fraction of a second -- trivial in my mind. If you don't, then you
will have jobs starting that are out of funds. If you don't care about
this then you are set.
If things are really on the extreme side (many jobs finishing per
second), then I would look at post-processing the charges off-line. You
would still have to find a way to prevent negative balance jobs from
starting. This could be done by using the post-processing script to
update a file indicating which users or projects are disabled due to
lack of funds and checking this either in the prolog or a submit filter.
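The disabled-list idea could be sketched roughly as follows, in Python. Everything here is hypothetical: the function names, the one-project-per-line file layout, and the assumption that the post-processing script already has a mapping of project name to remaining balance after the batch debits. How those balances are pulled out of Gold is left out.

```python
import os
import tempfile


def write_disabled_list(balances, path):
    """Post-processing side: list every project whose balance is <= 0."""
    disabled = sorted(name for name, bal in balances.items() if bal <= 0)
    # Write to a temp file and rename it into place, so a prolog that runs
    # at the same moment never reads a half-written list.
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w") as handle:
        handle.write("".join(name + "\n" for name in disabled))
    os.chmod(tmp_path, 0o644)  # mkstemp defaults to owner-only access
    os.replace(tmp_path, path)
    return disabled


def is_disabled(project, path):
    """Prolog or submit-filter side: True if the project is listed."""
    try:
        with open(path) as handle:
            return project in {line.strip() for line in handle}
    except FileNotFoundError:
        # No list has been written yet: treat every project as enabled.
        return False
```

A prolog or submit filter would call is_disabled with the job's project and, when it returns True, exit with whatever status your resource manager treats as "do not start this job". Between batch runs a project can still overdraw, so the interval between runs bounds how far negative a balance can go.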
> Zongjun Hu
> University of Miami
> Center for Computational Science
> -----Original Message-----
> From: Scott Jackson [mailto:scottmo at clusterresources.com]
> Sent: Thursday, June 04, 2009 3:47 PM
> To: Hu, Zongjun
> Cc: 'gold-users at supercluster.org'
> Subject: Re: [gold-users] moab-gold binding performance issue
> Hu, Zongjun wrote:
>> We use Gold-2.1.6 as allocation manager for Moab-5.2.0. We are trying
>> to move this configuration to production. We got several issues and
>> need help.
>> 1. This binding is working as job reservation @ job start. Most of the
>> time, this mode works perfectly. If a job is accepted to start, a new
>> job will be created in gold and account balance will be deducted
>> according to requested cpu hours. If this job runs and finishes without
>> problem, this job will be charged finally according to real usage. The
>> previous charge in gold will be returned to account balance. However,
>> we got an interesting problem. We found some jobs are accepted to
>> start, and then moved to compute nodes. For some reason, these jobs do
>> not start successfully on compute nodes. They are then rejected and
>> moved back to blocked/waiting list. After a while, these jobs will be
>> evaluated again and repeat these steps. In this situation, multiple
>> jobs will be created in gold and account balance will be deducted
>> multiple times (to make things worse, users usually request much more
>> than they need). When these jobs are finally finished or canceled, gold
>> will only change the last job created to 'Charge' stage and return
>> only the last deduction to the account. All previously created
>> jobs/deductions will stay in gold and the reserved balance
>> won't be restored. After a while, gold will have a huge amount of
>> 'Reserved balance' even though all jobs are completed. Can you give us
>> instructions to fix this problem and release all of that unnecessary
>> reserved balance?
> I believe that Moab should be releasing these Gold reservations when it
> discovers that they have been rejected (failed to start successfully).
> That would be the correct fix. As it is, the reservations only remain
> active within Gold for the wallclock duration of the job, then they
> automatically become inactive and no longer affect the balance. Anytime
> you need to, you can remove a reservation within gold with the grmres
> command. Also, grmres -I can be used to get rid of all of the stale
> reservations that no longer are affecting the balance. At any given
> time, if you run glsres -A, the list of reservations returned should
> pertain only to currently running jobs. Can you tell me what version of
> Moab you are running? The reason I ask is because I am surprised that
> the multiple reservations are creating multiple jobs within Gold. I
> believe it was quite a long time ago that Moab should have started using
> a new Replace=True option in the reservations to avoid creating new job
> instances in Gold. Also, please tell me the Gold version. As far as the
> reservations not being removed, I would say this is a Moab bug and would
> recommend you submit a ticket to moab-support at clusterresources.com
> explaining the problem and providing what evidence you can collect of it
> (run support.diag.pl and send resulting tarball along with goldd.log,
> any pertinent torque logs, etc).
>> 2. Sometimes, we have a lot of small jobs (finishing within 1 minute).
>> For each job, Moab has to contact the gold server to reserve and then
>> charge the job when it finishes. Those small jobs make moab repeat these
>> steps frequently, and the moab server is very busy. This slows down the
>> response to user requests a lot and sometimes causes them to time out.
>> Is there a way to speed up gold job processing? If not, can we configure
>> the Moab-gold binding to charge the job only at job end time? That way
>> we can save at least half of the processing time. We did not find
>> guidance in the Moab or Gold documents for this configuration.
> I don't know if that is configurable within Moab. (I think it probably
> should be but I doubt that it is). I believe that CRI would accept this
> change request and provide a Moab parameter to avoid doing the
> reservation. You would have to submit a ticket to
> moab-support at clusterresources.com to see if Moab can accommodate this
> change request. It would be possible to take it out of the hands of
> Moab entirely if you wanted to by simply writing your own epilog to call
> gcharge (and removing the applicable AMCFG parameters from moab.cfg).
> How much time are your reservations and charges taking? If these are
> larger than a fraction of a second (.2 or .3 sec), then I would ask
> whether you are VACUUMing your database frequently and whether you have
> the indexes set up in your database. Please let me know about this.
> I hope this helps,
>> Zongjun Hu
>> University of Miami
>> Center for Computational Science