[gold-users] Can Gold be told not to make a reservation for a job without sufficient quota ?

Scott Jackson scottmo at adaptivecomputing.com
Wed Oct 27 11:33:05 MDT 2010


Chris,

I looked into the problem I had run into earlier and now recall what it was. I am assuming you are using MySQL, because that is where I hit it. The default MySQL storage engine (MyISAM) only supports non-transactional tables. Switching my storage engine to InnoDB gave me transaction-safe tables and fixed the problem. Please see the following links:

http://dev.mysql.com/doc/refman/5.0/en/storage-engines.html
http://dev.mysql.com/doc/refman/5.0/en/innodb-configuration.html
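
To illustrate what "transaction-safe" means here, below is a minimal Python sketch. It is not Gold's code or schema: the account and reservation tables, their columns, and the reserve() function are made up for illustration, and SQLite (from the Python standard library) stands in for InnoDB. The figures come from Chris's invented example quoted below (10,000 hours left, then requests for 8,000 and 3,000). The point is only that when the reservation record and the balance update sit in one transaction, a request that fails part-way is rolled back as a whole. On a non-transactional engine such as MyISAM there is no rollback, so the first write would stay behind.

```python
import sqlite3


def reserve(conn, account_id, hours):
    """Reserve `hours` against a (hypothetical) account, all or nothing."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute(
                "INSERT INTO reservation (account_id, amount) VALUES (?, ?)",
                (account_id, hours),
            )
            conn.execute(
                "UPDATE account SET available = available - ? WHERE id = ?",
                (hours, account_id),
            )
            (available,) = conn.execute(
                "SELECT available FROM account WHERE id = ?", (account_id,)
            ).fetchone()
            if available < 0:
                # Undo both the INSERT and the UPDATE above.
                raise ValueError("insufficient funds")
        return True
    except ValueError:
        return False


def demo():
    """Run the 10,000 / 8,000 / 3,000 hour example on an in-memory database."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE account (id INTEGER PRIMARY KEY, available REAL)")
    conn.execute("CREATE TABLE reservation (account_id INTEGER, amount REAL)")
    conn.execute("INSERT INTO account VALUES (1, 10000)")
    conn.commit()

    first = reserve(conn, 1, 8000)
    second = reserve(conn, 1, 3000)
    rows = conn.execute("SELECT amount FROM reservation ORDER BY rowid").fetchall()
    (available,) = conn.execute(
        "SELECT available FROM account WHERE id = 1"
    ).fetchone()
    return first, second, rows, available


if __name__ == "__main__":
    print(demo())
```

With these figures the 8,000-hour request succeeds and the 3,000-hour request is rejected without leaving a reservation row behind.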

To correct this, I now start my MySQL server with:
/usr/bin/mysqld_safe --default-storage-engine=InnoDB

That way, all newly created tables use the transaction-safe InnoDB storage engine. I don't know whether the setting takes effect at run time or only when a table is created. Since you already have an existing system, I have not researched what you would need to do, but you may be able to use ALTER TABLE to change the storage engine on each of your tables.
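
If ALTER TABLE does turn out to be the way to go, the statement form in MySQL is "ALTER TABLE <name> ENGINE=InnoDB", one statement per table. A small Python sketch that builds those statements from a list of table names follows. The table names in the usage line are placeholders, not necessarily Gold's real table names; SHOW TABLE STATUS lists the tables in a database along with each one's current engine. Take a backup before running anything like this.

```python
def alter_statements(tables, engine="InnoDB"):
    """Build one 'ALTER TABLE ... ENGINE=...' statement per table name."""
    statements = []
    for name in tables:
        # Quote the identifier MySQL-style, doubling any embedded backtick.
        quoted = "`" + name.replace("`", "``") + "`"
        statements.append(f"ALTER TABLE {quoted} ENGINE={engine};")
    return statements


if __name__ == "__main__":
    # Placeholder names for illustration only.
    for statement in alter_statements(["example_account", "example_reservation"]):
        print(statement)
```

The output can be reviewed by hand and then fed to the mysql client.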

At worst, you could restart MySQL with the new flag and recreate your tables from a dump.
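
One thing to watch with the dump route: as far as I know, mysqldump writes the engine into each CREATE TABLE statement (for example ") ENGINE=MyISAM DEFAULT CHARSET=latin1;"), and an explicit ENGINE in the dump takes precedence over the server default on reload. A rough Python sketch that rewrites only those closing lines follows. It assumes the usual mysqldump layout, where the table options sit on a line starting with ")"; the function name convert_dump is just illustrative.

```python
import re

# Matches the table option as mysqldump normally writes it.
_ENGINE = re.compile(r"\bENGINE\s*=\s*MyISAM\b", re.IGNORECASE)


def convert_dump(sql_text):
    """Return the dump text with ENGINE=MyISAM changed to ENGINE=InnoDB.

    Only lines beginning with ')' are touched, so INSERT data that happens
    to contain the same string is normally left alone.
    """
    out = []
    for line in sql_text.splitlines(keepends=True):
        if line.lstrip().startswith(")"):
            line = _ENGINE.sub("ENGINE=InnoDB", line)
        out.append(line)
    return "".join(out)
```

It is worth diffing the converted dump against the original before loading it, in case a data line happens to start with ")".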

I hope this is getting close to helping.

Scott

----- Original Message -----
> From: "Christopher Samuel" <samuel at unimelb.edu.au>
> To: gold-users at supercluster.org
> Sent: Tuesday, October 26, 2010 6:28:35 PM
> Subject: Re: [gold-users] Can Gold be told not to make a reservation for a job without sufficient quota ?
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
> 
> On 27/10/10 09:58, Scott Jackson wrote:
> 
> > Huh?
> 
> That was my response when I was investigating why we were
> seeing jobs banking up for people for no apparent reason.
> 
> > Gold should either succeed or fail for the entire
> > reservation request.
> 
> That's what we were hoping for. :-)
> 
> > It should not result in a "partial" reservation.
> 
> Just to clarify here's an (invented) example of what we
> believe we are seeing.
> 
> A project has 10,000 hours left and submits a job that is
> using 8,000 hours. They then submit a job that is going to
> use 3,000 hours. That job gets a reservation of 2,000 hours
> and defers in Moab.
> 
> > Please pass on the corroborating evidence of the
> > problem and I'll see if I can comment further.
> 
> A cursory glance shows I've got 1 user with 55 reservations
> for jobs that are currently blocked by Moab. All jobs are
> 48 CPU hours (1 core for 2 days) and Gold has given one job
> 32 hours and all the other reservations are for 0 hours.
> 
> Here's a quick list of those reservations.
> 
> [root@bruce-m ~]# showq -b | fgrep aooi | awk '{print $1}' | xargs -n1 glsres --quiet -h -n
> 366619 367727 32.00 2010-10-27 10:03:17 2010-10-29 10:13:17 368075 aooi VR0018 bruce-m 489
> 366620 367728 0.00 2010-10-27 10:03:18 2010-10-29 10:13:18 368076 aooi VR0018 bruce-m
> 366621 367729 0.00 2010-10-27 10:03:18 2010-10-29 10:13:18 368077 aooi VR0018 bruce-m
> 366622 367730 0.00 2010-10-27 10:03:49 2010-10-29 10:13:49 368078 aooi VR0018 bruce-m
> 366030 367731 0.00 2010-10-26 12:36:35 2010-10-28 12:46:35 367479 aooi VR0018 bruce-m
> 366031 367732 0.00 2010-10-26 12:36:36 2010-10-28 12:46:36 367480 aooi VR0018 bruce-m
> 366032 367733 0.00 2010-10-26 12:36:36 2010-10-28 12:46:36 367481 aooi VR0018 bruce-m
> 366033 367734 0.00 2010-10-26 12:36:36 2010-10-28 12:46:36 367482 aooi VR0018 bruce-m
> 366034 367735 0.00 2010-10-26 12:36:36 2010-10-28 12:46:36 367483 aooi VR0018 bruce-m
> 366035 367736 0.00 2010-10-26 12:37:07 2010-10-28 12:47:07 367484 aooi VR0018 bruce-m
> 366036 367737 0.00 2010-10-26 12:37:08 2010-10-28 12:47:08 367485 aooi VR0018 bruce-m
> 366037 367738 0.00 2010-10-26 12:37:08 2010-10-28 12:47:08 367486 aooi VR0018 bruce-m
> 366038 367739 0.00 2010-10-26 12:37:08 2010-10-28 12:47:08 367487 aooi VR0018 bruce-m
> 366039 367740 0.00 2010-10-26 12:37:08 2010-10-28 12:47:08 367488 aooi VR0018 bruce-m
> 366041 367741 0.00 2010-10-26 12:37:39 2010-10-28 12:47:39 367490 aooi VR0018 bruce-m
> 366042 367742 0.00 2010-10-26 12:37:40 2010-10-28 12:47:40 367491 aooi VR0018 bruce-m
> 366043 367743 0.00 2010-10-26 12:37:40 2010-10-28 12:47:40 367492 aooi VR0018 bruce-m
> 366044 367744 0.00 2010-10-26 12:37:40 2010-10-28 12:47:40 367493 aooi VR0018 bruce-m
> 366045 367745 0.00 2010-10-26 12:37:40 2010-10-28 12:47:40 367494 aooi VR0018 bruce-m
> 366046 367746 0.00 2010-10-26 12:38:11 2010-10-28 12:48:11 367495 aooi VR0018 bruce-m
> 366047 367747 0.00 2010-10-26 12:38:12 2010-10-28 12:48:12 367496 aooi VR0018 bruce-m
> 366048 367748 0.00 2010-10-26 12:38:12 2010-10-28 12:48:12 367497 aooi VR0018 bruce-m
> 366049 367749 0.00 2010-10-26 12:38:12 2010-10-28 12:48:12 367498 aooi VR0018 bruce-m
> 366050 367750 0.00 2010-10-26 12:38:12 2010-10-28 12:48:12 367499 aooi VR0018 bruce-m
> 366051 367751 0.00 2010-10-26 12:38:43 2010-10-28 12:48:43 367500 aooi VR0018 bruce-m
> 366052 367752 0.00 2010-10-26 12:38:44 2010-10-28 12:48:44 367501 aooi VR0018 bruce-m
> 366053 367753 0.00 2010-10-26 12:38:44 2010-10-28 12:48:44 367502 aooi VR0018 bruce-m
> 366054 367754 0.00 2010-10-26 12:38:44 2010-10-28 12:48:44 367503 aooi VR0018 bruce-m
> 366055 367755 0.00 2010-10-26 12:38:44 2010-10-28 12:48:44 367504 aooi VR0018 bruce-m
> 366056 367756 0.00 2010-10-26 12:39:15 2010-10-28 12:49:15 367505 aooi VR0018 bruce-m
> 366057 367757 0.00 2010-10-26 12:39:16 2010-10-28 12:49:16 367506 aooi VR0018 bruce-m
> 366058 367758 0.00 2010-10-26 12:39:16 2010-10-28 12:49:16 367507 aooi VR0018 bruce-m
> 366059 367759 0.00 2010-10-26 12:39:16 2010-10-28 12:49:16 367508 aooi VR0018 bruce-m
> 366060 367760 0.00 2010-10-26 12:39:16 2010-10-28 12:49:16 367509 aooi VR0018 bruce-m
> 366061 367761 0.00 2010-10-26 12:39:47 2010-10-28 12:49:47 367510 aooi VR0018 bruce-m
> 366062 367762 0.00 2010-10-26 12:39:48 2010-10-28 12:49:48 367511 aooi VR0018 bruce-m
> 366063 367763 0.00 2010-10-26 12:39:48 2010-10-28 12:49:48 367512 aooi VR0018 bruce-m
> 366064 367764 0.00 2010-10-26 12:39:48 2010-10-28 12:49:48 367513 aooi VR0018 bruce-m
> 366065 367765 0.00 2010-10-26 12:39:48 2010-10-28 12:49:48 367514 aooi VR0018 bruce-m
> 366066 367766 0.00 2010-10-26 12:40:19 2010-10-28 12:50:19 367515 aooi VR0018 bruce-m
> 366067 367767 0.00 2010-10-26 12:40:20 2010-10-28 12:50:20 367516 aooi VR0018 bruce-m
> 366068 367768 0.00 2010-10-26 12:40:20 2010-10-28 12:50:20 367517 aooi VR0018 bruce-m
> 366069 367769 0.00 2010-10-26 12:40:20 2010-10-28 12:50:20 367518 aooi VR0018 bruce-m
> 366070 367770 0.00 2010-10-26 12:40:20 2010-10-28 12:50:20 367519 aooi VR0018 bruce-m
> 366071 367771 0.00 2010-10-26 12:40:51 2010-10-28 12:50:51 367520 aooi VR0018 bruce-m
> 366072 367772 0.00 2010-10-26 12:40:52 2010-10-28 12:50:52 367521 aooi VR0018 bruce-m
> 366073 367773 0.00 2010-10-26 12:40:52 2010-10-28 12:50:52 367522 aooi VR0018 bruce-m
> 366074 367774 0.00 2010-10-26 12:40:52 2010-10-28 12:50:52 367523 aooi VR0018 bruce-m
> 366075 367775 0.00 2010-10-26 12:40:52 2010-10-28 12:50:52 367524 aooi VR0018 bruce-m
> 366076 367776 0.00 2010-10-26 12:41:23 2010-10-28 12:51:23 367525 aooi VR0018 bruce-m
> 366077 367777 0.00 2010-10-26 12:41:24 2010-10-28 12:51:24 367526 aooi VR0018 bruce-m
> 366078 367778 0.00 2010-10-26 12:41:24 2010-10-28 12:51:24 367527 aooi VR0018 bruce-m
> 366079 367779 0.00 2010-10-26 12:41:24 2010-10-28 12:51:24 367528 aooi VR0018 bruce-m
> 366080 367780 0.00 2010-10-26 12:41:24 2010-10-28 12:51:24 367529 aooi VR0018 bruce-m
> 366081 367781 0.00 2010-10-26 12:41:55 2010-10-28 12:51:55 367530 aooi VR0018 bruce-m
> 
> This is what gbalance says for this project:
> 
> [root@bruce-m ~]# gbalance -h -p VR0018 -m bruce-m
> Id  Name              Amount  Reserved Balance CreditLimit Available
> --- ----------------- ------- -------- ------- ----------- ---------
> 325 VR0018               0.00     0.00    0.00        0.00      0.00
> 350 VR0018               0.00     0.00    0.00        0.00      0.00
> 489 VR0018 on bruce-m 2848.80  2848.80    0.00        0.00      0.00
> 
> Is that useful ?
> 
> cheers,
> Chris
> - --
> Christopher Samuel - Senior Systems Administrator
> VLSCI - Victorian Life Sciences Computational Initiative
> Email: samuel at unimelb.edu.au Phone: +61 (0)3 903 55545
> http://www.vlsci.unimelb.edu.au/
> 
> -----BEGIN PGP SIGNATURE-----
> Version: GnuPG v1.4.10 (GNU/Linux)
> Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/
> 
> iEYEARECAAYFAkzHcjMACgkQO2KABBYQAh/TUgCfeBNVpwsSQYx8UpNCXwe8r3Mt
> EYcAmgKYxxnq5Au0j/oeF8F5AXL9ie/m
> =+9aR
> -----END PGP SIGNATURE-----
> _______________________________________________
> gold-users mailing list
> gold-users at supercluster.org
> http://www.supercluster.org/mailman/listinfo/gold-users

