[gold-users] Can Gold be told not to make a reservation for a job without sufficient quota ?

Christopher Samuel samuel at unimelb.edu.au
Tue Oct 26 18:28:35 MDT 2010


-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On 27/10/10 09:58, Scott Jackson wrote:

> Huh?

That was my response when I was investigating why we were
seeing jobs banking up for people for no apparent reason.

> Gold should either succeed or fail for the entire
> reservation request.

That's what we were hoping for. :-)

> It should not result in a "partial" reservation.

Just to clarify here's an (invented) example of what we
believe we are seeing.

A project has 10,000 hours left and submits a job that
reserves 8,000 hours. They then submit a second job that
needs 3,000 hours.  That second job gets a reservation for
only the remaining 2,000 hours and is deferred in Moab.
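
To make the invented example concrete, here is a minimal sketch
of the two behaviours being contrasted. It is a toy model only,
not Gold's code: both function names are hypothetical and the
balance is tracked as a plain number.

```python
def reserve_partial(available, requested):
    """Toy model: grant whatever is left, even if it is less than requested."""
    granted = min(available, requested)
    return granted, available - granted


def reserve_all_or_nothing(available, requested):
    """Toy model: grant the full request, or nothing if it does not fit."""
    if requested > available:
        return 0.0, available
    return requested, available - requested


# The invented example: 10,000 hours left, then jobs of 8,000 and 3,000 hours.
balance = 10_000.0
first, balance = reserve_all_or_nothing(balance, 8_000.0)

# What we believe we are seeing for the second job (a partial reservation).
observed, _ = reserve_partial(balance, 3_000.0)

# What we were hoping for (the request fails as a whole).
hoped_for, _ = reserve_all_or_nothing(balance, 3_000.0)

print(first, observed, hoped_for)
```

In the partial case the second job holds 2,000 hours it can never
use, which is the situation described above.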

> Please pass on the corroborating evidence of the
> problem and I'll see if I can comment further.

A cursory glance shows 1 user with 55 reservations for jobs
that are currently blocked by Moab.  All jobs need 48 CPU
hours (1 core for 2 days), but Gold has given one job 32
hours and all the other reservations are for 0 hours.

Here's a quick list of those reservations.

[root@bruce-m ~]# showq -b | fgrep aooi | awk '{print $1}' | xargs -n1 glsres --quiet -h -n
366619 367727  32.00 2010-10-27 10:03:17 2010-10-29 10:13:17 368075 aooi VR0018  bruce-m 489
366620 367728   0.00 2010-10-27 10:03:18 2010-10-29 10:13:18 368076 aooi VR0018  bruce-m
366621 367729   0.00 2010-10-27 10:03:18 2010-10-29 10:13:18 368077 aooi VR0018  bruce-m
366622 367730   0.00 2010-10-27 10:03:49 2010-10-29 10:13:49 368078 aooi VR0018  bruce-m
366030 367731   0.00 2010-10-26 12:36:35 2010-10-28 12:46:35 367479 aooi VR0018  bruce-m
366031 367732   0.00 2010-10-26 12:36:36 2010-10-28 12:46:36 367480 aooi VR0018  bruce-m
366032 367733   0.00 2010-10-26 12:36:36 2010-10-28 12:46:36 367481 aooi VR0018  bruce-m
366033 367734   0.00 2010-10-26 12:36:36 2010-10-28 12:46:36 367482 aooi VR0018  bruce-m
366034 367735   0.00 2010-10-26 12:36:36 2010-10-28 12:46:36 367483 aooi VR0018  bruce-m
366035 367736   0.00 2010-10-26 12:37:07 2010-10-28 12:47:07 367484 aooi VR0018  bruce-m
366036 367737   0.00 2010-10-26 12:37:08 2010-10-28 12:47:08 367485 aooi VR0018  bruce-m
366037 367738   0.00 2010-10-26 12:37:08 2010-10-28 12:47:08 367486 aooi VR0018  bruce-m
366038 367739   0.00 2010-10-26 12:37:08 2010-10-28 12:47:08 367487 aooi VR0018  bruce-m
366039 367740   0.00 2010-10-26 12:37:08 2010-10-28 12:47:08 367488 aooi VR0018  bruce-m
366041 367741   0.00 2010-10-26 12:37:39 2010-10-28 12:47:39 367490 aooi VR0018  bruce-m
366042 367742   0.00 2010-10-26 12:37:40 2010-10-28 12:47:40 367491 aooi VR0018  bruce-m
366043 367743   0.00 2010-10-26 12:37:40 2010-10-28 12:47:40 367492 aooi VR0018  bruce-m
366044 367744   0.00 2010-10-26 12:37:40 2010-10-28 12:47:40 367493 aooi VR0018  bruce-m
366045 367745   0.00 2010-10-26 12:37:40 2010-10-28 12:47:40 367494 aooi VR0018  bruce-m
366046 367746   0.00 2010-10-26 12:38:11 2010-10-28 12:48:11 367495 aooi VR0018  bruce-m
366047 367747   0.00 2010-10-26 12:38:12 2010-10-28 12:48:12 367496 aooi VR0018  bruce-m
366048 367748   0.00 2010-10-26 12:38:12 2010-10-28 12:48:12 367497 aooi VR0018  bruce-m
366049 367749   0.00 2010-10-26 12:38:12 2010-10-28 12:48:12 367498 aooi VR0018  bruce-m
366050 367750   0.00 2010-10-26 12:38:12 2010-10-28 12:48:12 367499 aooi VR0018  bruce-m
366051 367751   0.00 2010-10-26 12:38:43 2010-10-28 12:48:43 367500 aooi VR0018  bruce-m
366052 367752   0.00 2010-10-26 12:38:44 2010-10-28 12:48:44 367501 aooi VR0018  bruce-m
366053 367753   0.00 2010-10-26 12:38:44 2010-10-28 12:48:44 367502 aooi VR0018  bruce-m
366054 367754   0.00 2010-10-26 12:38:44 2010-10-28 12:48:44 367503 aooi VR0018  bruce-m
366055 367755   0.00 2010-10-26 12:38:44 2010-10-28 12:48:44 367504 aooi VR0018  bruce-m
366056 367756   0.00 2010-10-26 12:39:15 2010-10-28 12:49:15 367505 aooi VR0018  bruce-m
366057 367757   0.00 2010-10-26 12:39:16 2010-10-28 12:49:16 367506 aooi VR0018  bruce-m
366058 367758   0.00 2010-10-26 12:39:16 2010-10-28 12:49:16 367507 aooi VR0018  bruce-m
366059 367759   0.00 2010-10-26 12:39:16 2010-10-28 12:49:16 367508 aooi VR0018  bruce-m
366060 367760   0.00 2010-10-26 12:39:16 2010-10-28 12:49:16 367509 aooi VR0018  bruce-m
366061 367761   0.00 2010-10-26 12:39:47 2010-10-28 12:49:47 367510 aooi VR0018  bruce-m
366062 367762   0.00 2010-10-26 12:39:48 2010-10-28 12:49:48 367511 aooi VR0018  bruce-m
366063 367763   0.00 2010-10-26 12:39:48 2010-10-28 12:49:48 367512 aooi VR0018  bruce-m
366064 367764   0.00 2010-10-26 12:39:48 2010-10-28 12:49:48 367513 aooi VR0018  bruce-m
366065 367765   0.00 2010-10-26 12:39:48 2010-10-28 12:49:48 367514 aooi VR0018  bruce-m
366066 367766   0.00 2010-10-26 12:40:19 2010-10-28 12:50:19 367515 aooi VR0018  bruce-m
366067 367767   0.00 2010-10-26 12:40:20 2010-10-28 12:50:20 367516 aooi VR0018  bruce-m
366068 367768   0.00 2010-10-26 12:40:20 2010-10-28 12:50:20 367517 aooi VR0018  bruce-m
366069 367769   0.00 2010-10-26 12:40:20 2010-10-28 12:50:20 367518 aooi VR0018  bruce-m
366070 367770   0.00 2010-10-26 12:40:20 2010-10-28 12:50:20 367519 aooi VR0018  bruce-m
366071 367771   0.00 2010-10-26 12:40:51 2010-10-28 12:50:51 367520 aooi VR0018  bruce-m
366072 367772   0.00 2010-10-26 12:40:52 2010-10-28 12:50:52 367521 aooi VR0018  bruce-m
366073 367773   0.00 2010-10-26 12:40:52 2010-10-28 12:50:52 367522 aooi VR0018  bruce-m
366074 367774   0.00 2010-10-26 12:40:52 2010-10-28 12:50:52 367523 aooi VR0018  bruce-m
366075 367775   0.00 2010-10-26 12:40:52 2010-10-28 12:50:52 367524 aooi VR0018  bruce-m
366076 367776   0.00 2010-10-26 12:41:23 2010-10-28 12:51:23 367525 aooi VR0018  bruce-m
366077 367777   0.00 2010-10-26 12:41:24 2010-10-28 12:51:24 367526 aooi VR0018  bruce-m
366078 367778   0.00 2010-10-26 12:41:24 2010-10-28 12:51:24 367527 aooi VR0018  bruce-m
366079 367779   0.00 2010-10-26 12:41:24 2010-10-28 12:51:24 367528 aooi VR0018  bruce-m
366080 367780   0.00 2010-10-26 12:41:24 2010-10-28 12:51:24 367529 aooi VR0018  bruce-m
366081 367781   0.00 2010-10-26 12:41:55 2010-10-28 12:51:55 367530 aooi VR0018  bruce-m
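
A listing like this can be tallied with a small script. The
sketch below is only illustrative: it assumes each record starts
with two numeric ids followed by the reserved amount and a start
date (inferred from the rows above, not from the glsres
documentation), and `summarise` is a hypothetical helper. The
sample is the first three rows of the listing.

```python
import re

# Assumed record start: two numeric ids, the reserved amount, then a date.
# Wrapped continuation lines ("VR0018  bruce-m ...") do not match and are skipped.
ROW = re.compile(r"^(\d+)\s+(\d+)\s+(\d+\.\d+)\s+\d{4}-\d{2}-\d{2}", re.MULTILINE)


def summarise(listing, requested_hours):
    """Count reservations and how many fall short of the requested hours."""
    amounts = [float(m.group(3)) for m in ROW.finditer(listing)]
    return {
        "reservations": len(amounts),
        "total_reserved": sum(amounts),
        "underfunded": sum(1 for a in amounts if a < requested_hours),
        "zero": sum(1 for a in amounts if a == 0.0),
    }


SAMPLE = """\
366619 367727  32.00 2010-10-27 10:03:17 2010-10-29 10:13:17 368075 aooi
VR0018  bruce-m 489
366620 367728   0.00 2010-10-27 10:03:18 2010-10-29 10:13:18 368076 aooi
VR0018  bruce-m
366621 367729   0.00 2010-10-27 10:03:18 2010-10-29 10:13:18 368077 aooi
VR0018  bruce-m
"""

# Each job asks for 48 CPU hours (1 core for 2 days).
summary = summarise(SAMPLE, 48.0)
print(summary)
```

Every reservation in the sample is below the 48 hours the job
needs, so none of these jobs could run to completion on what they
hold.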

This is what gbalance says for this project:

[root@bruce-m ~]# gbalance -h -p VR0018 -m bruce-m
Id  Name              Amount  Reserved Balance CreditLimit Available
--- ----------------- ------- -------- ------- ----------- ---------
325 VR0018               0.00     0.00    0.00        0.00      0.00
350 VR0018               0.00     0.00    0.00        0.00      0.00
489 VR0018 on bruce-m 2848.80  2848.80    0.00        0.00      0.00
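
For reference, the columns above are consistent with the
following arithmetic. The relations Balance = Amount - Reserved
and Available = Balance + CreditLimit are an assumption that
happens to match these three rows; they are not taken from the
Gold source, and `balance_and_available` is a hypothetical helper.

```python
from decimal import Decimal


def balance_and_available(amount, reserved, credit_limit):
    """Assumed relations: Balance = Amount - Reserved,
    Available = Balance + CreditLimit."""
    balance = amount - reserved
    return balance, balance + credit_limit


# (Amount, Reserved, CreditLimit) per account id, copied from the table above.
accounts = {
    325: (Decimal("0.00"), Decimal("0.00"), Decimal("0.00")),
    350: (Decimal("0.00"), Decimal("0.00"), Decimal("0.00")),
    489: (Decimal("2848.80"), Decimal("2848.80"), Decimal("0.00")),
}

results = {acct: balance_and_available(*row) for acct, row in accounts.items()}
for acct, (balance, available) in results.items():
    print(acct, balance, available)
```

Under that assumption account 489 is fully reserved, so any new
48 hour request has nothing left to draw on.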

Is that useful ?

cheers,
Chris
-- 
 Christopher Samuel - Senior Systems Administrator
 VLSCI - Victorian Life Sciences Computational Initiative
 Email: samuel at unimelb.edu.au Phone: +61 (0)3 903 55545
         http://www.vlsci.unimelb.edu.au/

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.10 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/

iEYEARECAAYFAkzHcjMACgkQO2KABBYQAh/TUgCfeBNVpwsSQYx8UpNCXwe8r3Mt
EYcAmgKYxxnq5Au0j/oeF8F5AXL9ie/m
=+9aR
-----END PGP SIGNATURE-----

