[gold-users] gold fails to remove reservations
Scott Jackson
scottmo at adaptivecomputing.com
Thu May 24 11:03:59 MDT 2012
Eva,
On Wed, May 23, 2012 at 2:01 PM, Eva Hocks <hocks at sdsc.edu> wrote:
>
> Using maui3.2.6p21-1/gold 2.2.0.1-1: a job requesting 10 hours on 96
> processors does not start because the account cannot be debited, even
> though it did have sufficient credit.
>
> $ showq|grep 964040
> 964040 xyz Deferred 96 10:00:00 Wed May 23 10:55:02
>
> job asking for 96 processors for 10 hours * 1 = 960
>
First, if this is a 10-hour job, I am a little confused as to why maui is not
charging 96*10*3600. As far as I know, maui charges by the second and not by
the hour.
What does checkjob -v on the job give as the reason for the deferral?
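
For reference, here is a quick check of the arithmetic in Python, using only figures that appear in the quoted showq output and Gold transaction log further down. The variable names are mine, and reading the 0.00027777 rate as roughly 1/3600 (about one credit per processor-hour) is an assumption drawn from the ItemizedCharges line, not something I have confirmed in the site's configuration.

```python
# Figures copied from the quoted job request and Gold transaction log.
processors = 96
wall_seconds = 10 * 3600      # 10-hour request, logged as WallDuration=36000
charge_rate = 0.00027777      # ChargeRate{VBR}{Processors} from the log

# What a plain per-second charge would come to (the 96*10*3600 figure).
processor_seconds = processors * wall_seconds

# What the ItemizedCharges expression in the log evaluates to.
itemized_charge = processors * charge_rate * wall_seconds

# Assumption: the rate is meant to be about 1/3600 per processor-second.
credits_per_processor_hour = charge_rate * 3600

print("processor-seconds:", processor_seconds)
print("itemized charge (rounded):", round(itemized_charge))
print("credits per processor-hour:", credits_per_processor_hour)
```

If that reading is right, the 960 could come from a charge rate set in Gold rather than from maui itself, but that is a guess from the log alone.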
> $ gbalance -u fpaolo
> Id  Name   Amount Reserved Balance CreditLimit Available
> --- ------ ------ -------- ------- ----------- ---------
> 921 xyz    1965   1920     45      0           45
>
>
>
> Looking at the gold database, why does one job (964040) have 2 reservations?
>
> $ goldsh Reservation Query Name==964040
> Id     Name   Job     User   Project Machine StartTime           EndTime             Description
> ------ ------ ------- ------ ------- ------- ------------------- ------------------- -----------
> 953044 964040 1011912 fpaolo fpaolo  Triton  2012-05-23 10:55:03 2012-05-23 21:05:03
> 953239 964040 1012109 fpaolo fpaolo  Triton  2012-05-23 12:36:33 2012-05-23 22:46:33
>
>
I am not sure, but this looks like a problem in maui leaving the
reservations around (or a problem in the prolog and epilog scripts, since I
don't see how maui would be charging 960 for the job). One
improvement that we made in Moab was to link the jobs up better. When
Maui was written, a job that tried to start multiple times could easily
wind up creating multiple reservations. Later we added some flags to Gold
to better address this situation (Moab calls Job Reserve with the
Replace:=True flag, which deletes any reservations of the same name before
creating a new reservation). I suspect that either this is the problem (and
you are using an edited version of maui for the hours thing) or you
are using prologs and epilogs that do not call the Replace option.
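
To illustrate the difference the Replace behaviour makes, here is a toy model in Python. It is only a sketch: ToyAccount and its methods are hypothetical names of mine and not Gold's API, and the only real numbers are the 1965 allocation and the 960 reservation from the quoted output. It assumes a reservation is refused when it exceeds what is still available.

```python
class ToyAccount:
    """Hypothetical stand-in for a Gold account; not Gold's real interface."""

    def __init__(self, amount):
        self.amount = amount
        self.reservations = []  # list of (reservation name, reserved amount)

    def available(self):
        # Credit left after subtracting every outstanding reservation.
        return self.amount - sum(amt for _, amt in self.reservations)

    def reserve(self, name, amount, replace=False):
        if replace:
            # Mimics the described Replace behaviour: drop any existing
            # reservations of the same name before creating the new one.
            self.reservations = [r for r in self.reservations if r[0] != name]
        if amount > self.available():
            return False  # not enough credit, so the job would be deferred
        self.reservations.append((name, amount))
        return True


# Without replace: each start attempt leaves its reservation behind.
stale = ToyAccount(1965)
stale_results = [stale.reserve("964040", 960) for _ in range(3)]

# With replace: each start attempt first clears the previous reservation.
clean = ToyAccount(1965)
clean_results = [clean.reserve("964040", 960, replace=True) for _ in range(3)]

print(stale_results, stale.available())
print(clean_results, clean.available())
```

In the first case two attempts succeed and tie up 1920 of the 1965, which matches the Reserved and Available columns in the gbalance output above, so a third attempt at 960 cannot fit.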
Scott
>
> Id Object Action Actor Name Child JobId Amount Delta Account Project User Machine Allocation Count Description Details CreationTime ModificationTime Deleted RequestId TransactionId
>
> 9462919 Job Reserve maui 964040 960 fpaolo fpaolo Triton 1
>   WallDuration=36000,Processors=96,QualityOfService=DEFAULT,Queue=batch,Stage=Reserve,Charge=0,CallType=Normal,ItemizedCharges:=( ( ( 96 [Processors] * 0.00027777 [ChargeRate{VBR}{Processors}] ) ) * 36000 [WallDuration] ) = 960
>   2012-05-23 10:55:06 2012-05-23 10:55:06 False 2901179 9462919
>
> 9465533 Job Reserve maui 964040 960 fpaolo fpaolo Triton 1
>   WallDuration=36000,Processors=96,QualityOfService=DEFAULT,Queue=batch,Stage=Reserve,Charge=0,CallType=Normal,ItemizedCharges:=( ( ( 96 [Processors] * 0.00027777 [ChargeRate{VBR}{Processors}] ) ) * 36000 [WallDuration] ) = 960
>   2012-05-23 12:36:44 2012-05-23 12:36:44 False 2901826 9465533
>
> It does not delete those correctly, so any further attempt at starting fails
> as well. Any idea why the reservation request is not removed? It does not
> happen with other jobs.
>
> Thanks
> Eva