[gold-users] importing old job data into gold

Scott Jackson scottmo at adaptivecomputing.com
Tue Oct 5 16:48:57 MDT 2010

Hi Steve,

There are two ways that you might think about "importing" job usage data.
gmkjob creates the job record, but it does not create any kind of charge or debit any accounts.
gcharge both creates the job record and debits your accounts in the present.
There is no way to make a debit in the past.
Both methods can produce a job record whose start time and end time are in the past.
Both methods will result in a job record creation time and modification time in the present.
With that in mind, I can now answer your email unambiguously.

----- Original Message -----
> From: "Steve Crusan" <scrusan at ur.rochester.edu>
> To: "Gold Users Mailing List" <gold-users at supercluster.org>
> Sent: Tuesday, October 5, 2010 2:39:08 PM
> Subject: Re: [gold-users] importing old job data into gold
> Thanks for the response Scott. Gold is essentially a bank.
> I guess I need to clarify my statement a bit more then. I don't care about
> applying old charges to people's balances, but more about starting at a
> baseline with our old Maui logs. At this point I wouldn't care if, after
> running a massive script, every user/project/etc was millions of hours in
> debt.

This, of course, would not happen if you only use gmkjob.

> I just want the historical usage numbers, with a good way to query
> them by time periods.

After importing the job data with gmkjob, you could query them according to time periods based on the imported start time and end times.
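
To illustrate what such a time-period query amounts to, here is a minimal Python sketch. It does not call Gold at all: the record layout, the field names, and the job IDs are hypothetical stand-ins for whatever a job listing would return, and selecting on the end time is only one possible convention.

```python
from datetime import datetime

# Hypothetical job records; in practice these would come from a job listing.
JOBS = [
    {"job_id": "PBS.1001.0", "processors": 8,
     "start": datetime(2009, 9, 12, 8, 0), "end": datetime(2009, 9, 12, 20, 0)},
    {"job_id": "PBS.1002.0", "processors": 16,
     "start": datetime(2009, 12, 30, 1, 0), "end": datetime(2009, 12, 30, 3, 0)},
    {"job_id": "PBS.1003.0", "processors": 4,
     "start": datetime(2010, 10, 1, 9, 0), "end": datetime(2010, 10, 1, 10, 30)},
]

def jobs_in_window(jobs, window_start, window_end):
    """Return jobs whose end time falls in [window_start, window_end)."""
    return [j for j in jobs if window_start <= j["end"] < window_end]

def processor_hours(job):
    """Processors multiplied by wall-clock hours for one job."""
    hours = (job["end"] - job["start"]).total_seconds() / 3600.0
    return job["processors"] * hours

# Jobs that ended between 2009-09-10 and 2009-12-25, and their total usage.
selected = jobs_in_window(JOBS, datetime(2009, 9, 10), datetime(2009, 12, 25))
total = sum(processor_hours(j) for j in selected)
```

Because the selection uses the imported start and end times rather than the record creation time, jobs imported today still land in the historical window.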

> Basically, my plan would be to actually run gold on
> our cluster AFTER we do an import, change the balances, etc...

> We're not at the point yet where we want to start using Gold as a
> charging/CPU hours banking system, but more internally so we have easy to
> query numbers. I see an evolution coming soon, but let's not get ahead of
> ourselves.

So, it sounds like you don't care right now about debits and account balances. It appears you might only care about job data, like how many processors a job ran on and for how long, when it started and ended, its QoS, etc.

> Let's say I create a fresh instance of Gold, import any
> users/groups/setting/etc, but do not have any job data. Now, just importing
> the Maui logs via a script would produce the behavior you specified before
> (i.e., job records get debited that day).

Unless you used gmkjob -- which would not debit at all.

> Would there be any way of changing
> the times in the gold database itself to reflect the actual debit times?

I thought you weren't interested in balances. So, maybe you want to track "balances" but just keep funding unlimited?
You can change the start time and end time for the job. You can change the amount that the job record says was charged.
You can change the current account balances. But are you proposing to change the journaled entries for every intermediate state of the accounts to reflect the "time-quake" of the changes from the past all the way to the future? (Have you seen Millennium? :)

> I guess it would be tricking the system if I changed all of the references in
> the right place, but would it be theoretically possible to change the values
> in the DB, so that some of the historical gold commands would reflect the
> actual job times?

Yes, absolutely. Totally possible and extremely difficult, unless I misunderstand what you intend to do.
When the bank produces a statement and needs the balance of an account for a month in the past, there is no current table it can consult. It has to look at the journals for the account table values at previous times. The value of an account balance might be 1200 for a few minutes, then 1184 for an hour, then 1163 for the next day. If you want to keep a consistent view, every time you change the 1200 to reflect a debit of 50 (to, say, 1150), you have to change the 1184 to 1134, the 1163 to 1113, etc. Then you would have to create new journal entry rows for the changes, with timestamps matching the time that the balance was impacted. This would be an awful mess. You might be able to trick it for an arbitrary time, but that would make queries on both sides of it inconsistent unless you rewrite all of the g_allocation_log entries for that account for each job you intend to import.
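
The ripple effect can be sketched in a few lines of Python. This is only an illustration of the arithmetic above, not how Gold stores or rewrites its journal: the list-of-tuples layout and the function name are assumptions made for the example.

```python
def apply_retroactive_debit(journal, debit_time, amount):
    """Return a new journal with `amount` subtracted from every balance
    recorded at or after `debit_time`. Earlier rows are left unchanged."""
    return [
        (when, balance - amount if when >= debit_time else balance)
        for when, balance in journal
    ]

# Hypothetical journal rows: (minutes since the first entry, balance).
journal = [(0, 1200), (5, 1184), (65, 1163)]

# Back-dating a debit of 50 to the first entry touches every later row.
rewritten = apply_retroactive_debit(journal, debit_time=0, amount=50)
rows_touched = sum(1 for old, new in zip(journal, rewritten) if old != new)
```

Here `rewritten` holds the balances 1150, 1134, and 1113, and every one of the three rows had to change for a single back-dated debit. Repeating this for each imported job is what makes the approach so costly.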

> Maybe I'm just trying to dig too deep into this, because it seems I can use
> a combination of glsjob + start/EndTimes to achieve what I'd like after job
> imports, but I'd rather not rewrite your gusage/statement/etc commands,
> especially since they natively work with the db, and they will probably be a
> heck of a lot more efficient than my perl ;-)

Well, OK. Now that I understand a bit better what you want, you could get away with most of what you want to do by altering the CreationTimes in the Transaction table for the job. This will probably give you the desired listings in gusage. It will also list the appropriate charges in gstatement, but the thing it won't get right is the beginning and ending balances (which come from the journal). You will get some barking that the total of the debits and credits does not match the balances pulled from the allocation journal. I guess you can always disable that check, but just realize the beginning and ending balances will not be accurate.
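
To show why that check complains, here is a rough Python sketch of the kind of reconciliation involved. It is an assumption about the general idea only (beginning balance plus credits minus debits should equal the ending balance); the function name and the numbers are illustrative and are not taken from the gstatement code.

```python
def statement_is_consistent(beginning, ending, credits, debits):
    """True when the listed transactions account for the change in balance."""
    return beginning + sum(credits) - sum(debits) == ending

# Balances as the journal recorded them for some period: 1200 down to 1163.
beginning_balance, ending_balance = 1200, 1163

# The debits that really happened in that period reconcile with the journal.
original_ok = statement_is_consistent(
    beginning_balance, ending_balance, credits=[], debits=[16, 21])

# Back-dating an imported charge of 50 into the period lists a third debit,
# but the journaled balances never moved, so the totals no longer match.
after_backdating_ok = statement_is_consistent(
    beginning_balance, ending_balance, credits=[], debits=[16, 21, 50])
```

The first check passes and the second fails, which is the mismatch the statement would report after the transaction times are altered.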


> On 10/1/10 6:44 PM, "Scott Jackson" <scottmo at adaptivecomputing.com>
> wrote:
> > Steve,
> >
> > It was introduced in gold-2.2. If you are using 2.1 you will not
> > have gmkjob.
> >
> > The imported jobs will not show up against the old statement, they
> > will show
> > up against a current statement only. However, the jobs will be
> > inserted into
> > the job table with the specified StartTime and EndTime (while
> > CreationTime and
> > ModificationTime will be current dates). You have to realize that
> > Gold tracks
> > the historical states of all allocations and objects and you can't
> > just insert
> > debits into the past. This would disrupt and change every
> > intervening state.
> > This would have affected balances, causing other jobs not to have
> > been able to
> > run, invalidating previously generated statements, etc. You would
> > never see a
> > bank inserting credits or debits into their past balance sheets. If
> > a bank
> > catches a mistake, they correct it in the present with a current
> > refund or
> > withdrawal.
> >
> > When you import the job data, it does not affect balances. If you
> > wish to
> > catch up for the past charges, the charges will occur in the
> > present.
> >
> > Scott
> >
> >
> > ----- Original Message -----
> > From: "Steve Crusan" <scrusan at UR.Rochester.edu>
> > To: gold-users at supercluster.org
> > Sent: Friday, October 1, 2010 3:20:57 PM
> > Subject: [gold-users] importing old job data into gold
> >
> >
> > importing old job data into gold
> >
> > Quick question about importing old job data (from Maui logs) into
> > Gold.
> >
> > I've noticed that it seems to be possible to do this using the
> > goldsh shell
> > (in the manual it shows gmkjob, but I cannot seem to find it; is it
> > deprecated?):
> >
> >
> > goldsh Job Create JobId=PBS.$goldId.0 User=$netId Project=$account
> > Machine=$machine Charge=$charge Processors=$procs
> > StartTime=$startTime
> > EndTime=$endTime WallDuration=$walltime
> >
> > From there, it seems one can use gcharge to charge the job:
> > gcharge -p $account -u $netId -m $machine -P $procs -t $walltime -j
> > $goldId -d
> > "charge imported from old data" -s $startTime -e $endTime
> >
> > Now, provided that I properly fill in the correct attributes, and
> > use the
> > proper gold job id, would doing the above operations properly keep
> > the
> > timestamps of the jobs? Meaning that if I have job logs from
> > September 2009,
> > if I import them and use their epoch time (do all the conversions,
> > etc), will
> > the gstatement commands and other querying operations reflect the
> > proper
> > dates/times?
> >
> > Ex:
> > gstatement -s 2009-09-10 -e 2009-12-25
> >
> > Will the imported jobs from the Maui logs properly show up within
> > that
> > statement, and NOT show up for a rather recent query?
> >
> > Thanks!
> >
> >
> >
> > ----------------------
> > Steve Crusan
> > System Administrator
> > Center for Research Computing
> >
> >
> > _______________________________________________
> > gold-users mailing list
> > gold-users at supercluster.org
> > http://www.supercluster.org/mailman/listinfo/gold-users
> ----------------------
> Steve Crusan
> System Administrator
> Center for Research Computing
> University of Rochester
> https://www.crc.rochester.edu/
