[gold-users] transaction ids for jobs

Crusan, Steve scrusan at UR.Rochester.edu
Wed Dec 8 20:19:34 MST 2010


Scenario: We have ~600,000 jobs we'd like to import into Gold. In testing, I found that adding the job records directly to the job table via a SQL bulk insert is far faster than using gmkjob. For context, on a 2-processor, 512 MB development box it took ~15 minutes via SQL vs. 4 days using gmkjob. I'll be running the import (whichever method I use) on a box with 48 GB of RAM and 12 processors. Basically, I'm going to bootstrap Gold on a large-memory node, and then export the database to a fresh install on our usual Gold box.
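
To illustrate the kind of bulk insert described above, here is a minimal sketch using Python's built-in sqlite3 module against an in-memory database. The table demo_job, its columns, and the sample rows are hypothetical and are not Gold's actual g_job schema; the point is only that many rows go in through one statement inside a single database transaction, rather than one command invocation per job.

```python
import sqlite3


def bulk_insert_jobs(conn, jobs):
    """Insert all job rows in one database transaction and return the row count."""
    # "with conn" commits on success and rolls back if an exception is raised.
    with conn:
        conn.executemany(
            "INSERT INTO demo_job (job_id, user_name, project, charge) "
            "VALUES (?, ?, ?, ?)",
            jobs,
        )
    return conn.execute("SELECT COUNT(*) FROM demo_job").fetchone()[0]


# Hypothetical stand-in table; NOT the real Gold schema.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE demo_job ("
    "job_id TEXT PRIMARY KEY, user_name TEXT, project TEXT, charge REAL)"
)

# Made-up sample data standing in for historical job records.
jobs = [(f"job.{i}", "alice", "demo", 1.0) for i in range(1000)]
inserted = bulk_insert_jobs(conn, jobs)
print(inserted)
```

Note that this sketch writes only to the one table. It does nothing about any bookkeeping (such as transaction records or id counters) that an application would normally maintain alongside its job records.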

My question: if this is a clean install of Gold (no transactions besides user/project creation), what effect would bypassing the gmkjob commands have on the transaction logs? I would not like to find out that, because I bypassed gmkjob, the Gold transaction ids conflict during normal usage of Gold. Since I would be modifying the Gold database directly (via the g_job table), would it also be necessary to make entries in the g_transaction table? After the import of old jobs is complete, the next step would be to turn on Gold to serve new job requests on our cluster.

Steve Crusan
System Administrator
Center for Research Computing
