[gold-users] gcharge issued twice via Torque's epilogue script
Scott Jackson
scottmo at adaptivecomputing.com
Tue Dec 22 16:02:09 MST 2009
Kevin,
Sounds good. Unfortunately, that may be about the best you can do until
Torque gets fixed.
Hmmm.... Actually...
Now that I think about it, there would be one other thing to do. If you
did something in your epilog that flipped a per-jobid semaphore or
something, you would only call the charge if the thing had not already
been flipped. Maybe the presence of a file, or an entry in a database
(protected by locking the row).
So for example, with a unique key on Jobid: Begin Work; Insert into Jobs
set Jobid=$JobId; Commit. If that succeeds, you do the charge. If it fails
due to an existing entry, you skip the charge and log the duplicate
invocation.
I think that is a tad better than refunding after the fact.
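
A minimal sketch of that insert-then-charge idea, written in Python with
SQLite standing in for whatever database is available. The table name
charged_jobs, the function names, and the charge/log callables are all
hypothetical; the charge callable is where the epilogue's existing gcharge
invocation would go.

```python
import sqlite3


def claim_job(conn, job_id):
    """Return True only for the first caller that records job_id."""
    # Hypothetical table; the PRIMARY KEY makes a second insert fail.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS charged_jobs (job_id TEXT PRIMARY KEY)"
    )
    try:
        with conn:  # commits on success, rolls back on an exception
            conn.execute(
                "INSERT INTO charged_jobs (job_id) VALUES (?)", (job_id,)
            )
        return True
    except sqlite3.IntegrityError:
        # An entry already exists: this is a duplicate invocation.
        return False


def charge_once(conn, job_id, charge, log):
    """Call charge(job_id) at most once per job_id; log duplicates."""
    if claim_job(conn, job_id):
        charge(job_id)  # assumption: wraps the site's gcharge call
        return True
    log(f"duplicate epilogue invocation for {job_id}; charge skipped")
    return False


if __name__ == "__main__":
    # In-memory database for demonstration only; a real epilogue would
    # need a database file or server shared by both invocations.
    demo = sqlite3.connect(":memory:")
    for _ in range(2):
        charge_once(demo, "230465.jman",
                    lambda j: print("charging", j), print)
```

Because the job is claimed before the charge is made, a charge that fails
after the insert would need to be retried or have its row removed by hand.
An atomically created marker file (os.open with O_CREAT | O_EXCL) would give
the same first-caller-wins behavior without a database.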
Scott
Kevin Van Workum wrote:
>
>
> On Tue, Dec 22, 2009 at 2:53 PM, Kevin Van Workum <vanw at sabalcore.com> wrote:
>
> On Tue, Dec 22, 2009 at 2:51 PM, Kevin Van Workum <vanw at sabalcore.com> wrote:
> > On Tue, Dec 22, 2009 at 12:46 PM, Wojciech Turek <wjt27 at cam.ac.uk> wrote:
> >>
> >> What about glstxn -J <job id>? You could use this command in your
> >> epilogue script to check whether a charge transaction has already been
> >> made for the particular jobid.
> >
> > Yes, I tried that, but surprisingly that doesn't always work. It
> > appears that gcharge is implemented asynchronously, at least wrt
> > glstxn. Here's a simplified snippet of my epilogue script (perl) and
> > glstxn output for a job that still got charged twice.
> >
> > #!/usr/bin/perl
> >
> > open LG, "glstxn -J $jobid|";
>
> Before you mention it, I'm actually using "glstxn --quiet -J $jobid|" here.
>
> > @buf = <LG>;
> > close LG;
> >
> > if(@buf == 0) {
> > system("gcharge $args");
> > } else {
> > print STDERR "$jobid has already been charged ", @buf+0, " times\n";
> > }
> >
> >
> > # glstxn -J 230465.jman --show JobId,Id,CreationTime
> > JobId Id CreationTime
> > ----------- ------ -------------------
> > 230465.jman 847102 2009-12-22 14:35:32
> > 230465.jman 847107 2009-12-22 14:35:32
> >
> >
>
>
> FYI, I decided to just run a cronjob every night that searches for
> duplicated charges and refunds the extra charges.
>
> Kevin
>
>
> >>
> >> Cheers
> >>
> >> Wojciech
> >>
> >> 2009/12/22 Scott Jackson <scottmo at adaptivecomputing.com>
> >>>
> >>> Kevin,
> >>>
> >>> No, I'm sorry. There is not. Gold will charge for a job as many times as
> >>> it is called. There are provisions for incremental charging where it all
> >>> goes against the same job instance, and if not, it considers them
> >>> separate jobs with the same jobid. All I can think of is that you could
> >>> write a wrapper script that looks up the jobid and if it has already
> >>> been charged that same day, ignores the second charge.
> >>>
> >>> I assume you have a ticket open with the Torque support queue on this.
> >>>
> >>> Scott
> >>>
> >>>
> >>> Kevin Van Workum wrote:
> >>> > I use Torque's epilogue script to issue the gcharge command after a
> >>> > job completes. However, it occasionally happens that the epilogue
> >>> > script runs twice for a given job. This happens when Torque sends a
> >>> > SIGKILL a few seconds after the initial SIGTERM is sent. Though I'd
> >>> > like to prevent the script from running twice, I haven't had much
> >>> > success. So, I'm now searching for a solution through gold.
> >>> >
> >>> > Is there a way to have gold ignore duplicate charges for the same JobId?
> >>> >
> >>> > --
> >>> > Kevin Van Workum, PhD
> >>> > Sabalcore Computing Inc.
> >>> > Run your code on 500 processors.
> >>> > Sign up for a free trial account.
> >>> > www.sabalcore.com <http://www.sabalcore.com>
> >>> > 877-492-8027 ext. 11
> >>> >
> ------------------------------------------------------------------------
> >>> >
> >>> > _______________________________________________
> >>> > gold-users mailing list
> >>> > gold-users at supercluster.org
> >>> > http://www.supercluster.org/mailman/listinfo/gold-users
> >>> >
> >>>
> >>
> >>
> >>
> >> --
> >> --
> >> Wojciech Turek
> >>
> >> Assistant System Manager
> >>
> >> High Performance Computing Service
> >> University of Cambridge
> >> Email: wjt27 at cam.ac.uk
> >> Tel: (+)44 1223 763517