[gold-users] gcharge issued twice via Torque's epilogue script

Scott Jackson scottmo at adaptivecomputing.com
Tue Dec 22 16:02:09 MST 2009


Kevin,

Sounds good. Unfortunately, that may be about the best you can do until 
Torque gets fixed.

Hmmm.... Actually...

Now that I think about it, there is one other thing you could do. If your 
epilogue flipped a per-jobid semaphore of some kind, you would only issue 
the charge if it had not already been flipped. That could be the presence 
of a file, or an entry in a database table (protected by locking the row).
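
For the lock-file flavor, a minimal sketch (SEMDIR and the wrapper name are just examples, and the gcharge arguments are whatever you pass today):

```shell
# Sketch of the file-based per-jobid semaphore idea. SEMDIR is an
# assumed location the epilogue can write to; adjust to taste.
SEMDIR="${TMPDIR:-/tmp}/gold-charged"
mkdir -p "$SEMDIR"

charge_once() {
    jobid="$1"; shift
    # mkdir is atomic, so only the first invocation per jobid succeeds;
    # every later invocation sees the directory and skips the charge.
    if mkdir "$SEMDIR/$jobid" 2>/dev/null; then
        gcharge "$@"
    else
        echo "duplicate charge attempt for $jobid" >&2
    fi
}

# usage: charge_once "$jobid" -J "$jobid" ...your usual gcharge args...
```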

So for example: BEGIN WORK; INSERT INTO Jobs SET Jobid = '$JobId'; COMMIT. 
If that succeeds, you do the charge. If it fails because an entry for that 
jobid already exists, log the duplicate invocation.

I think that is a tad better than refunding after the fact.
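
And if you do end up sweeping for duplicates after the fact instead, something like this could find them, assuming glstxn's two-column output as shown in your message (a header line and a dashed rule before the data rows):

```shell
# Sketch: read glstxn-style rows on stdin (the --show JobId,Id layout
# quoted below: header, dashed rule, then data) and print any jobid
# that has more than one charge transaction.
find_dup_charges() {
    tail -n +3 | awk '{count[$1]++} END {for (j in count) if (count[j] > 1) print j}'
}

# usage (assumed invocation): glstxn --show JobId,Id | find_dup_charges
```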

Scott


Kevin Van Workum wrote:
>
>
> On Tue, Dec 22, 2009 at 2:53 PM, Kevin Van Workum <vanw at sabalcore.com> wrote:
>
>     On Tue, Dec 22, 2009 at 2:51 PM, Kevin Van Workum
>     <vanw at sabalcore.com> wrote:
>     > On Tue, Dec 22, 2009 at 12:46 PM, Wojciech Turek
>     <wjt27 at cam.ac.uk> wrote:
>     >>
>     >> What about glstxn -J <job id>? You could use this command in
>     >> your epilogue script to check whether a charge transaction was
>     >> made for that particular jobid.
>     >
>     > Yes, I tried that, but surprisingly that doesn't always work. It
>     > appears that gcharge is implemented asynchronously, at least wrt
>     > glstxn. Here's a simplified snippet of my epilogue script (perl) and
>     > glstxn output for a job that still got charged twice.
>     >
>     > #!/usr/bin/perl
>     >
>     > open LG, "glstxn -J $jobid|";
>
>     Before you mention it, I'm actually using "glstxn --quiet -J
>     $jobid|" here.
>
>     > @buf = <LG>;
>     > close LG;
>     >
>     > if(@buf == 0) {
>     >    system("gcharge $args");
>     > } else {
>     >    print STDERR "$jobid has already been charged ",
>     >        scalar(@buf), " times\n";
>     > }
>     >
>     >
>     > # glstxn -J 230465.jman --show JobId,Id,CreationTime
>     > JobId       Id     CreationTime
>     > ----------- ------ -------------------
>     > 230465.jman 847102 2009-12-22 14:35:32
>     > 230465.jman 847107 2009-12-22 14:35:32
>     >
>     >
>
>
> FYI, I decided to just run a nightly cronjob that searches for 
> duplicate charges and refunds the extra charges.
>
> Kevin
>  
>
>     >>
>     >> Cheers
>     >>
>     >> Wojciech
>     >>
>     >> 2009/12/22 Scott Jackson <scottmo at adaptivecomputing.com>
>     >>>
>     >>> Kevin,
>     >>>
>     >>> No, I'm sorry, there is not. Gold will charge for a job as many
>     >>> times as gcharge is called. There are provisions for incremental
>     >>> charging, where the charges all go against the same job instance;
>     >>> otherwise it considers them separate jobs with the same jobid. All
>     >>> I can think of is that you could write a wrapper script that looks
>     >>> up the jobid and, if it has already been charged that same day,
>     >>> ignores the second charge.
>     >>>
>     >>> I assume you have a ticket open with the Torque support queue on
>     >>> this.
>     >>>
>     >>> Scott
>     >>>
>     >>>
>     >>> Kevin Van Workum wrote:
>     >>> > I use Torque's epilogue script to issue the gcharge command
>     >>> > after a job completes. However, it occasionally happens that
>     >>> > the epilogue script runs twice for a given job. This happens
>     >>> > when Torque sends a SIGKILL a few seconds after the initial
>     >>> > SIGTERM. Though I'd like to prevent the script from running
>     >>> > twice, I haven't had much success, so I'm now searching for a
>     >>> > solution through gold.
>     >>> >
>     >>> > Is there a way to have gold ignore duplicate charges for the
>     >>> > same JobId?
>     >>> >
>     >>> > --
>     >>> > Kevin Van Workum, PhD
>     >>> > Sabalcore Computing Inc.
>     >>> > Run your code on 500 processors.
>     >>> > Sign up for a free trial account.
>     >>> > www.sabalcore.com
>     >>> > 877-492-8027 ext. 11
>     >>> >
>     >>> > ------------------------------------------------------------------------
>     >>> >
>     >>> > _______________________________________________
>     >>> > gold-users mailing list
>     >>> > gold-users at supercluster.org
>     >>> > http://www.supercluster.org/mailman/listinfo/gold-users
>     >>> >
>     >>>
>     >>
>     >>
>     >>
>     >> --
>     >> Wojciech Turek
>     >>
>     >> Assistant System Manager
>     >>
>     >> High Performance Computing Service
>     >> University of Cambridge
>     >> Email: wjt27 at cam.ac.uk
>     >> Tel: (+)44 1223 763517
>     >
>     >
>     >
>     >
>
>
>
>
>
>
>
> -- 
> Kevin Van Workum, PhD
> Sabalcore Computing Inc.
> Run your code on 500 processors.
> Sign up for a free trial account.
> www.sabalcore.com
> 877-492-8027 ext. 11


