[gold-users] Is it necessary to do both gquote and greserve at job submit time when integrating resource manage system?

Michael Sternberg sternberg at anl.gov
Mon Mar 21 22:56:16 MDT 2011


I use Torque and have deployed a job submission filter that simply checks the balance. If the job requests more than the available balance, the filter rejects the submission (communicated to qsub via the exit code), so the job never reaches the resource manager queue. If, however, the job fits within the balance but amounts to a certain percentage of it, the filter warns the user of the low balance and lets the submission succeed. The user can then arrange for a new allocation. This works very well here in practice.
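For illustration only, here is a minimal Python sketch of the decision logic such a filter might use. It is not the actual script: the function name check_submission, the 80% warning threshold, and the idea that the caller already knows the job's estimated cost and the available balance are all assumptions. A real filter would also have to look up the balance in Gold and pass the job script through from stdin to stdout.

```python
# Hypothetical threshold: the post only says "a certain percentage".
WARN_FRACTION = 0.8


def check_submission(cost, balance, warn_fraction=WARN_FRACTION):
    """Decide what a submit filter should do with one job.

    cost    -- estimated charge for the job, in credits (assumed known)
    balance -- credits currently available to the user (assumed known)

    Returns (exit_code, message). A nonzero exit code tells qsub to
    reject the job; zero lets it through, possibly with a warning.
    """
    if cost > balance:
        return 1, f"Rejected: job needs {cost} credits, only {balance} available."
    if cost >= warn_fraction * balance:
        return 0, f"Warning: job would use {cost} of {balance} available credits."
    return 0, ""


if __name__ == "__main__":
    # Illustrative values only.
    for cost in (300, 900, 1200):
        print(cost, check_submission(cost, 1000))
```

The filter would print the message to stderr and exit with the returned code.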

Once the resource manager (RM) has accepted a job, it passes the job to the scheduler, which does its own Gold interaction. The scheduler processes jobs sequentially, requesting a quote and then a reservation for each. This means job 2 in your example will fail to get past the quote stage and will be marked ineligible for execution (blocked/held), rather than failing at start time.
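To make the sequence concrete, here is a toy Python model of your 1000-credit example. It does not call Gold; the Account class and the schedule function are made up for illustration, and it assumes that a quote is checked against the balance minus outstanding reservations and that only a reservation places a hold on credits.

```python
class Account:
    """Toy stand-in for a Gold account (not the real API)."""

    def __init__(self, balance):
        self.balance = balance
        self.reserved = 0

    def available(self):
        # Credits not yet held by a reservation.
        return self.balance - self.reserved

    def quote(self, cost):
        # A quote only checks affordability; it changes nothing.
        return cost <= self.available()

    def reserve(self, cost):
        # A reservation places a hold on the credits.
        if not self.quote(cost):
            raise ValueError("insufficient funds")
        self.reserved += cost


def schedule(account, jobs):
    """Process jobs one at a time: quote first, then reserve."""
    states = {}
    for name, cost in jobs:
        if account.quote(cost):
            account.reserve(cost)
            states[name] = "eligible"
        else:
            states[name] = "held"  # blocked, but not failed
    return states


if __name__ == "__main__":
    acct = Account(1000)
    print(schedule(acct, [("job1", 600), ("job2", 600)]))
    # {'job1': 'eligible', 'job2': 'held'}
```

Because job 1's reservation is in place before job 2 is quoted, job 2 sees only 400 available credits and is held.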


Michael.


On Mar 21, 2011, at 22:32, "Wei Lin" <weilin at platform.com> wrote:

> The proposed integration will do a job quote at job submission time, a job reservation at job start time, and a job charge when the job completes. 
> After reading the Gold User Guide, I think this would leave open the possibility of a user submitting more jobs than he/she can afford, and then having the job fail at job start time.
> 
> My scenario.
> 
> A user has 1000 credits (whatever they are called). The user submits 2 jobs immediately one after the other. Each job gets a quote saying it will spend 600 credits. In the next scheduling cycle, LSF dispatches the two jobs. The 2nd one will fail because there aren't enough credits in the bank.
> 
> I am just going by what the Gold User Guide says, which in places isn't a lot. I don't think obtaining a quote changes one's balance, but a reservation does. If I'm right, getting a quote and reserving may need to be done together at submission time.

