[torqueusers] req_quejob explanation

Prakash Velayutham velayups at email.uc.edu
Mon Sep 5 12:21:52 MDT 2005


Garrick Staples wrote:

>On Sat, Sep 03, 2005 at 08:17:01PM -0400, Prakash Velayutham alleged:
>
>>Hi,
>>
>>Can anyone tell me how a new incoming job from a client gets
>>processed in Torque? I can see that it is handled by the
>>wait_request() routine, which in turn passes it to the
>>process_request() call. My understanding is that in there, the
>>request gets sorted out into different types and if it is a
>>new job, it gets the tag of PBS_BATCH_QueueJob. Then it gets
>>processed by the dispatch_request() routine which calls
>>req_quejob.
>>
>>Now as far as I can see, the job only gets added to the
>>svr_newjobs list. What is the sequence after that? When
>>exactly does the job get into QUEUED state and when does the
>>job actually get sent to MOM? And what does the close_quejob
>>code do and why do we have that?
>>
>The QueueJob request is just the first step in a multi-stage process to
>get the job in state QUEUED.  It starts with a QueueJob and ends with a
>Commit.  Notice that QueueJob registers close_quejob() to be run when
>the submitter's socket closes, and Commit unregisters it.  It is only
>after the Commit that the job is actually enqueued and is eligible to be
>run.
>
>If the submitting network socket is prematurely closed, close_quejob()
>does some cleanup work for the partially submitted job.
>
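To make the QueueJob/Commit handshake concrete, here is a minimal sketch in C. It is not Torque source: the state names, struct conn, and the handle_* functions are hypothetical, and close_quejob_sketch() only stands in for the real close_quejob(). It assumes one job per connection, which is a simplification.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical job states for this sketch (not Torque's real enum). */
enum job_state { JOB_NONE, JOB_NEW, JOB_QUEUED, JOB_DISCARDED };

/* Hypothetical per-connection record: the job being submitted plus an
 * optional callback to run if the submitter's socket closes. */
struct conn {
    enum job_state job;
    void (*on_close)(struct conn *);
};

/* Cleanup for a partially submitted job, in the spirit of close_quejob(). */
static void close_quejob_sketch(struct conn *c)
{
    if (c->job == JOB_NEW)
        c->job = JOB_DISCARDED;
}

/* QueueJob: create the job in a "new" state and register the close handler. */
static void handle_queuejob(struct conn *c)
{
    assert(c->job == JOB_NONE);
    c->job = JOB_NEW;
    c->on_close = close_quejob_sketch;
}

/* Commit: unregister the handler; only now does the job become queued.
 * Returns 0 on success, -1 if there is no new job to commit. */
static int handle_commit(struct conn *c)
{
    if (c->job != JOB_NEW)
        return -1;
    c->on_close = NULL;
    c->job = JOB_QUEUED;
    return 0;
}

/* Socket closed: run whatever handler is still registered, exactly once. */
static void handle_socket_close(struct conn *c)
{
    if (c->on_close != NULL) {
        void (*cb)(struct conn *) = c->on_close;
        c->on_close = NULL;
        cb(c);
    }
}
```

In this sketch, a socket that closes after handle_queuejob() but before handle_commit() leaves the job discarded, while a close after the commit leaves the queued job untouched because no handler is registered any more.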
>Then the scheduler retrieves the job info with a StatusJob request, does
>its own work to find nodes, issues a ModifyJob request to put the hosts
>in the job's exec_host attribute, then issues a RunJob request which
>gets to req_runjob() and svr_startjob().
>
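The scheduler side (StatusJob, then ModifyJob to set exec_host, then RunJob) can be sketched the same way. Again this is a toy model and not Torque code: struct sjob, the function names, and the 64-byte exec_host buffer are all assumptions made for illustration, and the rule that RunJob fails without an exec_host is a simplification of this sketch, not a statement about the real server.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical job states for the scheduling half of the sketch. */
enum sjob_state { SJOB_QUEUED, SJOB_RUNNING };

/* Hypothetical job record; exec_host is empty until hosts are chosen. */
struct sjob {
    enum sjob_state state;
    char exec_host[64];
};

/* StatusJob: the scheduler reads the job's current state. */
static enum sjob_state status_job(const struct sjob *j)
{
    return j->state;
}

/* ModifyJob: record the hosts the scheduler picked.
 * Returns -1 if the job is not queued or the host string does not fit. */
static int modify_job_exec_host(struct sjob *j, const char *hosts)
{
    if (j->state != SJOB_QUEUED || strlen(hosts) >= sizeof j->exec_host)
        return -1;
    strcpy(j->exec_host, hosts);
    return 0;
}

/* RunJob: in this toy model, start only a queued job that has hosts. */
static int run_job(struct sjob *j)
{
    if (j->state != SJOB_QUEUED || j->exec_host[0] == '\0')
        return -1;
    j->state = SJOB_RUNNING;
    return 0;
}

/* One scheduler pass over a single job: status, modify, run. */
static int schedule_once(struct sjob *j, const char *hosts)
{
    assert(j != NULL && hosts != NULL);
    if (status_job(j) != SJOB_QUEUED)
        return -1;
    if (modify_job_exec_host(j, hosts) != 0)
        return -1;
    return run_job(j);
}
```

Calling schedule_once() on a queued job with a host string such as "node01/0" (a made-up value) moves it to the running state; a second call on the same job fails because it is no longer queued.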
>I skipped a lot of details there, but hopefully that is enough of an
>overview to help.
>
Thanks, Garrick. That was a great help. I will try to figure out the rest.

Regards,
Prakash

