[torqueusers] req_quejob explanation
garrick at usc.edu
Sun Sep 4 15:40:05 MDT 2005
On Sat, Sep 03, 2005 at 08:17:01PM -0400, Prakash Velayutham alleged:
> Can anyone tell me how a new incoming job from a client gets
> processed in Torque? I can see that it is handled by the
> wait_request() routine which in turn passes it to
> process_request() call. My understanding is that in here, the
> request gets sorted out into different types and if it is a
> new job, it gets the tag of PBS_BATCH_QueueJob. Then it gets
> processed by the dispatch_request() routine which calls
> Now as far as I can see, the job only gets added to the
> svr_newjobs list. What is the sequence after that? When
> exactly does the job get into QUEUED state and when does the
> job actually get sent to MOM? And what does the close_quejob
> code do and why do we have that?
The QueueJob request is just the first step in a multi-stage process to
get the job in state QUEUED. It starts with a QueueJob and ends with a
Commit. Notice that QueueJob registers close_quejob() to be run when
the submitter's socket closes, and Commit unregisters it. It is only
after the Commit that the job is actually enqueued and is eligible to be
run.
If the submitting network socket is prematurely closed, close_quejob()
does some cleanup work for the partially submitted job.
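To make the QueueJob/Commit handshake concrete, here is a toy model in Python. It is only a sketch and not the pbs_server source: the Connection and Server classes, the on_close hook, and the "PARTIAL" state name are made up for illustration. Only the names close_quejob, svr_newjobs, and QUEUED come from the discussion above.

```python
# Toy model of the two-stage submission; NOT Torque source code.

class Connection:
    """Stands in for the submitter's socket (hypothetical class)."""

    def __init__(self):
        self.on_close = None  # cleanup callback registered by the server

    def close(self):
        # Run the registered cleanup, if one is still registered.
        if self.on_close is not None:
            self.on_close()


class Server:
    def __init__(self):
        self.new_jobs = {}  # partially submitted jobs (cf. svr_newjobs)
        self.queued = {}    # committed jobs, in state QUEUED

    def queue_job(self, conn, job_id):
        # First stage: remember the job and register the cleanup hook.
        self.new_jobs[job_id] = "PARTIAL"  # placeholder state name
        conn.on_close = lambda: self.close_quejob(job_id)

    def commit(self, conn, job_id):
        # Last stage: the job becomes QUEUED and the hook is unregistered.
        del self.new_jobs[job_id]
        self.queued[job_id] = "QUEUED"
        conn.on_close = None

    def close_quejob(self, job_id):
        # Discard a job whose submitter went away before Commit.
        self.new_jobs.pop(job_id, None)


server = Server()

# Normal submission: QueueJob, then Commit, then the client disconnects.
ok = Connection()
server.queue_job(ok, "1.host")
server.commit(ok, "1.host")
ok.close()

# Aborted submission: the socket closes before Commit arrives.
dropped = Connection()
server.queue_job(dropped, "2.host")
dropped.close()
```

After this runs, only job "1.host" is in server.queued, and server.new_jobs is empty because the cleanup hook removed the half-submitted job "2.host".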
Then the scheduler retrieves the job info with a StatusJob request, does
its own work to find nodes, issues a ModifyJob request to put the hosts
in the job's exec_host attribute, then issues a RunJob request which
gets to req_runjob() and svr_startjob().
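The scheduler side can be sketched the same way. Again this is a toy model under stated assumptions, not the real scheduler or the IFL API: ToyServer, schedule_once, and the first-free-node policy are hypothetical, and the jump straight to "RUNNING" glosses over everything req_runjob() and svr_startjob() actually do. It only shows the order of requests: StatusJob, ModifyJob on exec_host, then RunJob.

```python
# Toy model of one scheduling cycle; NOT the Torque scheduler.

class ToyServer:
    def __init__(self, jobs):
        self.jobs = jobs  # job_id -> dict of job attributes

    def status_job(self):
        # StatusJob: hand the scheduler a copy of the queued jobs' info.
        return {jid: dict(attrs) for jid, attrs in self.jobs.items()
                if attrs["state"] == "QUEUED"}

    def modify_job(self, job_id, name, value):
        # ModifyJob: set one attribute on the job.
        self.jobs[job_id][name] = value

    def run_job(self, job_id):
        # RunJob: refuse to start a job that has no hosts assigned.
        job = self.jobs[job_id]
        if not job.get("exec_host"):
            raise ValueError("exec_host is not set for " + job_id)
        job["state"] = "RUNNING"  # simplification of the real start-up path


def schedule_once(server, free_nodes):
    """Give each queued job the next free node (hypothetical policy)."""
    free = list(free_nodes)
    started = []
    for job_id in server.status_job():
        if not free:
            break
        node = free.pop(0)
        server.modify_job(job_id, "exec_host", node)
        server.run_job(job_id)
        started.append(job_id)
    return started


server = ToyServer({
    "1.host": {"state": "QUEUED"},
    "2.host": {"state": "QUEUED"},
})
started = schedule_once(server, ["node01"])
```

With a single free node, only "1.host" is started; "2.host" stays QUEUED until a later cycle finds a node for it.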
I skipped a lot of details there, but hopefully that is enough of an
overview to help.
Garrick Staples, Linux/HPCC Administrator
University of Southern California