[torquedev] Confused about processes again

Ken Nielson knielson at adaptivecomputing.com
Thu Feb 25 09:25:14 MST 2010


Simon Toth wrote:
> I'm still battling the processes support in Torque. I can't make heads
> or tails of it. What is the correct semantics?
>
> - when can a node accept a shared job?
> - when can a node accept an exclusive job?
> - what is the correct state of node after accepting
>   a shared job in free state?
> - what is the correct state of node after accepting
>   an exclusive job in free state?
> - what does nd_nsnshared store?
>
>   
Garrick has already pointed out that jobs are not shared or exclusive. 
That is a designation for a node.

As long as a node can accept more work, TORQUE will report the state 
of the node as free. For example, if you configure a node with np=4 you 
can run up to four processes on that node. Until the fourth process 
starts, the node will be shown as free. Once the fourth process starts, 
the state will change to job-exclusive.
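
As a rough sketch of the rule above, here is a toy model in Python. It 
is not TORQUE source code; the function name node_state and its 
arguments are made up for illustration, and it only covers the two 
states mentioned here.

```python
# Toy model of the free / job-exclusive rule described above.
# Hypothetical helper, not part of TORQUE.
def node_state(np, running):
    """Return the state shown for a node with `np` slots and
    `running` processes currently started on it."""
    if np < 1:
        raise ValueError("np must be at least 1")
    if not 0 <= running <= np:
        raise ValueError("running must be between 0 and np")
    # The node stays free until every slot is in use.
    return "job-exclusive" if running == np else "free"


if __name__ == "__main__":
    # Walk a node configured with np=4 from idle to full.
    for running in range(5):
        print(running, node_state(4, running))
```

With np=4 the loop reports free for 0 through 3 running processes and 
job-exclusive once the fourth has started.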

If in the server_priv/nodes file you designate a node as time-shared 
(e.g. node1:ts), then that node will allow as many jobs as possible to 
run on it until resources run out. However, that node will not be 
allowed to share a job with another node. That is, I cannot have part of 
a job run on the time-shared node and another part of the job run on a 
separate node. The job must run entirely on that node.
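
The placement restriction for time-shared nodes can be sketched the 
same way. Again this is only an illustration of the rule as stated 
here, not how the server implements it; placement_allowed and the node 
names are hypothetical.

```python
# Toy check for the time-shared placement rule described above.
# Hypothetical helper, not part of TORQUE.
def placement_allowed(job_nodes, timeshared_nodes):
    """Return True if a job may be spread over `job_nodes`.

    job_nodes        -- set of node names the job would run on
    timeshared_nodes -- set of node names marked :ts in the nodes file
    """
    uses_timeshared = bool(set(job_nodes) & set(timeshared_nodes))
    # A job that touches a time-shared node must fit on that one node.
    return not uses_timeshared or len(set(job_nodes)) == 1


if __name__ == "__main__":
    ts = {"node1"}
    print(placement_allowed({"node1"}, ts))           # whole job on node1
    print(placement_allowed({"node1", "node2"}, ts))  # split across nodes
    print(placement_allowed({"node2", "node3"}, ts))  # no time-shared node
```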

The nd_nsnshared element of the pbsnode structure keeps track of how 
many jobs have been allocated to this node.

Ken Nielson

