[torquedev] Confused about processes again
Ken Nielson
knielson at adaptivecomputing.com
Thu Feb 25 09:25:14 MST 2010
Simon Toth wrote:
> I'm still battling the processes support in Torque. I can't make heads
> or tails of it. What is the correct semantics?
>
> - when can a node accept a shared job?
> - when can a node accept an exclusive job?
> - what is the correct state of node after accepting
> a shared job in free state?
> - what is the correct state of node after accepting
> an exclusive job in free state?
> - what does nd_nsnshared store?
>
>
Garrick has already pointed out that jobs are not shared or exclusive.
Those designations apply to nodes.
As long as a node can accept more work, TORQUE will designate the state
of the node as free. For example, if you configure a node with np=4, you
can run up to four processes on that node. Until the fourth process
starts, the node will be shown as free. Once the fourth process starts,
the state will change to job-exclusive.
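As a rough illustration of the np=4 case, the following Python sketch models the rule described above. It is a toy model under stated assumptions, not TORQUE code: the Node class and its start_process method are hypothetical names invented here, and only the state names free and job-exclusive come from TORQUE.

```python
class Node:
    """Toy model of a node with a fixed number of process slots (np)."""

    def __init__(self, name, np):
        self.name = name
        self.np = np        # slots configured for the node, e.g. np=4
        self.running = 0    # processes currently started on the node

    @property
    def state(self):
        # The node is reported as free while at least one slot is unused.
        return "free" if self.running < self.np else "job-exclusive"

    def start_process(self):
        # Hypothetical helper: occupy one slot and return the new state.
        if self.running >= self.np:
            raise RuntimeError(f"{self.name} has no free slots")
        self.running += 1
        return self.state


node = Node("node1", np=4)
states = [node.start_process() for _ in range(4)]
print(states)  # ['free', 'free', 'free', 'job-exclusive']
```

The state flips only when the last slot is taken, which matches the behaviour described for the fourth process.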
If you designate a node as time-shared in the server_priv/nodes file
(e.g. node1:ts), then TORQUE will allow as many jobs as possible to
run on that node until resources run out. However, that node will not be
allowed to share a job with another node. That is, I cannot have part of
a job run on the time-shared node and another part of the job run on a
separate node. The job must run entirely on that node.
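The time-shared restriction can be sketched the same way. This is a minimal Python illustration that assumes the rule is exactly as stated above. The function placement_allowed and the node names node2 and node3 are hypothetical, and this is not how the TORQUE scheduler is actually implemented.

```python
def placement_allowed(job_nodes, timeshared_nodes):
    """Return True if a job's node list respects the time-shared rule.

    A job that touches a time-shared node must run entirely on that node.
    """
    uses_ts = any(n in timeshared_nodes for n in job_nodes)
    return not uses_ts or len(set(job_nodes)) == 1


timeshared = {"node1"}  # corresponds to node1:ts in server_priv/nodes

print(placement_allowed(["node1"], timeshared))           # True
print(placement_allowed(["node1", "node2"], timeshared))  # False
print(placement_allowed(["node2", "node3"], timeshared))  # True
```

Only the second placement is rejected, because it splits one job between the time-shared node and another node.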
The nd_nsnshared element of the pbsnode structure keeps track of how
many jobs have been allocated to this node.
Ken Nielson