[torquedev] nodes, procs, tpn and ncpus
glen.beane at gmail.com
Wed Jun 9 17:52:32 MDT 2010
On Wed, Jun 9, 2010 at 6:22 PM, Ken Nielson
<knielson at adaptivecomputing.com> wrote:
> On 06/09/2010 03:48 PM, Ken Nielson wrote:
>> On 06/09/2010 03:05 PM, Glen Beane wrote:
>>> I don't believe I have anything else going on. You have been wrong
>>> about the behavior of other specs, like nodes=X:ppn=Y only allocating
>>> one node. It appears you have something going on that differs from
>>> every other TORQUE user.
>>> I can't gdb pbs_server on a production cluster to see where I am
>>> different; maybe someone else can do that on a test cluster.
>>> Anyway, the behavior you are seeing is not correct. It is either a bug
>>> in TORQUE or a problem with your setup.
>> We figured it out. The difference is I have "set server
>> resources_available.nodect = 1024" in my configuration and you do not.
>> This sets the global SvrNodeCt, which affects the behavior. Why it does
>> so is my next task.
> With this discovery I have found that TORQUE interprets nodes to be
> separate hosts.
> So -l nodes=3 <job.sh> will allocate three different hosts for the job,
> making the behavior of nodes different from procs.
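For reference, the server attribute in question can be inspected and changed with qmgr. This is a sketch of the commands implied by the exchange above; the 1024 value comes from Ken's configuration:

```shell
# Show whether the server-wide node count resource is set
qmgr -c "print server" | grep resources_available.nodect

# Ken's configuration sets a global node count, which populates
# SvrNodeCt and changes how node requests are interpreted:
qmgr -c "set server resources_available.nodect = 1024"

# To reproduce the behavior Glen describes, unset it:
qmgr -c "unset server resources_available.nodect"
```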
Now that I think about it, the resources_available.nodect behavior might
be a bug I heard about a _long_ time ago.
So without this set, you see the same behavior I did, where nodes=5
gives you one processor on 5 unique nodes? I think that is more
reasonable behavior than giving you one node! In the OpenPBS days
this would give you 5 complete nodes (exclusive use of 5 nodes), which
makes sense to me, but I don't think we can revert to that behavior
since it is incompatible with how Moab uses nodes (nodes=5 would then
mean a different number of processors in Moab and TORQUE).
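To make the contrast concrete, here is how the two resource specs look on the qsub command line. The comments reflect the behavior described in this thread (job.sh is a placeholder script):

```shell
# With nodes, TORQUE interprets the count as separate hosts:
# one processor on each of 5 unique nodes (without nodect set).
qsub -l nodes=5 job.sh

# Explicit processors per node: 2 processor slots on each of 5 hosts.
qsub -l nodes=5:ppn=2 job.sh

# With procs, the scheduler may pack the 10 processors onto
# any combination of hosts.
qsub -l procs=10 job.sh
```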