[torqueusers] Problem with nodes allocation
glen.beane at gmail.com
Thu Jul 3 05:05:18 MDT 2008
On Thu, Jul 3, 2008 at 6:56 AM, Glen Beane <glen.beane at gmail.com> wrote:
> On Thu, Jul 3, 2008 at 4:50 AM, Roger Williams <R.Williams at gns.cri.nz> wrote:
>> A reader sent me this mail reply, for which I'm grateful. Not sure if it
>> made it to the list ...
>> >I find the problem here is the confusion between "nodes" and "cpus". I
>> >understand that nodes=5 means you want 5 CPUs not 5 nodes! I always
>> >recommend using ":ppn=X", i.e. -l nodes=5:ppn=2, to get 5 x 2-CPU nodes.
>> >The $PBS_NODEFILE lists the node name of each CPU assigned, which
>> >would mean that in your case the $PBS_NODEFILE would have the first node
>> >listed five times.
>> No. I see only *one* line in $PBS_NODEFILE in the case that I (and the
>> others) have experienced. That is the symptom.
>> I have also tried the -l nodes=x:ppn=y syntax (I have 8 cores in each
>> node). In those trials I get y lines of the first node only.
>> Again, can any of the Torque developers say if the cause of this problem
>> was definitively identified and, if so, has it been fixed?
> I seem to remember this bug being introduced in 2.2.0 or thereabouts and
> fixed in a later release.
> I would stick with TORQUE 2.1.10 or 2.3.1
OK, I found reports of this in some 2.1.x versions of TORQUE as well, and I
can't find any indication that it has actually been fixed.
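
For anyone who wants to confirm they are seeing the same symptom, here is a
rough Python sketch that tallies the lines in $PBS_NODEFILE per host and
compares the result with the nodes/ppn request. The helper names
(summarize_nodefile, check_allocation) are made up for illustration and are
not part of TORQUE. It also assumes that every requested node should show up
as a distinct host with exactly ppn lines, which may not hold for every
scheduler configuration.

```python
import os
from collections import Counter


def summarize_nodefile(lines):
    """Count how many lines (assigned slots) each host has in the node file."""
    hosts = [line.strip() for line in lines if line.strip()]
    return Counter(hosts)


def check_allocation(counts, nodes, ppn):
    """Return True if there are `nodes` distinct hosts with `ppn` lines each.

    Assumption: each requested node maps to a distinct host name.
    """
    return len(counts) == nodes and all(n == ppn for n in counts.values())


if __name__ == "__main__":
    # PBS_NODEFILE is set by TORQUE inside a running job; outside a job
    # this block does nothing.
    path = os.environ.get("PBS_NODEFILE")
    if path:
        with open(path) as f:
            counts = summarize_nodefile(f)
        for host, n in sorted(counts.items()):
            print(f"{host}: {n}")
```

Run from inside a job script, a healthy nodes=5:ppn=2 allocation should show
five hosts with two lines each. The symptom described above would show only
the first host.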
What scheduler are you using? It sounds like a bug that could be in
pbs_sched. Would it be possible for you to try Maui? Can you send me your