[Mauiusers] Late Maui snapshot sums up active jobs badly

Lennart Karlsson Lennart.Karlsson at nsc.liu.se
Thu Sep 15 03:39:40 MDT 2005


Bas van der Vlies wrote:
> Lennart Karlsson wrote:
> > Using Maui version maui-3.2.6p14-snap.1125680408 and
> > Torque version 1.2.0p6, there seems to be a factor two
> > error in simple arithmetic.
> > 
> > In the demonstration below user 'mahul' should be able to run
> > his three queued one-processor jobs, because the limit is eight
> > and he is merely using four. But Maui thinks that he already
> > uses eight processors.
> > 
> > I can tell that Maui has the same factor two problem counting MAXPS,
> > although I will not clutter up this e-mail with a demonstration.
> > 
> Do these nodes have more than one CPU (Torque config np=2)?
> If so, how do you allocate the nodes? Can only one job be
> active on a node, or are nodes shared?
> I had the same factor-two problem with MAXPS in an earlier Maui
> version. My configuration is that we schedule one job per node, and
> each node has two CPUs.
> Maui sees two tasks per node, and we had a problem where MAXPS
> for qsub -I -lnodes=1 and qsub -I -lnodes=1:ppn=2 was calculated
> differently. Maybe it is the same problem you are encountering.
> There is a Maui thread about this subject.
> I have a patched Maui version that does the right calculation if we
> only schedule one job per node on SMP machines.

Thank you, but each node has only one CPU and there is no node sharing.

Maui seems to know the correct number of CPUs in use (from the 'showq'
output I gave in the demonstration):

35264                 salam    Running     5  3:04:27:09  Wed Sep 14 14:28:56

but multiplies by two when checking how much is in use (from the
'diagnose -q' output I gave in the demonstration; please note 'U: 10'):

job 35292 violates active SOFT MAXPROC limit of 8 for user salam  (R: 1, U: 10)
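
The arithmetic behind the two outputs above can be sketched as follows.
This is only an illustration in Python: the function name and the check
"used + requested > limit" are assumptions about how such a limit might
be evaluated, not Maui's actual code. The numbers are taken from the
'showq' line (5 processors) and the 'diagnose -q' line (R: 1, U: 10,
limit 8).

```python
def violates_maxproc(requested: int, used: int, limit: int) -> bool:
    """Hypothetical limit check: would starting the job exceed the limit?"""
    return used + requested > limit


LIMIT = 8            # SOFT MAXPROC limit from the 'diagnose -q' message
REQUESTED = 1        # 'R: 1' in the 'diagnose -q' message
SHOWQ_PROCS = 5      # processors shown for the running job in 'showq'
DIAGNOSE_USED = 10   # 'U: 10' in the 'diagnose -q' message

# With the usage reported by 'showq', the queued job fits under the limit.
expected_violation = violates_maxproc(REQUESTED, SHOWQ_PROCS, LIMIT)

# With the usage reported by 'diagnose -q', the same job is rejected.
reported_violation = violates_maxproc(REQUESTED, DIAGNOSE_USED, LIMIT)

# Ratio between the two usage figures; this is the factor in question.
factor = DIAGNOSE_USED / SHOWQ_PROCS

print("violation with showq usage:   ", expected_violation)
print("violation with diagnose usage:", reported_violation)
print("usage factor:", factor)
```

Under these assumptions, the job only violates the limit because the
usage figure is twice what 'showq' reports.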

-- Lennart Karlsson <Lennart.Karlsson at nsc.liu.se>
   National Supercomputer Centre in Linkoping, Sweden

