[Mauiusers] MAXNODE limit

Lennart Karlsson Lennart.Karlsson at nsc.liu.se
Thu Mar 29 09:59:20 MDT 2007


Josh,

Yes, your MAXPROC=4 configuration successfully blocks your "-l nodes=90:ppn=1"
job. I agree on that.

What I said in the Bugzilla post is that on two-processor nodes, MAXPROC=100
does not block a "-l nodes=90:ppn=1" job, although the job will allocate
90 nodes, i.e. 180 processors.

Because of that, MAXPROC is not the correct tool and I need MAXNODE
to work.

I am a little confused that you say "Moab" successfully does the blocking,
but I presume that you actually used Maui.

Does your MAXPROC=4 configuration successfully block a "-l nodes=3:ppn=1"
job when JOBNODEMATCHPOLICY is set to EXACTNODE and NODEACCESSPOLICY is
set to SINGLEJOB? For me, on two-processor nodes, it does not, and I see no way
to use MAXPROC to emulate a non-working MAXNODE.

In less technical terms, it seems like Maui does not understand how many
processors a job will allocate, until the job is running.
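
To make the arithmetic concrete, here is a rough sketch in Python. It is not
Maui source code: the function names are made up for illustration, and it
assumes the MAXPROC check compares only the requested nodes*ppn against the
limit, which is what the behaviour described above suggests.

```python
# Illustrative sketch only; these helpers are hypothetical, not Maui code.

def requested_procs(nodes, ppn):
    """Processors the job asks for ("-l nodes=N:ppn=P")."""
    return nodes * ppn

def allocated_procs(nodes, procs_per_node):
    """Processors tied up when the job gets whole nodes to itself
    (NODEACCESSPOLICY SINGLEJOB with JOBNODEMATCHPOLICY EXACTNODE)."""
    return nodes * procs_per_node

def passes_maxproc(nodes, ppn, maxproc):
    """Assumed check: only the requested processor count is compared."""
    return requested_procs(nodes, ppn) <= maxproc

# "-l nodes=90:ppn=1" with MAXPROC=100 on two-processor nodes
print(passes_maxproc(90, 1, 100))  # True: the job is not blocked
print(allocated_procs(90, 2))      # 180 processors actually occupied

# "-l nodes=3:ppn=1" with MAXPROC=4 on two-processor nodes
print(passes_maxproc(3, 1, 4))     # True: the job is not blocked
print(allocated_procs(3, 2))       # 6 processors actually occupied
```

Under that assumption the limit is passed in both cases even though the
allocation exceeds it, which is why a node-based limit is needed here.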

So please, I would like MAXNODE to work in Maui.

Best regards,
-- Lennart Karlsson <Lennart.Karlsson at nsc.liu.se>
   National Supercomputer Centre in Linkoping, Sweden
   http://www.nsc.liu.se


Joshua Butikofer wrote:
> After investigating this bug (and its alternate description in Bugzilla) it appears that you need to
> use MAXPROC instead of MAXNODE when JOBNODEMATCHPOLICY is set to EXACTNODE. (The Maui documentation
> mentions this as well @
> http://www.clusterresources.com/products/maui/docs/6.2throttlingpolicies.shtml under MAXNODE.)
> 
> The Bugzilla post mentions that you already tried MAXPROC and that specifying -l nodes=90:ppn=1
> still allows the job to run. In my tests, however, Moab successfully blocks the job with my
> MAXPROC=4 for the user/group 'josh':
> 
> PE:  90.00  StartPriority:  11001
> cannot select job 81 for partition DEFAULT (job 81 violates active HARD MAXPROC limit of 4 for user
> josh  (R: 90, U: 0)
> )
> 
> I also tried setting the policy on a QoS and it too worked as expected. Could you please send me a
> scenario to show me how the MAXPROC was failing for you? If the job succeeds in running, could you
> also send me a "checkjob -v <JOB>" output?
> 
> Thanks,
> 
> -- 
> Joshua Butikofer
> Cluster Resources, Inc.
> 
> josh at clusterresources.com
> Voice: (801) 717-3707
> Fax:   (801) 717-3738
> --------------------------
> 
> 
> Lennart Karlsson wrote:
> > Josh,
> > 
> > You wrote:
> >> I would recommend trying out the patch 19 snapshot and see if you
> >> experience any problems. We hope to get the official release out over
> >> the next few days, and this release would eradicate all known bugs. 
> > 
> > 
> > My most critical Maui bug is logged in your Bugzilla as number 141.
> > (There is also bug number 83, which looks similar.)
> > 
> > Please include it within "all known bugs", that you are fixing now! I would
> > really appreciate that.
> > 
> > The MAXNODE configuration parameter does not work.
> > 
> > It should be easy for you to repeat the problem on your systems:
> > 
> > 1/ Start with a simple Maui configuration like (I skip the
> > SERVER*/ADMIN/RMCFG/RMPOLLINTERVAL/LOG* preambles):
> > 
> > QUEUETIMEWEIGHT         10 
> > XFACTORWEIGHT           1
> > QOSWEIGHT               1
> > 
> > FSPOLICY                [NONE]
> > 
> > BACKFILLPOLICY          BESTFIT
> > NODEALLOCATIONPOLICY    LASTAVAILABLE
> > RESERVATIONPOLICY       CURRENTHIGHEST
> > RESERVATIONDEPTH        10
> > 
> > JOBPRIOACCRUALPOLICY    FULLPOLICY
> > 
> > NODEACCESSPOLICY        SINGLEJOB
> > JOBNODEMATCHPOLICY      EXACTNODE
> > 
> > QOSCFG[DEFAULT]  PRIORITY=10000  XFWEIGHT=1000 QTWEIGHT=4
> > 
> > 2/ Add MAXNODE lines for a user and the group of that user, like:
> > 
> > USERCFG[lka]    MAXNODE=5
> > GROUPCFG[nsc]   MAXNODE=5
> > 
> > 3/ Submit a lot of jobs as that user and wait until her/his jobs run on
> > a total of at least five nodes.
> > 
> > 4/ Run 'showq' and look at all the jobs of that user that should be
> > 'blocked' but are actually 'idle' (the demonstration was done on a system
> > where each node has only one processor, so here MAXNODE could be
> > replaced by MAXPROC, but most of our systems have more than one
> > processor on each node):
> > 
> >  # showq
> > ACTIVE JOBS--------------------
> > JOBNAME            USERNAME      STATE  PROC   REMAINING            STARTTIME
> > 
> > 55818                   lka    Running     5    00:05:24  Thu Feb 15 13:26:04
> > 55819                   lka    Running     1    00:06:01  Thu Feb 15 13:26:41
> > 55820                   lka    Running     1    00:06:02  Thu Feb 15 13:26:42
> > 55821                   lka    Running     1    00:06:33  Thu Feb 15 13:27:13
> > 55822                   lka    Running     1    00:06:34  Thu Feb 15 13:27:14
> > 55823                   lka    Running     1    00:06:35  Thu Feb 15 13:27:15
> > 55824                   lka    Running     1    00:06:35  Thu Feb 15 13:27:15
> > 55807               andersb    Running    20 11:08:46:33  Wed Feb 14 11:07:13
> > 
> >      8 Active Jobs      31 of   31 Processors Active (100.00%)
> > 
> > IDLE JOBS----------------------
> > JOBNAME            USERNAME      STATE  PROC     WCLIMIT            QUEUETIME
> > 
> > 55825                   lka       Idle     1     1:00:00  Thu Feb 15 13:27:15
> > 55826                   lka       Idle     1     1:00:00  Thu Feb 15 13:27:16
> > 55827                   lka       Idle     1     1:00:00  Thu Feb 15 13:27:16
> > 55828                   lka       Idle     1     1:00:00  Thu Feb 15 13:27:17
> > 55829                   lka       Idle     1     1:00:00  Thu Feb 15 13:27:17
> > 55830                   lka       Idle     1     1:00:00  Thu Feb 15 13:27:17
> > 
> > 6 Idle Jobs
> > 
> > BLOCKED JOBS----------------
> > JOBNAME            USERNAME      STATE  PROC     WCLIMIT            QUEUETIME
> > 
> > 5/ Only job number 55818 should be running; the other 'lka' jobs should
> > be 'blocked', neither 'running' nor 'idle'.
> > 
> > 
> > The demo was run with Maui version 3.2.6p19-snap.1171482917.
> > 
> > I would at least like the MAXNODE parameter to work for GROUP, QOS or
> > CLASS, but of course it would be nice to have it working also on USER,
> > please.
> > 
> > Best regards,
> > -- Lennart Karlsson <Lennart.Karlsson at nsc.liu.se>
> >    National Supercomputer Centre in Linkoping, Sweden
> >    http://www.nsc.liu.se
> > 
> > 
