[torqueusers] [Mauiusers] Regarding MAXJOB, MAXIJOB and MAXPROC

Gareth.Williams at csiro.au Gareth.Williams at csiro.au
Wed Mar 30 04:33:38 MDT 2011


> -----Original Message-----
> From: Mahmood Naderan [mailto:nt_mahmood at yahoo.com]
> Sent: Wednesday, 30 March 2011 7:48 PM
> To: Williams, Gareth (CSIRO IM&T, Docklands)
> Cc: torque cluster
> Subject: Re: [Mauiusers] Regarding MAXJOB, MAXIJOB and MAXPROC
> 
> As the original author, I have to say thanks to Gareth for the in-depth
> response.
> 
> >MAXIJOB limits how start priorities are calculated on queued jobs,
> >in particular, how many jobs are eligible to be started and
> >how priority factors based on time in the queue are calculated.
> >Jobs blocked by a MAXIJOB limit do not accrue priority
> >based on XFACTOR or QUEUETIME (and probably other related factors).
> 
> Yes, I saw that in my configuration. I set only one limit, MAXPROC=9. Assume
> the submitted jobs are uniprocessor jobs but there are many of them, so each
> user cannot exceed 9 running jobs. What I saw in reality was that if I submit
> 30 jobs, 9 are running and the remaining 21 jobs are *blocked* and not queued,
> so the XFACTOR and aging are not calculated.
> 
> I was searching for a way to fix that. Based on your explanation, I have to
> set MAXIJOB. In the above example, if I set MAXIJOB=5, then
> 
> 
> 9 are running,
> 5 are queued and XFACTOR is calculated,
> 16 are blocked.
> 
> Is that the solution? Did I understand correctly?
> 
> // Naderan *Mahmood;

I'm not sure...

I think the feature is mostly intended to limit the number of 'eligible' jobs for each user/credential when the cluster is fully occupied by a _mix_ of jobs.  It is possible/likely that if you hit the MAXPROC limit (or another MAX... limit) then all of your other jobs will be deferred and none will be considered eligible, i.e. I think MAXI... limits only apply if you are not hitting MAX... limits.  It should be easy enough to test!

- Gareth

> 
> 
> 
> ----- Original Message ----
> From: "Gareth.Williams at csiro.au" <Gareth.Williams at csiro.au>
> To: sorrillo at jlab.org; mauiusers at supercluster.org
> Sent: Tue, March 29, 2011 3:34:40 AM
> Subject: Re: [Mauiusers] Regarding MAXJOB, MAXIJOB and MAXPROC
> 
> 
> 
> > -----Original Message-----
> > From: sorrillo at jlab.org [mailto:sorrillo at jlab.org]
> > Sent: Monday, 28 March 2011 2:15 PM
> > To: mauiusers at supercluster.org
> > Subject: [Mauiusers] Regarding MAXJOB, MAXIJOB and MAXPROC
> >
> >
> > Someone asked this question regarding Maui:
> >
> > Hi,
> > 1- Is there any relation between MAXJOB and MAXIJOB? According to the manual:
> > MAXJOB: Limits the number of jobs a credential may have active (starting or
> > running) at any given time
> >
> > MAXIJOB: Idle (or queued) job limits control which jobs are eligible for
> > scheduling
> 
> In one sense, there is no relationship between the parameters. MAXJOB is
> straightforward - the number of jobs that can be run concurrently (usually per
> user but could be per queue or related to another credential).  MAXIJOB limits
> how start priorities are calculated on queued jobs, in particular, how many jobs
> are eligible to be started and how priority factors based on time in the queue
> are calculated.  Jobs blocked by a MAXIJOB limit do not accrue priority based on
> XFACTOR or QUEUETIME (and probably other related factors).
> 
> >
> > Now what does "USERCFG[default] MAXJOB=10 MAXIJOB=3" mean?
> > a. each user can *run* 10 jobs and *enqueue* 3 jobs. So the total jobs of
> > each user is 13.
> >
> > b. each user can *run* 7 jobs and *enqueue* 3 jobs. So the total jobs of
> > each user is 10.
> 
> Neither. Each user can *run* up to 10 jobs and 3 of their queued jobs will
> accrue priority and be considered for starting.  MAXIJOB is mostly (IMHO) a
> mechanism for moderating 'queue stuffing' where an individual submitting many
> jobs blocks out other users. Otherwise I'm pretty sure maui will only consider a
> limited number of jobs each cycle anyway and the top of the queue may be
> dominated by one user, with other users' jobs never getting considered.
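
As a rough illustration of that reading (a toy model, not Maui's actual scheduling logic), the split for a single user could be sketched in Python as below. The name `classify_jobs` and its parameters are made up for this example, and it assumes enough free slots exist for the user to reach MAXJOB. The follow-up reply in this thread suggests behaviour may differ once a MAX... limit is actually hit, so treat the numbers as the expected outcome to test against, not a guarantee.

```python
def classify_jobs(submitted, free_slots, max_job, max_ijob):
    """Split one user's jobs into running / eligible-idle / blocked counts.

    Toy model of "USERCFG[default] MAXJOB=.. MAXIJOB=.." as described above;
    it ignores priorities, reservations and every other credential.
    """
    running = min(submitted, max_job, free_slots)   # capped by MAXJOB
    waiting = submitted - running
    eligible = min(waiting, max_ijob)               # capped by MAXIJOB
    blocked = waiting - eligible                    # accrue no queue-time priority
    return {"running": running, "eligible": eligible, "blocked": blocked}


# MAXJOB=10 MAXIJOB=3, one user submitting 20 jobs on a mostly empty cluster
print(classify_jobs(submitted=20, free_slots=100, max_job=10, max_ijob=3))
# {'running': 10, 'eligible': 3, 'blocked': 7}
```

Note that neither option (a) nor (b) from the question appears here: the user may submit any number of jobs, and the two limits only decide which state each job ends up in.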
> 
> 
> > 2- If I have defined "USERCFG[default] MAXJOB=2 MAXIJOB=1 MAXPROC=4", and
> > a user submits jobs requesting these resources:
> > job1: -l nodes=7
> > job2: -l nodes=1:ppn=1
> > job3: -l nodes=4:ppn=2
> > job4: -l nodes=10
> >
> > what is the result?
> 
> Jobs must satisfy all limits, so the MAXPROC setting would prevent all of these
> from running except the serial job.
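
The arithmetic behind that answer can be checked with a short Python sketch. It is only an illustration: `requested_procs` is a hypothetical helper, it handles just the `nodes=N[:ppn=M]` forms used in the question, and it assumes a bare `nodes=N` request counts as N processors (one per node).

```python
import re


def requested_procs(spec):
    """Processor count for a 'nodes=N[:ppn=M]' request (ppn assumed to default to 1)."""
    match = re.fullmatch(r"nodes=(\d+)(?::ppn=(\d+))?", spec)
    if match is None:
        raise ValueError(f"unsupported resource spec: {spec!r}")
    nodes = int(match.group(1))
    ppn = int(match.group(2)) if match.group(2) else 1
    return nodes * ppn


MAXPROC = 4  # from USERCFG[default] MAXJOB=2 MAXIJOB=1 MAXPROC=4
jobs = {
    "job1": "nodes=7",
    "job2": "nodes=1:ppn=1",
    "job3": "nodes=4:ppn=2",
    "job4": "nodes=10",
}

for name, spec in jobs.items():
    procs = requested_procs(spec)
    verdict = "fits" if procs <= MAXPROC else "exceeds MAXPROC"
    print(f"{name}: {procs} procs, {verdict}")
# job1: 7 procs, exceeds MAXPROC
# job2: 1 procs, fits
# job3: 8 procs, exceeds MAXPROC
# job4: 10 procs, exceeds MAXPROC
```

Only job2 stays within MAXPROC=4 on its own, so MAXJOB=2 never becomes the binding limit for this set of jobs.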
> >
> > 3- If I have 10 chassis, each with 16 processors, I can set MAXNODE=1 to
> > ensure that a user can no longer request more than 16 processors (in other
> > words, it will implicitly define MAXPROC=16). Is that right?
> 
> Yes, this would effectively prevent the user from requesting more than 16
> processors, but it would not be the same thing as MAXPROC=16. MAXNODE=1 would
> mean they could only run on one node (maybe with multiple jobs sharing the
> node... not sure), but MAXPROC=16 would limit the total number of processors
> across all the user's jobs to 16, e.g. they could have 16 concurrently running
> serial jobs which might (partially) occupy anywhere between 1 and 16 nodes.
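
The difference between the two limits can be made concrete with a small Python sketch. This is a simplified model under stated assumptions, not Maui's accounting: `within_limits` and the node names are invented for the example, and MAXNODE is assumed to count the distinct nodes a user occupies (the reply above is itself unsure how jobs sharing a node are counted).

```python
def within_limits(placement, max_proc=None, max_node=None):
    """Check one user's running jobs against optional MAXPROC / MAXNODE caps.

    placement: list of node names, one entry per processor the user occupies.
    Assumption: MAXNODE counts distinct nodes, MAXPROC counts processors.
    """
    procs = len(placement)
    nodes = len(set(placement))
    if max_proc is not None and procs > max_proc:
        return False
    if max_node is not None and nodes > max_node:
        return False
    return True


packed = ["n01"] * 16                         # 16 serial jobs sharing one node
spread = [f"n{i:02d}" for i in range(1, 17)]  # 16 serial jobs, one per node

print(within_limits(packed, max_proc=16))  # True
print(within_limits(spread, max_proc=16))  # True
print(within_limits(packed, max_node=1))   # True
print(within_limits(spread, max_node=1))   # False
```

Both placements use 16 processors, so MAXPROC=16 accepts either, while MAXNODE=1 accepts only the packed one.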
> 
> 
> > Also I am assuming that I can aggregate (like so) for an account called
> > "tomil",
> >
> > ACCOUNTCFG[tomil] FSTARGET=0.1-  MAXPROC=16,64 MAXJOB=1,3
> >
> > to explicitly and simultaneously throttle both the maximum number of cores
> > and the jobs the account can run? We have dual quad core nodes.
> 
> The set of running tomil jobs must satisfy both conditions.  The soft limit part
> will apply if there are no other queued jobs.
> 
> Note that "dual quad core nodes" seems inconsistent with "10 chassis each has 16
> processors", but that is probably not important.
> 
> - Gareth
> > Thanks,
> >
> >
> >
> >
> 


