[torqueusers] [Mauiusers] Regarding MAXJOB, MAXIJOB and MAXPROC

Mahmood Naderan nt_mahmood at yahoo.com
Wed Mar 30 02:48:07 MDT 2011


As the original author, I have to say thanks to Gareth for the in-depth response. 

> MAXIJOB limits how start priorities are calculated on queued jobs,
> in particular, how many jobs are eligible to be started and 
> how priority factors based on time in the queue are calculated.  
> Jobs blocked by a MAXIJOB limit do not accrue priority 
> based on XFACTOR or QUEUETIME (and probably other related factors).

Yes, I saw that in my configuration. I set only one limit, MAXPROC=9. Assume the 
submitted jobs are uniprocessor jobs but the number of submitted jobs is large, so 
each user cannot run more than 9 jobs at once. What I saw in reality was that if I 
submit 30 jobs, 9 are running and the remaining 21 jobs are *blocked* and not 
queued. So XFACTOR and aging are not calculated for them.

I was searching for a way to fix that. Based on your explanation, I have to set 
MAXIJOB. In the above example, if I set MAXIJOB=5, then:


9 are running,
5 are queued and XFACTOR is calculated,
16 are blocked.

Is that the solution? Did I understand correctly?
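
The split described above can be written out as a short Python sketch. It only models the interpretation in this message (uniprocessor jobs, enough free processors, running jobs capped by MAXPROC, idle jobs capped by MAXIJOB, everything else blocked). It is not Maui's actual scheduling logic, and `split_jobs` is a hypothetical helper name.

```python
def split_jobs(submitted, maxproc, maxijob, procs_per_job=1):
    """Split one user's submitted jobs into (running, idle, blocked) counts.

    Hypothetical helper: a toy model of the interpretation in this message,
    assuming the cluster has enough free processors. Not Maui's real logic.
    """
    # MAXPROC caps the processors in use, hence the number of running jobs.
    running = min(submitted, maxproc // procs_per_job)
    # MAXIJOB caps how many of the remaining jobs count as idle (eligible).
    idle = min(submitted - running, maxijob)
    # Whatever is left is blocked and accrues no queue-time priority.
    blocked = submitted - running - idle
    return running, idle, blocked


print(split_jobs(30, maxproc=9, maxijob=5))  # (9, 5, 16)
```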
 
// Naderan *Mahmood;



----- Original Message ----
From: "Gareth.Williams at csiro.au" <Gareth.Williams at csiro.au>
To: sorrillo at jlab.org; mauiusers at supercluster.org
Sent: Tue, March 29, 2011 3:34:40 AM
Subject: Re: [Mauiusers] Regarding MAXJOB, MAXIJOB and MAXPROC



> -----Original Message-----
> From: sorrillo at jlab.org [mailto:sorrillo at jlab.org]
> Sent: Monday, 28 March 2011 2:15 PM
> To: mauiusers at supercluster.org
> Subject: [Mauiusers] Regarding MAXJOB, MAXIJOB and MAXPROC
> 
> 
> Someone asked this question regarding Maui:
> 
> Hi,
> 1- Is there any relation between MAXJOB and MAXIJOB? According to manual:
> MAXJOB: Limits the number of jobs a credential may have active (starting or
> running) at any given time
> 
> MAXIJOB: Idle (or queued) job limits control which jobs are eligible for
> scheduling

In one sense, there is no relationship between the parameters. MAXJOB is 
straightforward - the number of jobs that can be run concurrently (usually per 
user but could be per queue or related to another credential).  MAXIJOB limits 
how start priorities are calculated on queued jobs, in particular, how many jobs 
are eligible to be started and how priority factors based on time in the queue 
are calculated.  Jobs blocked by a MAXIJOB limit do not accrue priority based on 
XFACTOR or QUEUETIME (and probably other related factors).

> 
> Now what does "USERCFG[default] MAXJOB=10 MAXIJOB=3" mean?
> a. each user can *run* 10 jobs and *enqueue* 3 jobs. So the total jobs of each
> user is 13.
>
> b. each user can *run* 7 jobs and *enqueue* 3 jobs. So the total jobs of each
> user is 10.

Neither. Each user can *run* up to 10 jobs, and 3 of their queued jobs will 
accrue priority and be considered for starting.  MAXIJOB is mostly (IMHO) a 
mechanism for moderating 'queue stuffing', where an individual submitting many 
jobs blocks out other users. Otherwise I'm pretty sure Maui will only consider a 
limited number of jobs each cycle anyway, and the top of the queue may be 
dominated by one user, with other users' jobs never getting considered.
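
The 'queue stuffing' point can be illustrated with a toy Python model. It assumes the idle queue is simply walked in submission order and that a per-user MAXIJOB cap decides which jobs stay eligible. The function name `eligible_jobs` and the users `alice` and `bob` are made up for the illustration. This is a sketch of the idea, not how Maui is implemented.

```python
from collections import defaultdict


def eligible_jobs(idle_queue, maxijob):
    """Return the idle jobs that stay eligible under a per-user MAXIJOB cap.

    idle_queue: list of (user, job_id) pairs in submission order.
    Hypothetical helper; a toy model, not Maui's implementation.
    """
    eligible_per_user = defaultdict(int)
    eligible = []
    for user, job_id in idle_queue:
        if eligible_per_user[user] < maxijob:
            eligible_per_user[user] += 1
            eligible.append((user, job_id))
        # Jobs beyond the cap are treated as blocked and skipped.
    return eligible


# alice submits 8 jobs before bob submits 2.
queue = [("alice", f"a{i}") for i in range(8)] + [("bob", "b0"), ("bob", "b1")]
print(eligible_jobs(queue, maxijob=3))
# [('alice', 'a0'), ('alice', 'a1'), ('alice', 'a2'), ('bob', 'b0'), ('bob', 'b1')]
```

With the cap in place, only three of alice's jobs compete for priority, so bob's jobs are not pushed out of the set the scheduler looks at.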


> 2- If I have defined "USERCFG[default] MAXJOB=2 MAXIJOB=1 MAXPROC=4", and a
> user submits jobs requesting these resources:
> job1: -l nodes=7
> job2: -l nodes=1:ppn=1
> job3: -l nodes=4:ppn=2
> job4: -l nodes=10
> 
> what is the result?

Jobs must satisfy all limits, so the MAXPROC setting would prevent all of these 
from running except the serial job.
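
The arithmetic behind that answer can be checked with a small Python sketch. It assumes the usual TORQUE convention that `ppn` defaults to 1 when omitted, so a job requests nodes times ppn processors. `requested_procs` is a hypothetical helper that only understands the simple `nodes=N[:ppn=M]` form used in the question.

```python
import re


def requested_procs(resource_spec):
    """Processors requested by a 'nodes=N[:ppn=M]' resource string.

    Hypothetical helper; assumes ppn defaults to 1 when it is omitted.
    """
    match = re.fullmatch(r"nodes=(\d+)(?::ppn=(\d+))?", resource_spec)
    if match is None:
        raise ValueError(f"unsupported resource spec: {resource_spec!r}")
    nodes = int(match.group(1))
    ppn = int(match.group(2) or 1)
    return nodes * ppn


MAXPROC = 4
jobs = {
    "job1": "nodes=7",
    "job2": "nodes=1:ppn=1",
    "job3": "nodes=4:ppn=2",
    "job4": "nodes=10",
}
for name, spec in jobs.items():
    procs = requested_procs(spec)
    verdict = "fits" if procs <= MAXPROC else "exceeds MAXPROC"
    print(name, procs, verdict)
```

Only job2 (1 processor) fits under MAXPROC=4. The others request 7, 8 and 10 processors.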
> 
> 3- If I have 10 chassis each has 16 processors, I can set MAXNODE=1 to ensure
> that a user can no longer request more than 16 processors (in other words, it
> will implicitly define MAXPROC=16). Is that right?

Yes, this would effectively prevent the user from requesting more than 16 
processors, but it would not be the same thing as MAXPROC=16. MAXNODE=1 would 
mean they could only run on one node (maybe with multiple jobs sharing the 
node... not sure), but MAXPROC=16 would limit the total number of processors 
across all the user's jobs to 16, e.g. they could have 16 concurrently running 
serial jobs which might (partially) occupy anywhere between 1 and 16 nodes.
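
The difference between the two limits can be sketched in Python. The model assumes MAXNODE counts the distinct nodes a user's running jobs occupy. As noted above, that is not certain, so treat it as an assumption. The helper names and node names are hypothetical.

```python
def usage(placements):
    """Summarise a user's running jobs as (total_procs, distinct_nodes).

    placements: list of (node_name, procs) pairs, one per running job.
    """
    total_procs = sum(procs for _, procs in placements)
    distinct_nodes = len({node for node, _ in placements})
    return total_procs, distinct_nodes


def within_limits(placements, maxproc=None, maxnode=None):
    """Check usage against optional limits (assumed semantics, not Maui code)."""
    total_procs, distinct_nodes = usage(placements)
    if maxproc is not None and total_procs > maxproc:
        return False
    if maxnode is not None and distinct_nodes > maxnode:
        return False
    return True


# 16 serial jobs spread over 16 different nodes.
spread = [(f"node{i:02d}", 1) for i in range(16)]
print(within_limits(spread, maxproc=16))  # True
print(within_limits(spread, maxnode=1))   # False

# The same 16 serial jobs packed onto a single 16-processor node.
packed = [("node00", 1)] * 16
print(within_limits(packed, maxnode=1))   # True
```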


> Also I am assuming that I can aggregate (like so) for an account called "tomil",
> 
> ACCOUNTCFG[tomil] FSTARGET=0.1-  MAXPROC=16,64 MAXJOB=1,3
> 
> to explicitly and simultaneously throttle both the maximum number of cores
> and the jobs the account can run? We have dual quad core nodes.

The set of running tomil jobs must satisfy both conditions.  The soft limit part 
will apply if there are no other queued jobs.

Note that "dual quad core nodes" seems inconsistent with "10 chassis each has 16 
processors", but that is probably not important.

- Gareth
> Thanks,
> 
> 
> 
> 



