[Mauiusers] Backfill and node reservation
Arnau Bria
arnaubria at pic.es
Mon Nov 15 09:23:13 MST 2010
On Mon, 15 Nov 2010 13:57:57 -0200
Denis Denis wrote:
Hi,
> > > Could you send your maui.cfg?
> > Sure (I've added a couple of notes between the lines).
> >
> >
> > SERVERHOST NAME
> > ADMIN1 root
> > ADMIN3 edginfo rgma edguser monami
> > ADMINHOST NAME
> > RMCFG[base] TYPE=PBS TIMEOUT=30
> > SERVERPORT 40559
> > SERVERMODE NORMAL
> >
> > RMPOLLINTERVAL 00:02:00
> > LOGFILE /var/log/maui.log
> > LOGFILEMAXSIZE 50000000
> >
> > IDLEJOBDEPTH 300
> > #This comes from a patch
> > #http://www.supercluster.org/pipermail/mauiusers/2009-February/003746.html
> >
> >
> >
> > BACKFILLPOLICY NONE
> > BACKFILLDEPTH 1
> > LOGLEVEL 1
> >
> > LOGFILEROLLDEPTH 50
> >
> > ENABLENEGJOBPRIORITY true
> > REJECTNEGPRIOJOBS false
> >
> > QUEUETIMEWEIGHT 0
> >
> > XFACTORWEIGHT 0
> >
> >
> > CREDWEIGHT 1
> > GROUPWEIGHT 1
> > USERWEIGHT 1
> > CLASSWEIGHT 1
> >
> > NODEALLOCATIONPOLICY CPULOAD
> >
> > DEFERTIME 00:00:00
> >
> > CLASSCFG[long] MAXPROC=100
> > CLASSCFG[medium] MAXPROC=100
> > GROUPCFG[dteam] MAXPROC=40 PRIORITY=10
> > GROUPCFG[dtsgm] MAXPROC=2 PRIORITY=100000
> > GROUPCFG[dtprd] MAXPROC=20 PRIORITY=100000
> > GROUPCFG[ops] MAXPROC=20 PRIORITY=100000
> > GROUPCFG[pilotops] MAXPROC=20 PRIORITY=100000
> > USERCFG[arnaubria] PRIORITY=1000
> >
> > SRCFG[picsgm_64] GROUPLIST=atsgm,sgmcm,lhsgm,masgm,ctasgm,dtsgm,misgm,pasgm,picvosgm,sgmibergrid
> > SRCFG[picsgm_64] RESOURCES=PROCS:4
> > SRCFG[picsgm_64] PRIORITY=1000
> > SRCFG[picsgm_64] HOSTLIST=tditaller021
> > SRCFG[picsgm_64] STARTTIME=0:00:00 ENDTIME=24:00:00
> > SRCFG[picsgm_64] PERIOD=INFINITY
> >
> > FSWEIGHT 1
> > FSUSERWEIGHT 2
> > FSGROUPWEIGHT 10
> > FSQOSWEIGHT 100
> >
> > FSDEPTH 4
> > FSINTERVAL 12:00:00
> > FSDECAY 0.5
> > FSPOLICY DEDICATEDPS%
> >
> >
> >
> > GROUPCFG[masgm] FSTARGET=10 QDEF=magic MAXPROC=2
> > GROUPCFG[maprd] FSTARGET=10 QDEF=magic
> > GROUPCFG[magic] FSTARGET=10 QDEF=magic
> > QOSCFG[magic] FSTARGET=5.79
> > [....]
> >
> > OTHER QOS CONF
> > [...]
> >
> >
> what does a diagnose -p report?
I don't have any jobs running and my test nodes are down (except for my
torque-test server).
But I can tell you that my jobs were at the top (I'm the arnaubria user, so my
prio is 100000...).
> Is it possible that the jobs which are running before your highest
> priority job are not being backfilled but having a higher priority
> instead due to the weights of the other metrics?
> I see that the CREDWEIGHT is set to 1 while QOS for example is set to
> 100.
No, that's impossible.
Other users' priorities are based on FS. They range from negative values
up to a prio of 200; I've never seen a prio higher than that.
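
To illustrate the comparison, here is a rough Python sketch of how the
credential and fairshare components add up under the weights in the
maui.cfg quoted above. It assumes the usual Maui scheme in which each
component is its component weight times the weighted sum of its
subcomponents. The function names and the fairshare deltas (target minus
actual usage, in percentage points) are made up for the example; only the
weights and the USERCFG priority come from the config.

```python
# Weights copied from the quoted maui.cfg.
CREDWEIGHT = 1
USERWEIGHT = 1
GROUPWEIGHT = 1
CLASSWEIGHT = 1
FSWEIGHT = 1
FSUSERWEIGHT = 2
FSGROUPWEIGHT = 10
FSQOSWEIGHT = 100


def cred_component(user_prio=0, group_prio=0, class_prio=0):
    """Credential component: CREDWEIGHT * weighted sum of credential priorities."""
    return CREDWEIGHT * (
        USERWEIGHT * user_prio
        + GROUPWEIGHT * group_prio
        + CLASSWEIGHT * class_prio
    )


def fs_component(user_delta=0.0, group_delta=0.0, qos_delta=0.0):
    """Fairshare component: FSWEIGHT * weighted sum of (target - usage) deltas."""
    return FSWEIGHT * (
        FSUSERWEIGHT * user_delta
        + FSGROUPWEIGHT * group_delta
        + FSQOSWEIGHT * qos_delta
    )


# USERCFG[arnaubria] PRIORITY=1000 as listed in the config.
admin_job = cred_component(user_prio=1000)

# Hypothetical job with no credential priority, 2 points under its QOS target.
fs_only_job = fs_component(qos_delta=2.0)

print(admin_job, fs_only_job)
```

With these assumed deltas, a job that only gets fairshare priority stays
well below one that carries a credential priority in the thousands, which
matches the prio range described above.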
> Also there are some groups with priority really high ( 100000)
Those are very special groups (not the ones causing problems) and
myself.
Let me try to quickly reproduce a case in my prod cluster.
I'll come back in a few.
Cheers and thanks for your replies,
Arnau