[Mauiusers] Maui assigns too many resources

Roy Dragseth roy.dragseth at cc.uit.no
Thu Sep 8 10:06:11 MDT 2011


On Thursday 8. September 2011 17.59.52 Jim Kusznir wrote:
> This isn't quite the problem.  The problem is that even though a user
> requests 1 node, 1 PPN, and torque shows it as such, maui (through
> showq) shows this as needing 2 processors per node, and thereby has
> allocated 100% of the cluster's resources.  Even torque output shows
> that more resources have been assigned than the job requested (e.g.,
> "the scheduler messed up").
> 
> This only happens on this one user's jobs.  Restarting maui causes it
> to realize these jobs only needed one processor, and appropriately
> schedules the remaining jobs.
> 
> --Jim
> 
> On Thu, Sep 8, 2011 at 7:32 AM, Gus Correa <gus at ldeo.columbia.edu> wrote:
> > Jim Kusznir wrote:
> >> Hi all:
> >> 
> >> I've got a user who's creating a bunch of single-threaded jobs via
> >> script (about 250 at a shot).  All are specified (in torque) as -l
> >> nodes=1:ppn=1.  However, half of his jobs end up queued rather than
> >> running (he sizes his job to take the entire cluster).  When I look
> >> into why, checkjob shows that the resources allocated (2) exceeds
> >> requested (1), and showq shows that it assigned 2 cores per job, yet
> >> torque can't show that anywhere.  To fix, I restart maui, and it
> >> correctly sees that each job should only be 1 core and starts the rest
> >> of the jobs that were queued.  When jobs are in queue, showq shows
> >> them as requiring only one processor.
> >> 
> >> How can I fix this permanently?
> >> 
> >> maui 3.2.6p19 (as installed on a rocks cluster from the torque+maui
> >> roll, rocks 5.1)
> >> torque-2.3.0
> >> 
> >> Thanks!
> >> --Jim
> > 
> > Hi Jim
> > 
> > Some guesses:
> > 
> > Look at your JOBNODEMATCHPOLICY in ${MAUI}/maui.cfg.
> > To pack multiple jobs on a node you could choose it to be EXACTPROC.
> > http://www.adaptivecomputing.com/resources/docs/maui/a.fparameters.php
> > 
> > Another thing to look at is DEFERTIME.
> > The default is 1 hour.
> > You could set it to less.
> > For instance, if you want it to be one minute, add this line:
> > DEFERTIME 00:01:00
> > to your ${MAUI}/maui.cfg file and restart maui.
> > http://www.adaptivecomputing.com/resources/docs/maui/a.fparameters.php
> > 
> > I hope this helps,
> > Gus Correa
> 

Strange, I haven't seen this before, even on a release as old as Rocks 5.1.
Could you post the output of qstat -f JOBID and checkjob JOBID?
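
To compare what was requested with what was allocated, a rough Python
sketch like the one below could be run over the saved qstat -f text. It
assumes the usual torque field names (Resource_List.nodes, and exec_host
as a "+"-separated list of node/slot entries). The function name and the
SAMPLE text are made up for illustration and are not output from the
cluster in question.

```python
import re


def requested_vs_allocated(qstat_text):
    """Return (requested_procs, allocated_slots) parsed from `qstat -f` text."""
    # qstat -f wraps long values onto continuation lines that start with a tab.
    text = re.sub(r"\n\t", "", qstat_text)

    # Collect "key = value" attribute lines into a dict.
    fields = {}
    for line in text.splitlines():
        if " = " in line:
            key, value = line.split(" = ", 1)
            fields[key.strip()] = value.strip()

    # Resource_List.nodes looks like "1:ppn=1" or "2:ppn=4+1:ppn=2".
    requested = 0
    for chunk in fields.get("Resource_List.nodes", "").split("+"):
        if not chunk:
            continue
        parts = chunk.split(":")
        # A leading number is a node count; a hostname counts as one node.
        count = int(parts[0]) if parts[0].isdigit() else 1
        ppn = 1
        for part in parts[1:]:
            if part.startswith("ppn="):
                ppn = int(part[len("ppn="):])
        requested += count * ppn

    # exec_host looks like "node-a/0+node-a/1": one entry per allocated slot.
    exec_host = fields.get("exec_host", "")
    allocated = len([slot for slot in exec_host.split("+") if slot])
    return requested, allocated


# Hypothetical sample that mimics the reported symptom (not real output).
SAMPLE = """Job Id: 1234.example
    Job_Name = test
    exec_host = compute-0-1/0+compute-0-1/1
    Resource_List.nodes = 1:ppn=1
"""

if __name__ == "__main__":
    print(requested_vs_allocated(SAMPLE))
```

If the second number is larger than the first for a job, that job is
holding more slots than it asked for.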

r.
-- 

  The Computer Center, University of Tromsø, N-9037 TROMSØ Norway.
	      phone:+47 77 64 41 07, fax:+47 77 64 41 00
        Roy Dragseth, Team Leader, High Performance Computing
	 Direct call: +47 77 64 62 56. email: roy.dragseth at uit.no

