[Mauiusers] Large queues cause Maui to idle

Garrick Staples garrick at usc.edu
Wed Dec 10 16:12:30 MST 2008


Funny that this came up today.  I've got a user who submitted 17 thousand
jobs this morning.  The jobs have been happily dripping through the routing
queue into the execution queue.

On Wed, Dec 10, 2008 at 05:45:03PM -0500, Steve Young alleged:
> Actually, I think thanks should go to Garrick ... he's the one who got
> me steered in the right direction =).
> 
> -Steve
> 
> On Dec 10, 2008, at 5:36 PM, Nicholas Geraedts wrote:
> 
> >Thank you Steve! Your solution is exactly what we're looking for and  
> >seems to work quite well.
> >
> >Cheers,
> >-Nick
> >
> >
> >On Wed, Dec 10, 2008 at 12:57 PM, Steve Young <chemadm at hamilton.edu>  
> >wrote:
> >I've used a routing queue to solve this problem. The queue that the
> >user is running on can only utilize 32 CPUs. The thousands of jobs
> >are 1 CPU each. So I have this for a routing queue:
> >
> >create queue physics
> >set queue physics queue_type = Route
> >set queue physics acl_group_enable = True
> >set queue physics route_destinations += herc
> >set queue physics enabled = True
> >set queue physics started = True
> >
> >So jobs that go into here are moved to the herc execution queue.  
> >This queue has the following setting:
> >
> >set queue herc max_queuable = 36
> >
> >This way only 36 jobs at a time can be queued from the routing queue,
> >so maui doesn't have to consider each of the thousands of jobs on
> >every iteration. It only has to worry about scheduling the jobs for
> >the resources it has to run on.
> >
> >I also use MAXIJOB in maui:
> >
> >CLASSCFG[herc]          QLIST=md QDEF=md MAXIJOB=4
> >
> >This way, even if a user has lots of jobs in the queue, only their top
> >4 idle jobs will get considered for scheduling. Others will be able to
> >get their jobs to run without having to wait for maui to process
> >thousands of jobs that can't run yet anyhow.
> >
> >I hope this helps.
> >
> >-Steve
> >
> >
> >
> >
> >
> >On Dec 10, 2008, at 3:38 PM, Nicholas Geraedts wrote:
> >
> >Thanks Troy and Halvor. You were both correct about the MMAX_JOB  
> >definition in the .h file. I've increased it and asked the user to  
> >try to break the system again. I'll let you know how things go.
> >
> >Cheers,
> >-Nick
> >
> >_______________________________________________
> >mauiusers mailing list
> >mauiusers at supercluster.org
> >http://www.supercluster.org/mailman/listinfo/mauiusers
> >
> >
> 
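
For anyone who wants to see why the max_queuable cap quoted above keeps the
scheduler's workload small, here is a rough toy model in Python. It is only
a sketch, not how pbs_server or maui are actually implemented. The function
name, the one-tick job runtime, and the "a job leaves the execution queue as
soon as it starts" rule are all assumptions made up for illustration; the
32 CPUs, the limit of 36, and the 17 thousand jobs come from the thread.

```python
from collections import deque


def simulate(total_jobs, max_queuable=36, cpus=32):
    """Toy model of a routing queue feeding a capped execution queue.

    Assumptions (not taken from TORQUE or Maui internals): every job
    needs 1 CPU, runs for exactly one tick, and stops counting against
    the execution queue the moment it is started.
    """
    route = deque(range(total_jobs))  # jobs held back in the routing queue
    execution = []                    # jobs the scheduler can actually see
    peak_visible = 0                  # most jobs the scheduler had to consider
    ticks = 0
    finished = 0

    while route or execution:
        # Jobs move across only while the execution queue has room.
        while route and len(execution) < max_queuable:
            execution.append(route.popleft())
        peak_visible = max(peak_visible, len(execution))

        # Start as many 1-CPU jobs as there are CPUs; they finish this tick.
        started = execution[:cpus]
        execution = execution[cpus:]
        finished += len(started)
        ticks += 1

    return peak_visible, ticks, finished


if __name__ == "__main__":
    capped = simulate(17000)                        # with max_queuable = 36
    uncapped = simulate(17000, max_queuable=17000)  # everything lands at once
    print("capped:   peak jobs considered =", capped[0])
    print("uncapped: peak jobs considered =", uncapped[0])
```

In this model the capped run never shows the scheduler more than 36 jobs
per iteration, while the uncapped run shows it all 17000 on the first pass.
Both runs finish in the same number of ticks, since the 32 CPUs are the
bottleneck either way.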



-- 
Garrick Staples, GNU/Linux HPCC SysAdmin
University of Southern California

See the Dishonor Roll at http://www.californiansagainsthate.com/
