[Mauiusers] torque, maui out of sync

Steve Traylen s.traylen at rl.ac.uk
Mon Nov 21 08:54:42 MST 2005


On Sat, Nov 19, 2005 at 02:05:14PM -0800 or thereabouts, Garrick Staples wrote:
> On Fri, Nov 18, 2005 at 09:05:05AM -0700, Austin Godber alleged:
> > Chris Samuel wrote:
> > >>Any idea why maui wouldn't recognize a queue defined and enabled in 
> > >>torque?
> > >
> > >
> > ><clutching_at_straws>
> > >
> > >Have you restarted Maui since you created them ?
> > >
> > ></clutching_at_straws>
> > 
> > Yes, I did.  But that raises a good question.  For every change I make 
> > in torque, do I have to restart maui?  Maui doesn't get updated if I add 
> > a queue or just change a single setting.  It looks like maui doesn't see 
> > the change.  Is there a way to force it to do so, without completely 
> > restarting maui?
> 
> Maui _should_ also see changes every scheduling interval.  Anything else
> is a bug and should be fixed.

I've always assumed a restart of maui was needed when 

  1. Deleting or adding nodes.
  2. Modifying a queue parameter such as default walltime.

A month or so ago we deleted about 30 nodes from pbs and forgot
to restart maui.

This resulted in a complete mess: some batch workers ended up
running extra jobs that the pbs_server had no record of, and
the pbs_server was convinced that jobs were running on nodes
where they had never even appeared in the pbs_mom logs.

I had assumed this was expected behaviour and that the mistake
was ours for not restarting maui. Your comments suggest that this
was not meant to be the case.

Certainly if you delete a node in pbs it still appears in
`diagnose -n`.
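
A rough way to check for this kind of drift is to compare the node
names each side knows about. The following is only a sketch: the
function names are made up for illustration, and it assumes that
`pbsnodes -a` prints each node name on an unindented line with the
attributes indented below it. Getting the names out of `diagnose -n`
is left out, since the exact layout may vary between maui versions.

```python
def pbs_node_names(pbsnodes_text):
    """Collect node names from `pbsnodes -a` style text.

    Assumption: a node name is any non-empty line that does not start
    with whitespace; indented lines are attributes of that node.
    """
    return {
        line.strip()
        for line in pbsnodes_text.splitlines()
        if line.strip() and not line[0].isspace()
    }


def stale_nodes(maui_names, pbs_names):
    """Return, sorted, the names maui lists that pbs_server does not."""
    return sorted(set(maui_names) - set(pbs_names))
```

Anything `stale_nodes` returns is a node that maui still lists but
pbs_server no longer has, which would hint that maui has not picked
up the deletion.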

  Steve



> 
>  
> > I believe the problem I had originally was somehow related to my use of
> > 	CLASSCFG[classname]	MAXPROC=80
> > I believe that the queues that weren't working didn't have CLASSCFG 
> > statements in my maui.cfg.  Does that make sense?
> 
> Yes, I always make sure my execution queues have CLASSCFG lines, just to
> make sure they are defined in maui before jobs start flowing in.  I
> think there's a bug or two floating in there that I haven't explored
> yet.
> 
>  
> > Actually, why doesn't showconfig show settings made in maui.cfg?  It 
> > clearly honors them because I can see when jobs hit soft and hard 
> > limits.  But showconfig only ever shows settings it picks up from torque.
> 
> showconfig hardly shows anything that is actually valid for CLASSCFG.
> QOS flags, job flags, etc. are all missing from the output.  'diagnose
> -c' does a better job.
> 
>  
> > I have fixed the queues I first mentioned by completely removing the 
> > CLASSCFG settings from my maui.cfg but I still have a queue that maui 
> > doesn't see.
> > 
> > and a diagnose -c does not show the queue at all.  But for jobs 
> > submitted to this queue the maui.log says:
> 
> I think you have hit the default limit on the number of classes.  See
> MAX_MCLASS in include/msched-common.h and
> http://www.clusterresources.com/products/maui/docs/a.ddevelopment.shtml
> 
> 
> -- 
> Garrick Staples, Linux/HPCC Administrator
> University of Southern California





-- 
Steve Traylen
s.traylen at rl.ac.uk
http://www.gridpp.ac.uk/

