Bill Wichser bill at Princeton.EDU
Wed Nov 24 14:51:27 MST 2010

I seem to be having deja vu all over again.

The environment is torque/maui.  There are a number of classes (queues) 
set up, each with a limit for MAXPROCS.  Let's take a class "A" and set a 
maxprocs limit of 100.  The queue is full and new jobs block as they now 
exceed this limit.

So far so good.

Further, there is a 50 core job in there waiting with what would be the 
highest priority if it were in the idle state.  But it remains blocked.  
Other jobs are also blocked but only want 8 cores.  As jobs complete, 
let's say they are 10 core and 8 core jobs, only these blocked 8 core 
jobs ever get scheduled.  The 50 core job always remains blocked because 
resources are never available.
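
To make the starvation concrete, here is a rough toy model in Python.  It 
is only a sketch of the behavior I am describing, not Maui's actual 
scheduling code: the function name, the one-completion-per-step loop and 
the steady stream of incoming 8 core jobs are all assumptions made for 
illustration.

```python
from collections import deque

def simulate(limit, running, blocked, steps, arrival_cores=8):
    """Toy model of a per-class core limit (hypothetical, not Maui code).

    running: core counts of jobs currently running in the class
    blocked: (name, cores) tuples in priority order, highest first
    Each step the oldest running job completes, one new small job
    arrives, and every blocked job that fits under the limit starts.
    """
    running = deque(running)
    blocked = list(blocked)
    used = sum(running)
    started = []
    for step in range(steps):
        if running:
            used -= running.popleft()      # a running job completes
        blocked.append((f"new{step}", arrival_cores))  # assumed arrival
        still_blocked = []
        for name, cores in blocked:        # walk in priority order
            if used + cores <= limit:      # fits under the class limit
                running.append(cores)
                used += cores
                started.append(name)
            else:                          # stays blocked this round
                still_blocked.append((name, cores))
        blocked = still_blocked
    return started, [name for name, _ in blocked]

# Class "A" with a 100 core limit, full, with a 50 core job at top priority.
started, waiting = simulate(
    limit=100,
    running=[10, 10] + [8] * 10,
    blocked=[("big", 50)] + [(f"small{i}", 8) for i in range(5)],
    steps=50,
)
print("big started:", "big" in started)
print("head of blocked list:", waiting[0])
```

In this model the class never has fewer than 88 cores in use, so the 50 
core job never fits, while an 8 core job slips in at every step.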

Yes, one could use JOBAGGREGATIONTIME to try and wait for more to 
finish in a timely manner but when this takes maybe 16 hours, it just 
isn't feasible.  Basically I have a blocked job which gets starved and 
no way to get it scheduled with the current mix of jobs coming in and 
moving through without manually disabling the "started" attribute in qmgr or 
placing jobs on hold until resources become available.

Is there some way to handle this that I just don't know about or haven't 
figured out?  I hope that I'm just missing something simple.

Thanks.  And for all the USofA folks, Happy Thanksgiving.


