[Mauiusers] Routing Queues (was Re Large queues cause Maui to idle)

Tom Rudwick tomr at intrinsity.com
Tue Dec 16 17:06:32 MST 2008


I'm not sure if this breaks anything, but did you really mean to do this?

set queue sroute route_destinations = serial
set queue sroute route_destinations += serial
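
If a single destination is what was intended, a minimal sketch of the routing
queue definition might look like the fragment below. Only the
route_destinations line comes from the configuration quoted above; the
queue_type, enabled, and started lines are assumed typical settings and may
not match the actual setup.

```
# Hypothetical routing-queue definition (qmgr input).
# Only the route_destinations line appears in this thread;
# the other attributes are assumptions for illustration.
create queue sroute
set queue sroute queue_type = Route
set queue sroute route_destinations = serial
set queue sroute enabled = True
set queue sroute started = True
```

The "+=" form is only needed when adding a second, different destination to
the list, so repeating it with the same queue name should be unnecessary.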

Tom


Steve Young wrote:
> Hmm, strange. I'm not really sure what else to look at; it looks like
> you have everything set properly. Does anyone else have any ideas?
>
> -Steve
>
> On Dec 16, 2008, at 11:44 AM, Michael Galloway wrote:
>
>> On Tue, Dec 16, 2008 at 11:03:46AM -0500, Steve Young wrote:
>>> Hmm so to clarify.... if you submit to the serial queue the jobs run as
>>> expected. It's only when they come from the routing queue that they get
>>> placed on hold by maui?
>>
>> yes, that is the behavior i am seeing.
>>
>>>
>>> I'm still wondering about this:
>>>
>>> Flags:       HOSTLIST RESTARTABLE
>>> HostList:
>>> [c0-70:1]
>>>
>>> The checkjob output you posted shows the above. I don't see that on my
>>> jobs unless someone puts something like #PBS -l host=<hostname> in their
>>> batch files. The job wants to go to c0-70... does this node have a feature
>>> of serial in the server_priv/nodes file? If not, then that would explain
>>> why it can't run the job.
>>>
>>
>> the way i did the queues is i added properties to the nodes in the nodes file:
>>
>> c0-59 np=4 serial
>> c0-60 np=4 serial
>> c0-61 np=4 serial
>> c0-62 np=4 serial
>> c0-63 np=4 serial
>> c0-64 np=4 serial
>> c0-65 np=4 serial
>> c0-66 np=4 serial
>> c0-67 np=4 serial
>> c0-68 np=4 serial
>> c0-69 np=2 serial
>> c0-70 np=2 serial
>>
>> and from pbsnodes:
>>
>> c0-70
>>     state = free
>>     np = 2
>>     properties = serial
>>     ntype = cluster
>>     status = opsys=linux,uname=Linux compute-0-70.local 2.6.9-42.0.2.ELsmp #1 SMP Wed Aug 23 13:38:27 BST 2006 x86_64,sessions=31141 31393 5457 15350 27161,nsessions=5,nusers=1,idletime=9506117,totmem=18521548kb,availmem=18280708kb,physmem=16425076kb,ncpus=2,loadave=0.00,netload=169380204972,state=free,jobs=? 0,rectime=1229443686
>>
>>
>> the script headers from the user are simply:
>>
>> #!/bin/csh -f
>> #PBS -N dock6VS
>> #PBS -q sroute
>> #PBS -l nodes=1:ppn=1
>> #PBS -j oe
>> #PBS -l cput=1:00:00
>> #PBS -V
>>
>>
>>
>>
>>> -Steve
>>>
>>> On Dec 16, 2008, at 10:43 AM, Michael Galloway wrote:
>>>
>>>> nope :-\
>>>>
>>>> -- michael
>>>>
>>>> On Tue, Dec 16, 2008 at 10:39:24AM -0500, Steve Young wrote:
>>>>> Does releasehold <jobid> make the held jobs start too?
>>>>>
>>>>> -Steve
>>>>>
>>>>> On Dec 16, 2008, at 10:36 AM, Michael Galloway wrote:
>>>>>
>>>>>> On Tue, Dec 16, 2008 at 10:09:29AM -0500, Michael Galloway wrote:
>>>>>>>
>>>>>>>
>>>>>>> the jobs in the serial queue are runnable if i just run them manually
>>>>>>> with qrun.
>>>>>>>
>>>>>>>
>>>>>>
>>>>>> indeed, if i qrun several of the jobs that are queued, and free up some
>>>>>> slots in the queue, my simple submission into the serial queue works as
>>>>>> normal.
>>>>>>
>>>>>> -- michael
>>>>>>
>>>>>
>>>>>
>>>
>>>
>
> _______________________________________________
> mauiusers mailing list
> mauiusers at supercluster.org
> http://www.supercluster.org/mailman/listinfo/mauiusers


