[Mauiusers] Re: maui not scheduling jobs in available resources

Arnau Bria arnaubria at pic.es
Wed Feb 18 09:54:00 MST 2009


Hi,

After some investigation I found some more clues:

Maui does not seem to know how to proceed when the jobs at the top of
the queue request some feature (ifae or slc4 in my case) and cannot
run, while many jobs further down the queue request other features and
have free slots available.
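
One way to check for this pattern is to pull the requested feature and the
scheduler's verdict out of each checkjob report. The following is a minimal
sketch, assuming checkjob output in the format quoted in the original post
further down; the function name summarize_checkjob is hypothetical and not
part of Maui.

```python
import re


def summarize_checkjob(text):
    """Extract job id, requested features and verdict from one checkjob report.

    Assumes the report layout quoted in this thread; other Maui versions
    may print these fields differently.
    """
    job = re.search(r"checking job (\S+)", text)
    feats = re.search(r"Features:\s*\[([^\]]*)\]", text)
    # checkjob wraps long lines, so collapse whitespace before matching.
    flat = " ".join(text.split())
    if "job can run in partition" in flat:
        runnable = True
    elif "job cannot run in partition" in flat:
        runnable = False
    else:
        runnable = None  # verdict line missing or in an unexpected format
    return {
        "job": job.group(1) if job else None,
        "features": feats.group(1) if feats else None,
        "runnable": runnable,
    }
```

Feeding it the report of every idle job listed by showq would show whether the
runnable jobs all sit behind blocked jobs that request a different feature.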

We worked around this by setting a maximum number of idle jobs per
class: 1 job at the moment. With this limit set, our farm is full;
without it, jobs that are unable to run block the entire farm.
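
A limit of this kind can be sketched in maui.cfg as below. It is an
illustration, not the exact configuration used on this farm: it assumes the
MAXIJOB throttling policy applied through CLASSCFG, and it borrows the class
names (glong, glong64) from the checkjob output in the original post.

```
# maui.cfg (sketch): allow at most one idle (eligible) job per class
CLASSCFG[glong]    MAXIJOB=1
CLASSCFG[glong64]  MAXIJOB=1
```

With a limit of 1, Maui considers only one idle job from each class at a
time, so a long run of blocked jobs from one class can no longer fill the top
of the idle queue.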

Has anyone noticed this behaviour before?

Cheers,
Arnau



I'm pasting the original post here because it is quite old:

On Wed, 11 Feb 2009 12:50:51 +0100
Arnau Bria wrote:

> Hi all,
> 
> I have several jobs in Idle state that could start running, as they
> have available resources:
> # checkjob 2037088
> 
> 
> checking job 2037088
> 
> State: Idle
> Creds:  user:atprd035  group:atprd  class:glong64  qos:lhcatlas
> WallTime: 00:00:00 of 3:00:00:00
> SubmitTime: Wed Feb 11 10:51:53
>   (Time Queued  Total: 1:52:30  Eligible: 1:52:30)
> 
> StartDate: -1:51:05  Wed Feb 11 10:53:18
> Total Tasks: 1
> 
> Req[0]  TaskCount: 1  Partition: ALL
> Network: [NONE]  Memory >= 0  Disk >= 0  Swap >= 0
> Opsys: [NONE]  Arch: [NONE]  Features: [slc4_x64]
> 
> 
> IWD: [NONE]  Executable:  [NONE]
> Bypass: 0  StartCount: 0
> PartitionMask: [ALL]
> PE:  1.00  StartPriority:  55
> job can run in partition DEFAULT (276 procs available.  1 procs
> required)
> 
> 
> They're not at the beginning of the Idle queue, so it seems that Maui
> does not consider them:
> 
> # showq|grep atprd035|grep Idle|awk {'print $1'}
> 2036560
> 2036701
> 2036702
> 2036703
> 2036704
> 2036705
> 2036707
> 2036708
> 2036709
> 2036713
> 2036714
> 2037088
> ^^^^^^^
> 2037089
> 2037090
> 2037099
> 2037100
> 2037101
> 2037102
> 2037103
> [...]
> 
> 
> Jobs at the top of the Idle queue cannot run because no nodes are
> available for them:
> 
> # checkjob 2036560
> 
> 
> checking job 2036560
> 
> State: Idle
> Creds:  user:atprd035  group:atprd  class:glong  qos:lhcatlas
> WallTime: 00:00:00 of 3:00:00:00
> SubmitTime: Wed Feb 11 08:48:00
>   (Time Queued  Total: 4:00:25  Eligible: 4:00:25)
> 
> StartDate: -2:33:31  Wed Feb 11 10:14:54
> Total Tasks: 1
> 
> Req[0]  TaskCount: 1  Partition: ALL
> Network: [NONE]  Memory >= 0  Disk >= 0  Swap >= 0
> Opsys: [NONE]  Arch: [NONE]  Features: [slc4]
> 
> 
> IWD: [NONE]  Executable:  [NONE]
> Bypass: 104  StartCount: 0
> PartitionMask: [ALL]
> PE:  1.00  StartPriority:  55
> job cannot run in partition DEFAULT (idle procs do not meet
> requirements : 0 of 1 procs found)
> 
> So, how can I tell Maui to schedule and dispatch jobs from the whole queue?
> 
> TIA,
> Arnau
> 
> 
> 


