[Mauiusers] insufficient idle procs available ?

Jan Ploski Jan.Ploski at offis.de
Tue Jan 29 11:13:38 MST 2008

Itay M wrote:
> We've tried the new configuration (unset resources_default.ncpus and 
> unset resources_max.ncpus, at both the queue and server levels) 
> in the last few days and here are the results:

I suppose you did check with qstat -f that 'ncpus' no longer appears 
anywhere?
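
One way to run that check, as a minimal sketch: the helper name 
ncpus_mentions is made up for illustration, and it only filters whatever 
qstat output is piped into it. The example invocations in the comments 
assume the usual PBS/TORQUE qstat flags (-Q for queues, -B for the 
server, -f for full output).

```shell
# Hypothetical helper: print every line of the piped-in text that
# mentions 'ncpus' (case-insensitive). Prints nothing if there is none.
ncpus_mentions() {
    # '|| true' keeps the exit status at 0 when grep finds no match.
    grep -i 'ncpus' || true
}

# Example invocations (assumed flags, adjust to your installation):
#   qstat -Q -f | ncpus_mentions    # queue attributes
#   qstat -B -f | ncpus_mentions    # server attributes
#   qstat -f    | ncpus_mentions    # jobs, e.g. Resource_List.ncpus
```

Empty output from all three would suggest that neither the queues, the 
server, nor any already-submitted job still carries an ncpus setting. 
Jobs queued before the change may keep the old value.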

> * For the first time we were able to see that jobs are backfilled! It 
> never happened before, and this is a major improvement. Though we saw it 
> only in one of our queues (named 'b_que'), it might have happened in other 
> queues as well (we couldn't verify it yet).
> * But the 'insufficient idle procs available' problem is still there. 
> For example, at the moment showq shows that there are plenty of non-busy 
> processors ('65 of   84 Processors Active'), but checkjob says for 
> queued jobs that:
> checking job 228665
> State: Idle
> Creds:  user:b group:b   class:b_que  qos:hi
> WallTime: 00:00:00 of 00:05:00
> SubmitTime: Tue Jan 29 19:47:04
>   (Time Queued  Total: 00:07:49  Eligible: 00:07:16)
> Total Tasks: 1
> Req[0]  TaskCount: 1  Partition: ALL
> Network: [NONE]  Memory >= 0  Disk >= 0  Swap >= 0
> Opsys: [NONE]  Arch: [NONE]  Features: [NONE]
> Dedicated Resources Per Task: PROCS: 1  MEM: 512M
> IWD: [NONE]  Executable:  [NONE]
> Bypass: 0  StartCount: 0
> PartitionMask: [ALL]
> Flags:       RESTARTABLE
> PE:  1.00  StartPriority:  1007
> job cannot run in partition DEFAULT (idle procs do not meet requirements 
> : 0 of 1 procs found)
> idle procs:  12  feasible procs:   0
> :(
> What should I check next?

Maybe it has something to do with the MEM requirement (just a wild 
guess... but try removing it). What does diagnose -n say for a node 
which is incorrectly rejecting the job? Does it have enough free 
"tokens" (not sure if this is what they are called officially) to run 
the job in this b_que class?

Jan Ploski
