[Mauiusers] insufficient idle procs available ?

Itay M itaym.tau at gmail.com
Sun Feb 24 11:15:33 MST 2008


I'm happy to say that the problem is solved!
Indeed, the core of the problem was that certain jobs, submitted by
one of our research groups, created a higher load average than they should
have. For example, on a node with 4 CPUs, two running jobs pushed
its load average to 3.8 or even higher. As a result, PBS reported
that this particular node was already busy and could not accept more
jobs.
Therefore, we could not use many free CPUs: some nodes were reported
as 'busy' while they still had idle CPUs.
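
To make the mechanism concrete, here is a rough sketch of how the two load
thresholds interact. It is a simplified model, not the actual pbs_mom code:
the function name update_busy and the threshold values are made up for
illustration, and the exact comparisons PBS uses may differ.

```python
def update_busy(load_avg, busy, ideal_load, max_load):
    """Simplified model of a node's 'busy' flag with two load thresholds.

    The node turns busy once the load average rises above max_load and
    stays busy until the load average falls below ideal_load.
    """
    if load_avg > max_load:
        return True
    if load_avg < ideal_load:
        return False
    return busy  # between the two thresholds: keep the previous state


# Hypothetical 4-CPU node running two jobs with a load average of 3.8.
# With low thresholds the node is reported busy despite two idle CPUs:
print(update_busy(3.8, False, ideal_load=3.0, max_load=3.5))  # True
# With raised thresholds the same load no longer marks it busy:
print(update_busy(3.8, False, ideal_load=4.0, max_load=6.0))  # False
```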

The problem was solved by increasing the ideal_load and max_load values on
each node. The '...has more processors utilized than dedicated' error that
appeared in the output of the diagnose -n command has disappeared, so the
problem is resolved.
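
For anyone hitting the same issue: assuming a TORQUE-style pbs_mom, these
two thresholds normally live in each node's mom_priv/config file. The
fragment below is only an illustration. The numbers are placeholders for a
4-CPU node, not the values used on our cluster, and pbs_mom has to re-read
its configuration (for example after a restart) before they take effect.

```
# mom_priv/config (illustrative values, chosen as an assumption)
$ideal_load 4.0
$max_load   6.0
```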

Special thanks to Jan Ploski for his tremendous help and his involvement in
analyzing this problem. Thanks, Jan!

Itay M.

On Sun, Feb 24, 2008 at 5:57 AM, Chris Samuel <csamuel at vpac.org> wrote:

>
> ----- "Itay M" <itaym.tau at gmail.com> wrote:
>
> > I'm not sure if they are multithreaded (needs further checking with
> > the developers) - but you're right. The load should be no more than 2
> > for 2 jobs, but in fact it's >2. The jobs are C++, compiled with the g++
> > compiler. Maybe a compilation switch will help with reducing the load
> > average to 1 per job?
>
> There are other ways to get a high load average on
> a node, for instance I/O intensive jobs that cause
> other processes to block in the D state waiting for
> the disks will do it.
>
> cheers,
> Chris
> --
> Christopher Samuel - (03) 9925 4751 - Systems Manager
>  The Victorian Partnership for Advanced Computing
>  P.O. Box 201, Carlton South, VIC 3053, Australia
> VPAC is a not-for-profit Registered Research Agency
>