[Mauiusers] cluster underused - single cpu jobs hold up parallel (fwd)

Lydia Heck lydia.heck at durham.ac.uk
Thu Mar 31 15:48:20 MDT 2011


I found the problem. For some reason, two of the nodes refused to accept
jobs. Instead of the scheduler choosing different nodes, the jobs were
deferred. Once I had disabled those two nodes, the jobs started.
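
For illustration only: the message does not say how the two nodes were
disabled. One common way on a TORQUE system is to mark them offline with
`pbsnodes -o`. The sketch below assumes that `pbsnodes -l` prints one node per
line as a name followed by its state, and it uses made-up node names. It only
builds the command and does not run it. Nodes that report as free but still
refuse jobs would not appear in that listing and would have to be named by hand.

```python
import shlex


def problem_nodes(pbsnodes_l_output):
    """Return node names from `pbsnodes -l`-style text (assumed: name, then state)."""
    nodes = []
    for line in pbsnodes_l_output.splitlines():
        fields = line.split()
        if len(fields) >= 2:  # skip blank or malformed lines
            nodes.append(fields[0])
    return nodes


def offline_command(nodes):
    """Build (but do not run) the command that marks the given nodes offline."""
    return ["pbsnodes", "-o"] + list(nodes)


# Hypothetical listing; the node names and states are invented for this example.
sample = """\
node017    down
node042    down,offline
"""

cmd = offline_command(problem_nodes(sample))
print(shlex.join(cmd))
```

Printing the command instead of running it leaves room to check the node list
before any node is taken out of service.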

I still do not know why the scheduler kept choosing those nodes,
or why the nodes failed to communicate properly with the pbs_server, but at
least the cluster is functioning all right now.

Lydia




  On Thu, 31 Mar 2011, Lydia Heck wrote:

>
>
> Our cluster has ~2,600 cores. Parallel jobs are running that fill ~1,700 of
> them, and many sequential jobs are queued "in front" of other
> parallel jobs. But only one or two of the sequential jobs are running, the
> others being held by the user.
>
> The parallel jobs are not being scheduled. The scheduler is Maui. Any idea
> what I am missing here?
>
> Lydia
>
> _______________________________________________
> mauiusers mailing list
> mauiusers at supercluster.org
> http://www.supercluster.org/mailman/listinfo/mauiusers
>

