[Mauiusers] cleaning up "ncpus" mess

Alessandro Federico alessandro.federico at caspur.it
Mon Apr 3 10:38:25 MDT 2006


Hi Garrick,

I'm trying your patch on my dual-Opteron cluster and
it seems to work. I'm very happy, because I plan to use
TORQUE/Maui on our IBM SMP cluster (8 CPUs per node),
where I would like to request CPUs (not nodes=X:ppn=Y)
and let the scheduler choose the nodes.
On the Opteron cluster I discovered that, if I submit
a job with -l ncpus=4, Maui creates only one task
(with PROCS=4), and the job never runs because Maui
cannot find a node with 4 CPUs to satisfy that task.
That seems like strange behaviour!
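
To make the problem concrete, here is a minimal sketch in C. It is
not Maui code: request_t, placeable_tasks() and can_run() are
hypothetical names used only for illustration, and the sketch assumes
that a single task can never span more than one node.

```c
/* Hypothetical model of a scheduler request (NOT a Maui structure):
 * task_count tasks, each needing procs_per_task CPUs on one node. */
typedef struct {
    int task_count;
    int procs_per_task;
} request_t;

/* Number of tasks that fit on node_count identical nodes,
 * assuming a task must fit entirely inside a single node. */
int placeable_tasks(request_t rq, int node_count, int cpus_per_node)
{
    if (rq.procs_per_task <= 0 || node_count <= 0 || cpus_per_node <= 0)
        return 0;
    /* Integer division: a 2-CPU node holds zero 4-CPU tasks. */
    return node_count * (cpus_per_node / rq.procs_per_task);
}

/* Returns 1 if every requested task can be placed, 0 otherwise. */
int can_run(request_t rq, int node_count, int cpus_per_node)
{
    return placeable_tasks(rq, node_count, cpus_per_node) >= rq.task_count;
}
```

With dual-CPU nodes, a request of {1 task, 4 procs} can never run
however many nodes there are, while {4 tasks, 1 proc each} fits as
soon as two nodes are free. As far as I can read the patch, the
second layout is what it produces, since it sets RQ->DRes.Procs = 1
and RQ->TaskCount = TA->NCPUs whenever ncpus > 1.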

Anyway, thank you very much. I will give you feedback
as soon as I have tried your patch on the IBM cluster.

bye
ale

PS: Why did nobody reply to you?

Garrick Staples wrote:
> I need some people to test this patch.  It attempts to make "nodes means
> nodes" and "ncpus means number of cpus", at least with EXACTNODE.
> 
> 
> 
> ------------------------------------------------------------------------
> 
> --- maui-3.2.6p14_orig/src/moab/MPBSI.c	2006-01-09 15:24:08.000000000 -0800
> +++ maui-3.2.6p14/src/moab/MPBSI.c	2006-01-09 17:25:10.000000000 -0800
> @@ -5977,14 +5977,14 @@ int MPBSJobAdjustResources(
>          }
>        }    /* END if (R->U.PBS.PBS5IsEnabled == TRUE) */
>  
> -    if ((TA->NCPUs > 1) &&
> -       ((TA->NodesRequested > 1) || (RQ->TasksPerNode > 1)))
> +    if (TA->NCPUs > 1)
>        {
>        /* multi-node 'ncpu' specification detected */
>  
>        RQ->DRes.Procs    = 1;
>  
> -      RQ->TasksPerNode  = TA->NCPUs;
> +      if ((TA->NodesRequested > 1) || (RQ->TasksPerNode > 1))
> +        RQ->TasksPerNode  = TA->NCPUs;
>  
>        RQ->TaskCount     = TA->NCPUs;
>        J->TasksRequested = TA->NCPUs;
> @@ -6071,7 +6071,7 @@ int MPBSJobAdjustResources(
>  
>      RQ->TasksPerNode = MAX(0,RQ->TasksPerNode);
>   
> -    if ((J->ReqHList == NULL) && (MPar[0].JobNodeMatch & (1 << nmExactNodeMatch)))
> +    if ((J->ReqHList == NULL) && (MPar[0].JobNodeMatch & (1 << nmExactNodeMatch)) && (TA->NCPUs <= 1))
>        {
>        RQ->NodeCount = RQ->TaskCount / MAX(1,RQ->TasksPerNode);
>        }
> 
> 
> ------------------------------------------------------------------------
> 
> _______________________________________________
> mauiusers mailing list
> mauiusers at supercluster.org
> http://www.supercluster.org/mailman/listinfo/mauiusers

-- 
***************************************************
     Alessandro Federico
     CASPUR  -  http://www.caspur.it/

     e-mail:    alessandro.federico at caspur.it
     phone:     +39 06 44486708
     fax:       +39 06 4957083

---------------------------------------------------
 Military intelligence is a contradiction in terms.
                                    (Groucho Marx)
---------------------------------------------------

