[torqueusers] forcing maui to honour size=free:total filesystem definition and -l file= qsub parameter

David Jackson jacksond at clusterresources.com
Sat Mar 5 14:57:03 MST 2005


Marcin,

  Sorry for the delayed response.  Your changes are now in the latest
Maui 3.2.6p12 snapshot.  Your question about 'renewable' resources may
be pointing at an issue.  If TORQUE reports <AVAIL>:<CONFIG> resources,
then, at your site, <CONFIG> should always be 80 GB on each compute
node.  Is this what you are seeing?  Maui should only place a batch hold
on a job if the configured resources can never satisfy the request.
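
For reference, the node status attribute I would expect to see at your
site looks something like the following (a sketch; the free value is only
an example and will vary):

  size=73400320kb:83886080kb     (<AVAIL>:<CONFIG>, i.e. ~70 GB free of 80 GB)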

  It may be that Maui is getting confused.  If resources are in use but
are not dedicated to any job, Maui may determine that the job cannot
start right away due to resource availability and will attempt to create
a reservation.  However, because no job has these used resources
dedicated, Maui will believe that the resources will free up right away
and attempt to schedule the job in the near future.  When it determines
that this is not possible, it believes that there is something wrong with
the job and will attempt to defer it in the hope that the problem will
clear up.

  The 'defer' behavior will actually perform the needed function, but in
your case you need to give it more time.  Set DEFERCOUNT to a larger
number, or turn deferral off completely, and your jobs should run again.
Maui does need a bit more intelligence here: if resources are unavailable
only because of utilization, and not because they are dedicated to other
jobs, it should hold off on reserving resources for some configurable
period of time, say 10 minutes, to allow the resource consumption to
change.  We will integrate such a change into Maui.
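
In the meantime, something along these lines in maui.cfg should do (a
sketch; pick values that suit your site):

  DEFERCOUNT   64          # allow many defer cycles before a batch hold
  DEFERTIME    00:10:00    # re-evaluate deferred jobs after 10 minutes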

Dave 



On Fri, 2005-02-04 at 15:04 +0100, Marcin Mogielnicki wrote:
> I think Maui ignores TORQUE's filesystem size and the file parameter of
> jobs. First of all, TORQUE's mom configured with the size[fs=/tmpx]
> parameter exports it properly, but Maui doesn't parse it, even though it
> parses many other variables exported in the same way. Weird, but not too
> problematic:
> 
> --- maui-3.2.6p11/src/moab/MPBSI.c      2004-12-03 21:41:24.000000000 +0100
> +++ maui-3.2.6p11.modified/src/moab/MPBSI.c     2005-02-04 
> 14:37:34.421202143 +0100
> @@ -5974,6 +5975,16 @@ int __MPBSIGetSSSStatus(
>         {
>         N->CRes.Mem = (MPBSGetResKVal(Value) >> 10);
>         }
> +    else if (!strcmp(Name,"size"))
> +      {
> +      char *ptr;
> +      char *tok;
> +      ptr = MUStrTok(Value,":",&tok);
> +      if ((tok != NULL) && (ptr != NULL)) {
> +       N->ARes.Disk = (MPBSGetResKVal(Value) >> 10);
> +       N->CRes.Disk = (MPBSGetResKVal(tok) >> 10);
> +       }
> +      }
>       else if (!strcmp(Name,"ncpus"))
>         {
>         N->CRes.Procs = (int)strtol(Value,NULL,10);
> 
> As you can see, it simply parses the size=x:y variable and sets
> N(ode)->A(vailable)Res.Disk and C(onfigured)Res.Disk to the free and
> total filesystem size. If parsing goes wrong, the value is silently
> ignored and not set. After this patch, checknode shows the proper
> filesystem values.
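> 
> For reference, my mom config line and an example of what it reports (a
> sketch; the numbers are only an illustration):
> 
>   $PBS_HOME/mom_priv/config:   size[fs=/tmpx]
>   reported status attribute:   size=73400320kb:83886080kb
> 
> Assuming MPBSGetResKVal returns the value in kilobytes, as the physmem
> handling above suggests, the shift by 10 converts kilobytes to
> megabytes, so here ARes.Disk becomes 71680 (about 70 GB free) and
> CRes.Disk becomes 81920 (the 80 GB total).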
> 
> Well, and now the problems begin. Theoretically, parsing the file
> parameter is very easy:
> 
> @@ -3755,10 +3755,11 @@ int MPBSJobUpdate(
>         else if (!strcmp(AP->resource,"file"))
>           {
>           tmpL = (MPBSGetResKVal(AP->value) >> 10);
> +       RQ->DRes.Disk = tmpL;
> 
>           if (tmpL != RQ->RequiredDisk)
>             {
> -          RQ->RequiredDisk = (MPBSGetResKVal(AP->value) >> 10);
> +          RQ->RequiredDisk = tmpL;
>             RQ->DiskCmp      = mcmpGE;
>             }
> 
> However, I'm not sure whether RQ->D(edicated)Res is a task-specific or a
> job-specific parameter, but I checked experimentally that it works such
> that every node assigned to the task must have at least the requested
> amount of free space, and that is quite enough for me.
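> 
> For completeness, this is how I request the disk space at submission
> time (a usage sketch; the script name is just an example):
> 
>   qsub -l nodes=2:ppn=2 -l file=10gb myjob.sh
> 
> With the patches applied, every node assigned to the job then needs at
> least 10 GB free in the monitored filesystem, as described above.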
> 
> And now my standard question: do these changes break anything else? I'd
> also like to ask any magician to tell me how to define disk space as a
> renewable resource. With the given patches applied, when a job could
> start because there are enough free tasks, but there is not enough free
> disk space on the chosen nodes, the job gets a batch hold immediately. I
> have 100 dual-processor nodes in my cluster, each with a local 80 GB
> disk, and the amount of free space is constantly changing, but Maui
> assumes that utilized space will never be freed.
> 
> 	Marcin Mogielnicki
> _______________________________________________
> torqueusers mailing list
> torqueusers at supercluster.org
> http://supercluster.org/mailman/listinfo/torqueusers
