[torqueusers] forcing maui to honour size=free:total filesystem definition and -l file= qsub parameter
Marcin Mogielnicki
mar_mog at o2.pl
Fri Feb 4 07:04:50 MST 2005
I think maui ignores both torque's filesystem size and the file parameter
of jobs. First of all, torque's mom configured with the size[fs=/tmpx]
parameter exports the value properly, but maui doesn't parse it, although
it parses many other variables exported in the same way. Weird, but not
too problematic:
--- maui-3.2.6p11/src/moab/MPBSI.c 2004-12-03 21:41:24.000000000 +0100
+++ maui-3.2.6p11.modified/src/moab/MPBSI.c 2005-02-04 14:37:34.421202143 +0100
@@ -5974,6 +5975,16 @@ int __MPBSIGetSSSStatus(
{
N->CRes.Mem = (MPBSGetResKVal(Value) >> 10);
}
+ else if (!strcmp(Name,"size"))
+ {
+ char *ptr;
+ char *tok;
+ ptr = MUStrTok(Value,":",&tok);
+ if ((tok != NULL) && (ptr != NULL)) {
+ N->ARes.Disk = (MPBSGetResKVal(Value) >> 10);
+ N->CRes.Disk = (MPBSGetResKVal(tok) >> 10);
+ }
+ }
else if (!strcmp(Name,"ncpus"))
{
N->CRes.Procs = (int)strtol(Value,NULL,10);
As you can see, it simply parses the size=x:y variable and sets
N(ode)->A(vailable)Res.Disk to the free space and
N(ode)->C(onfigured)Res.Disk to the total filesystem size. If parsing
goes wrong, the value is silently ignored and nothing is set. After
this patch checknode shows proper filesystem values.
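For anyone who wants to try the parsing logic outside maui, here is a minimal standalone sketch of the same idea. It is not maui code: size_to_kb() is a hypothetical stand-in for MPBSGetResKVal(), parse_fs_size() is a made-up name, and the handling of suffixes (b/kb/mb/gb/tb, bare number taken as bytes) is my assumption rather than something checked against the maui source.

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for maui's MPBSGetResKVal(): convert a size string
 * such as "1032088kb" or "80gb" to kilobytes. Returns -1 on malformed input.
 * Assumption: a bare number or a "b" suffix means bytes. */
long long size_to_kb(const char *s)
{
    char *end;
    long long v;

    if (s == NULL || !isdigit((unsigned char)*s))
        return -1;
    v = strtoll(s, &end, 10);
    if (*end == '\0' || !strcmp(end, "b"))
        return v >> 10;
    if (!strcmp(end, "kb"))
        return v;
    if (!strcmp(end, "mb"))
        return v << 10;
    if (!strcmp(end, "gb"))
        return v << 20;
    if (!strcmp(end, "tb"))
        return v << 30;
    return -1;
}

/* Hypothetical helper: split "free:total" and store both values in megabytes.
 * Returns 0 on success. On any parse error it returns -1 and leaves the
 * outputs untouched, like the patch, which silently skips bad values. */
int parse_fs_size(const char *value, long long *free_mb, long long *total_mb)
{
    char buf[128];
    char *colon;
    long long free_kb, total_kb;

    if (value == NULL || strlen(value) >= sizeof(buf))
        return -1;
    strcpy(buf, value);            /* work on a copy, the input stays intact */
    colon = strchr(buf, ':');
    if (colon == NULL)
        return -1;
    *colon = '\0';                 /* buf now holds "free", colon+1 "total" */
    free_kb = size_to_kb(buf);
    total_kb = size_to_kb(colon + 1);
    if (free_kb < 0 || total_kb < 0)
        return -1;
    *free_mb = free_kb >> 10;      /* KB -> MB, same shift as in the patch */
    *total_mb = total_kb >> 10;
    return 0;
}
```

Called with a string like "539783kb:1032088kb", parse_fs_size() should fill in the free and total values in MB; a string without a colon is rejected and the outputs keep whatever they held before.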
Well, and now the problems begin. In theory, parsing the file parameter
is very easy:
@@ -3755,10 +3755,11 @@ int MPBSJobUpdate(
else if (!strcmp(AP->resource,"file"))
{
tmpL = (MPBSGetResKVal(AP->value) >> 10);
+ RQ->DRes.Disk = tmpL;
if (tmpL != RQ->RequiredDisk)
{
- RQ->RequiredDisk = (MPBSGetResKVal(AP->value) >> 10);
+ RQ->RequiredDisk = tmpL;
RQ->DiskCmp = mcmpGE;
}
However, I'm not sure whether RQ->D(edicated)Res is a task-specific or a
job-specific parameter. I checked experimentally that it works this way:
every node assigned to the task must have at least the requested amount
of free space, and that is quite enough for me.
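To spell out the behaviour I observed, here is a small illustrative sketch of the check as I understand it. It is not taken from maui: struct node and nodes_satisfy_disk() are hypothetical names, and the real scheduler structures are far larger.

```c
#include <stddef.h>

/* Hypothetical, heavily simplified node record. avail_disk_mb plays the
 * role of N->ARes.Disk from the first patch. */
struct node {
    const char *name;
    long avail_disk_mb;
};

/* Return 1 if every node in the set has at least req_disk_mb of free disk
 * (the ">=" comparison that mcmpGE stands for), 0 otherwise.
 * An empty node set trivially satisfies the requirement. */
int nodes_satisfy_disk(const struct node *nodes, size_t count, long req_disk_mb)
{
    size_t i;

    for (i = 0; i < count; i++) {
        if (nodes[i].avail_disk_mb < req_disk_mb)
            return 0;   /* one node short on space rejects the whole set */
    }
    return 1;
}
```

So a job asking for 1024 MB would be rejected on a node set where even one node has only 512 MB free, which matches what I saw in my tests.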
And now my standard question: do these changes break anything else? I'd
also like to ask any magician to tell me how to define disk space as a
renewable resource. With these patches applied, when a job could start
because enough tasks are free, but the chosen nodes don't have enough
free disk space, the job gets a batchhold immediately. I have 100
dual-processor nodes in my cluster with a local 80 GB disk in each of
them, and the amount of free space is constantly changing, but maui
assumes that utilized space will never be freed.
Marcin Mogielnicki