[Mauiusers] Node Sets and vanishing Features
eirik.thorsnes at bccs.uib.no
Tue May 13 11:51:35 MDT 2008
Ansgar Esztermann wrote:
> Hello everyone,
> I have a problem here with node sets; if anyone could advise me as to
> how to investigate further, I'd be grateful.
> The cluster in question consists of nodes with pair-wise Infiniband
> connections. Thus, node001 is connected to node002, but jobs spanning
> two nodes should not be started on, say, node001 and node005.
> In order to accomplish this, I've used Node Sets as suggested in the
> Maui Administrator's Guide, Chapter 8.3:
>> NODESETPOLICY ONEOF
>> NODESETATTRIBUTE FEATURE
>> NODESETDELAY 7:00:00:00
>> NODESETLIST twin01 twin02 twin03 twin04 twin05 twin06 twin07 twin08
> I've then grouped the nodes into pairs ("twins") like this:
>> NODECFG[node001] FEATURES=twin01
>> NODECFG[node002] FEATURES=twin01
>> NODECFG[node003] FEATURES=twin02
> and so on.
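As a side note, the pairing above is regular enough to generate rather than type by hand. A minimal sketch, assuming 16 nodes named node001 to node016 (inferred from the eight twinXX entries in NODESETLIST, not stated in the message) and that consecutive nodes form a pair; the function names are hypothetical:

```python
def twin_nodecfg_lines(num_nodes, node_prefix="node", feature_prefix="twin"):
    """Return NODECFG lines giving each consecutive pair of nodes a shared feature."""
    lines = []
    for n in range(1, num_nodes + 1):
        pair = (n + 1) // 2  # nodes 1,2 -> pair 1; nodes 3,4 -> pair 2; ...
        lines.append(
            f"NODECFG[{node_prefix}{n:03d}] FEATURES={feature_prefix}{pair:02d}"
        )
    return lines


def nodesetlist_line(num_nodes, feature_prefix="twin"):
    """Return the NODESETLIST line naming one feature per node pair."""
    pairs = (num_nodes + 1) // 2
    names = " ".join(f"{feature_prefix}{p:02d}" for p in range(1, pairs + 1))
    return f"NODESETLIST {names}"


if __name__ == "__main__":
    # 16 nodes is an assumption; adjust to the real cluster size.
    print(nodesetlist_line(16))
    print("\n".join(twin_nodecfg_lines(16)))
```

Generating the lines keeps the feature names in NODESETLIST and NODECFG consistent, which rules out a simple typo as the cause of a missing feature.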
> Now, when I restart maui, all nodes have the features [all] and
> [twinXX], as shown by diagnose -n and checknode. As many queued jobs as
> possible are scheduled and started, just as I would expect.
> Then, something curious happens: the [twinXX] feature simply vanishes,
> and no further jobs are started even when the current batch completes.
> Any idea on where I should look to solve this problem?
The vanishing features may be related to a similar limitation in Moab. Have a
look in the include directory of the Maui source for the setting of the
maximum number of classes. As I understand it, Maui/Moab uses classes for
purposes other than the intuitive "CLASSCFG", and when you run out it
silently drops some of the config. Moab has a --with-max-classes config
option that changes this limit.
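To make the search through the headers less tedious, something like the following could list candidate compile-time limits. It is only a sketch: I do not remember the exact macro names, so the assumption here is that the limits are plain numeric #define lines whose name contains "MAX", and the "include" path is a placeholder for wherever the Maui source is unpacked.

```python
import re
import sys
from pathlib import Path

# Assumption: limits look like "#define SOMETHING_MAX_SOMETHING 16".
DEFINE_RE = re.compile(r"^\s*#\s*define\s+(\w*MAX\w*)\s+(\d+)\b")


def find_max_defines(include_dir):
    """Return (file name, line number, macro, value) for each numeric MAX define."""
    results = []
    for header in sorted(Path(include_dir).rglob("*.h")):
        with open(header, errors="replace") as fh:
            for lineno, line in enumerate(fh, start=1):
                match = DEFINE_RE.match(line)
                if match:
                    results.append(
                        (header.name, lineno, match.group(1), int(match.group(2)))
                    )
    return results


if __name__ == "__main__":
    # "include" is a placeholder path; pass the real one as the first argument.
    include_dir = sys.argv[1] if len(sys.argv) > 1 else "include"
    for name, lineno, macro, value in find_max_defines(include_dir):
        print(f"{name}:{lineno}: {macro} = {value}")
```

Any limit that is close to the number of features or classes in use would be the first candidate to raise before rebuilding.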
Eirik Thorsnes - System Engineer http://www.bccs.uib.no
Parallab, Bergen Center for Computational Science, Unifob
Høyteknologisenteret, Thormøhlensgate 55, N-5008 Bergen, Norway
tel: (+47) 555 84153 fax: (+47) 555 84295