[Mauiusers] Node Sets and vanishing Features
Garrick Staples
garrick at usc.edu
Tue May 13 14:23:50 MDT 2008
On Fri, May 09, 2008 at 04:30:43PM +0200, Ansgar Esztermann alleged:
> Hello everyone,
>
>
> I have a problem here with node sets; if anyone could advise me as to
> how to investigate further, I'd be grateful.
>
> The cluster in question consists of nodes with pair-wise Infiniband
> connections. Thus, node001 is connected to node002, but jobs spanning
> two nodes should not be started on, say, node001 and node005.
> In order to accomplish this, I've used Node Sets as suggested in the
> Maui Administrator's Guide, Chapter 8.3:
> >NODESETPOLICY ONEOF
> >NODESETATTRIBUTE FEATURE
> >NODESETDELAY 7:00:00:00
> >NODESETLIST twin01 twin02 twin03 twin04 twin05 twin06 twin07 twin08
> >twin09
>
> I've then grouped the nodes into pairs ("twins") like this:
> >NODECFG[node001] FEATURES=twin01
> >NODECFG[node002] FEATURES=twin01
> >NODECFG[node003] FEATURES=twin02
> and so on.
>
> Now, when I restart maui, all nodes have the features [all] and
> [twinXX], as shown by diagnose -n and checknode. As many queued jobs
> as possible are scheduled and started, just as I would expect.
> Then, something curious happens: the [twinXX] feature simply vanishes,
> and no further jobs are started even when the current batch completes
> execution.
>
> Any idea on where I should look to solve this problem?
I use node sets extensively on my cluster and they work very reliably, but I've
never set the node features in maui.cfg. I set them in TORQUE's config instead.
You also get the benefit of TORQUE being able to use the same features
("properties" in TORQUE). Simply add them in qmgr:
s n node001 properties = twin01
s n node002 properties = twin01
...
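
For a larger cluster, the same assignments could be generated rather than typed
by hand. The following is only a sketch: the helper name gen_twin_cmds and the
node count of 18 (nine twins, two nodes each) are assumptions for illustration,
and it uses the long form "set node" of the "s n" abbreviation above. It only
prints the commands and does not touch the server.

```shell
#!/bin/sh
# Hypothetical helper: print one qmgr directive per node, pairing
# node001/node002 -> twin01, node003/node004 -> twin02, and so on.
gen_twin_cmds() {
    total=$1    # number of nodes (assumed to be numbered from 001)
    i=1
    while [ "$i" -le "$total" ]; do
        twin=$(( (i + 1) / 2 ))    # nodes 1,2 -> 1; nodes 3,4 -> 2; ...
        printf 'set node node%03d properties = twin%02d\n' "$i" "$twin"
        i=$((i + 1))
    done
}

# Assumed cluster size: 18 nodes (twin01 .. twin09).
gen_twin_cmds 18
```

If the printed lines look right, they could be fed to qmgr on the TORQUE server,
for example with "gen_twin_cmds 18 | qmgr", since qmgr reads directives from
standard input.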
--
Garrick Staples, GNU/Linux HPCC SysAdmin
University of Southern California
Please avoid sending me Word or PowerPoint attachments.
See http://www.gnu.org/philosophy/no-word-attachments.html