[Mauiusers] Node Sets and vanishing Features

Garrick Staples garrick at usc.edu
Tue May 13 14:23:50 MDT 2008


On Fri, May 09, 2008 at 04:30:43PM +0200, Ansgar Esztermann alleged:
> Hello everyone,
> 
> 
> I have a problem here with node sets; if anyone could advise me as to  
> how to investigate further, I'd be grateful.
> 
> The cluster in question consists of nodes with pair-wise Infiniband  
> connections. Thus, node001 is connected to node002, but jobs spanning  
> two nodes should not be started on, say, node001 and node005.
> In order to accomplish this, I've used Node Sets as suggested in the  
> Maui Administrator's Guide, Chapter 8.3:
> >NODESETPOLICY ONEOF
> >NODESETATTRIBUTE FEATURE
> >NODESETDELAY 7:00:00:00
> >NODESETLIST twin01 twin02 twin03 twin04 twin05 twin06 twin07 twin08 twin09
> 
> I've then grouped the nodes into pairs ("twins") like this:
> >NODECFG[node001] FEATURES=twin01
> >NODECFG[node002] FEATURES=twin01
> >NODECFG[node003] FEATURES=twin02
> and so on.
> 
> Now, when I restart maui, all nodes have the features [all] and  
> [twinXX], as shown by diagnose -n and checknode. As many queued jobs  
> as possible are scheduled and started, just as I would expect.
> Then, something curious happens: the [twinXX] feature simply vanishes,  
> and no further jobs are started even when the current batch completes  
> execution.
> 
> Any idea on where I should look to solve this problem?

I use nodesets extensively on my cluster and they work very reliably, but I've
never defined the node features in maui.cfg.  I set them in torque's config
instead, which has the added benefit that torque can use the same features
(called "properties" in torque).

Simply set them in qmgr:
  set node node001 properties = twin01
  set node node002 properties = twin01
  ...
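
Typing one qmgr line per node gets tedious, so here is a rough sketch of how
the pairing could be scripted.  It only prints the qmgr directives; it does not
run qmgr itself.  The function name gen_twin_cmds is made up for illustration,
and the count of 18 nodes (node001 to node018) is an assumption inferred from
the nine twinNN sets in the original post; adjust both to your site.

```shell
#!/bin/sh
# Print one qmgr "set node" directive per node, pairing consecutive nodes:
# node001 + node002 -> twin01, node003 + node004 -> twin02, and so on.
# gen_twin_cmds is a hypothetical helper, not part of torque or maui.
gen_twin_cmds() {
    total=$1          # number of nodes (assumed to be named nodeNNN)
    i=1
    while [ "$i" -le "$total" ]; do
        pair=$(( (i + 1) / 2 ))   # 1,2 -> 1; 3,4 -> 2; ...
        printf 'set node node%03d properties = twin%02d\n' "$i" "$pair"
        i=$((i + 1))
    done
}

# Assumption: 18 nodes, matching twin01..twin09 from the original post.
gen_twin_cmds 18
```

Review the output first; since qmgr reads directives from standard input, the
same lines can then be piped into it.  Note that "properties = ..." replaces
whatever properties a node already has.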

-- 
Garrick Staples, GNU/Linux HPCC SysAdmin
University of Southern California

Please avoid sending me Word or PowerPoint attachments.
See http://www.gnu.org/philosophy/no-word-attachments.html