[torqueusers] TORQUE 4.0 and hwloc
David Beer
dbeer at adaptivecomputing.com
Wed Apr 4 09:52:42 MDT 2012
On Wed, Apr 4, 2012 at 9:50 AM, Gus Correa <gus at ldeo.columbia.edu> wrote:
> Hi David
>
> Not to hijack Steven's thread ...
> ... but just taking a quick ride on it ... :)
>
> Does the hwloc 1.1 requirement apply only to Torque 4.0?
> How about the older Torque series [2.X.Y, 3.X.Y]
> that use cpuset?
> [I am in the process of building 2.4.16 with cpuset.]
>
>
This only applies to 4.0 and higher.
> Thank you,
> Gus Correa
>
> On 04/04/2012 10:59 AM, David Beer wrote:
> > Steven,
> >
> > I was supposed to add that note and I forgot - my mistake and thanks for
> > catching it. I have now added:
> >
> > *** For admins that use cpusets in any form ***
> > hwloc version 1.1 or greater is now required for building TORQUE with
> > cpusets, as pbs_mom now uses the hwloc API to create the cpusets
> > instead of creating them manually.
> >
> > to README.building_40.
> >
> > As for checking that the library exists: this does happen at
> > configure time, once the configure script determines that the user is
> > going to be using cpusets in any way, which a few different configure
> > options can trigger.
> >
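To see by hand what pkg-config reports for hwloc on a build host, something like the following could be used. It is only a sketch and not what the TORQUE configure script itself runs; the helper names are made up for illustration, and the only external call is `pkg-config --modversion hwloc`.

```python
import re
import shutil
import subprocess

REQUIRED = "1.1"  # minimum hwloc version stated in the README note above


def version_tuple(text):
    """Collect the numeric parts of a version string: '1.4.1' -> (1, 4, 1)."""
    return tuple(int(n) for n in re.findall(r"\d+", text))


def meets_requirement(found, required=REQUIRED):
    """True if the found version is at least the required one."""
    return version_tuple(found) >= version_tuple(required)


def hwloc_version_from_pkg_config():
    """Return the hwloc version pkg-config reports, or None if it cannot."""
    if shutil.which("pkg-config") is None:
        return None  # pkg-config itself is not installed
    result = subprocess.run(
        ["pkg-config", "--modversion", "hwloc"],
        capture_output=True,
        text=True,
    )
    return result.stdout.strip() if result.returncode == 0 else None


if __name__ == "__main__":
    found = hwloc_version_from_pkg_config()
    if found is None:
        print("hwloc is not visible to pkg-config on this host")
    else:
        print(f"hwloc {found} found, new enough: {meets_requirement(found)}")
```

If hwloc lives outside the default pkg-config search path, this will report it as missing even though the library is installed.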
> > David
> >
> > On Tue, Apr 3, 2012 at 8:15 PM, DuChene, StevenX A
> > <stevenx.a.duchene at intel.com> wrote:
> >
> > I installed the hwloc-1.4.1 and hwloc-devel-1.4.1 rpms on the server
> > where I am building torque-4.X, and in looking through the output
> > from the configure script during the build I do not see anywhere
> > that the existence of any hwloc stuff is checked. In fact, in
> > grepping through the output from the whole torque rpm build process
> > I do not see ANY mention of hwloc at all.
> >
> > I see compile time flags of HWLOC_CFLAGS and HWLOC_LIBS mentioned in
> > the --help output from configure, but according to the description
> > text these are just supposed to override the pkg-config results.
> > However, I do not see any evidence that the pkg-config system is
> > being queried at all for the existence of hwloc on the build server.
> >
> > Is there some step I am missing?
> >
> > I thought someone mentioned that there would be better documentation
> > of the hwloc business in the torque-4.0.1 release?
> >
> > If so where is it?
> >
> > --
> > Steven DuChene
> >
> > From: torqueusers-bounces at supercluster.org On Behalf Of David Beer
> > Sent: Monday, March 19, 2012 8:54 AM
> > To: Torque Users Mailing List
> > Subject: Re: [torqueusers] TORQUE 4.0 Officially Announced
> >
> > Steve,
> >
> > Hwloc is now required for running cpusets in TORQUE, and it helps
> > out a lot both in immediate use and in groundwork for future
> > features.
> >
> > The immediate benefit is a better cpuset, because hwloc gives you the
> > next core in the hardware topology instead of the next core by index.
> > For example: many eight core systems have processors 0, 2, 4, and 6
> > next to each other and processors 1, 3, 5, and 7 next to each other.
> > If you're running a pre-4.0 TORQUE, and you have two jobs on the node,
> > each with 4 cores, job 1 will have 0-3 and job 2 will have 4-7. In
> > TORQUE 4.0, job 1 will have 0, 2, 4, and 6, and job 2 will have 1, 3,
> > 5, and 7. This should help speed up processing times for jobs (NOTE:
> > only if you have this kind of system and a comparable job layout; I'm
> > not promising a general speed-up to everyone using cpusets). This
> > should also allow us to properly handle hyperthreading for anyone that
> > has it turned on and wishes to use it.
> >
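The eight core example above can be played through with a small simulation. This is a sketch only: it does not call hwloc, the function names are invented for illustration, and the even/odd grouping is simply the layout assumed in the example.

```python
def hand_out(ordered_cpus, job_sizes):
    """Give each job the next `size` CPUs from an already ordered list."""
    jobs, start = [], 0
    for size in job_sizes:
        jobs.append(ordered_cpus[start:start + size])
        start += size
    return jobs


def even_odd_group(cpu):
    """Assumed layout from the example: even indices sit together, odd ones together."""
    return cpu % 2


def allocate_by_index(cpus, job_sizes):
    """Pre-4.0 behaviour as described: walk the CPUs in plain index order."""
    return hand_out(sorted(cpus), job_sizes)


def allocate_by_topology(cpus, job_sizes, group_of=even_odd_group):
    """4.0 behaviour as described: keep CPUs that are next to each other together."""
    ordered = sorted(cpus, key=lambda cpu: (group_of(cpu), cpu))
    return hand_out(ordered, job_sizes)


if __name__ == "__main__":
    cpus = range(8)
    print(allocate_by_index(cpus, [4, 4]))     # [[0, 1, 2, 3], [4, 5, 6, 7]]
    print(allocate_by_topology(cpus, [4, 4]))  # [[0, 2, 4, 6], [1, 3, 5, 7]]
```

On a real node the grouping would come from the hardware topology rather than from a fixed even/odd rule.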
> > The last immediate feature applies if you have SMT (simultaneous
> > multi-threading) hardware. The mom config variable $use_smt was
> > added. By default, the use of SMT is enabled, but you can tell your
> > pbs_mom to ignore the SMT threads (not place them in the cpuset) by
> > adding
> >
> > $use_smt false
> >
> > to your mom config file.
> >
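A rough model of what $use_smt changes, as a sketch only. It assumes that with the setting off exactly one hardware thread per physical core ends up in the cpuset, and that it is the first one listed; the thread does not say which thread is kept, and the processor numbering below is hypothetical.

```python
def cpuset_members(cores, use_smt=True):
    """Return the processor indices that would be placed in the cpuset.

    `cores` is a list of lists; each inner list holds the hardware
    threads belonging to one physical core.
    """
    members = []
    for threads in cores:
        # Assumption: with SMT off, keep only the first thread of each core.
        members.extend(threads if use_smt else threads[:1])
    return sorted(members)


if __name__ == "__main__":
    # Hypothetical 4-core node with 2 hardware threads per core.
    cores = [[0, 4], [1, 5], [2, 6], [3, 7]]
    print(cpuset_members(cores, use_smt=True))   # [0, 1, 2, 3, 4, 5, 6, 7]
    print(cpuset_members(cores, use_smt=False))  # [0, 1, 2, 3]
```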
> > For the future, the hwloc threads make it really easy for us to
> > handle hardware-specific requests. One of the coming features for
> > TORQUE is to allow requests roughly similar to:
> >
> > socket=2:numa=2 --with-hyperthreads
> >
> > which would say to spread the job over 2 sockets, and across the 2
> > NUMA nodes on each socket. This is a feature we plan to add to
> > improve support for Magny-Cours and Opteron type processors that
> > have multiple sockets and/or multiple NUMA nodes on the processor
> > chip. Using hwloc means we don't have to parse system files
> > and map the indices to the sockets and/or NUMA nodes ourselves; we
> > can simply use easy hwloc functions like
> > hwloc_get_next_obj_inside_cpuset_by_type() that allow you to
> > just move on to the next physical core or virtual core, or skip to
> > the next socket or NUMA node as the case may be.
> >
> > David
> >
> > On Mon, Mar 19, 2012 at 8:47 AM, DuChene, StevenX A
> > <stevenx.a.duchene at intel.com> wrote:
> >
> > Also, a better (more complete) explanation of what features are
> > enabled when hwloc is used would be helpful.
> >
> > BTW, I built torque on my server without hwloc installed and then
> > installed the resulting mom packages on my nodes. The mom daemons in
> > that case did seem to start up just fine.
> > --
> > Steven DuChene
> >
> >
> > -----Original Message-----
> > From: torqueusers-bounces at supercluster.org On Behalf Of Craig West
> > Sent: Sunday, March 18, 2012 10:40 PM
> > To: Torque Users mailing list; Torque Developers mailing list
> > Subject: Re: [torqueusers] TORQUE 4.0 Officially Announced
> >
> >
> > Hi Steven,
> >
> > I have just begun testing Torque 4.0, as hwloc has been a long-awaited
> > feature for me.
> >
> > > It is unclear from this announcement text where hwloc has to be
> > > installed.
> > > Is it just on the server or on the nodes only?
> >
> > It needs to be available on the BUILD server and the nodes. I tried to
> > run pbs_mom on a node without the hwloc I had installed and it failed.
> >
> > Note: I am running hwloc 1.4 from a directory in /usr/local.
> > This was not automatically found by the TORQUE configure script, but you
> > can specify the location using HWLOC_CFLAGS & HWLOC_LIBS.
> > It embeds the locations that you specify in the pbs_mom (and other
> > files), but it seems you can set the LD_LIBRARY_PATH variable if hwloc is
> > not in the same location on the BUILD server as on the compute nodes.
> > For simplicity, installing it in the same location on both makes sense.
> >
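For a non-standard hwloc location, the two variables and the runtime path could be put together roughly like this. It is a sketch under assumptions: the prefix is hypothetical (the exact directory under /usr/local is not given above), the `include` and `lib` subdirectories are the usual layout but may differ (for example `lib64`), and `--enable-cpuset` is assumed to be the option that turns cpusets on, so check `./configure --help` first.

```python
import os
import posixpath
import shlex


def hwloc_configure_args(prefix):
    """Build the HWLOC_CFLAGS / HWLOC_LIBS assignments for ./configure."""
    include_dir = posixpath.join(prefix, "include")  # assumed layout
    lib_dir = posixpath.join(prefix, "lib")          # may be lib64 on some hosts
    return [
        f"HWLOC_CFLAGS=-I{include_dir}",
        f"HWLOC_LIBS=-L{lib_dir} -lhwloc",
    ]


def runtime_library_path(prefix, existing=""):
    """Prepend the hwloc lib directory to an existing LD_LIBRARY_PATH value."""
    lib_dir = posixpath.join(prefix, "lib")
    return f"{lib_dir}:{existing}" if existing else lib_dir


if __name__ == "__main__":
    prefix = "/usr/local/hwloc"  # hypothetical install prefix
    command = ["./configure", "--enable-cpuset"] + hwloc_configure_args(prefix)
    # Print the command line for the build host, quoted for the shell.
    print(" ".join(shlex.quote(part) for part in command))
    # Print the setting for compute nodes where hwloc lives elsewhere.
    current = os.environ.get("LD_LIBRARY_PATH", "")
    print("LD_LIBRARY_PATH=" + runtime_library_path(prefix, current))
```

The script only prints the command and the variable; it does not run configure or change the environment.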
> > > More documentation about this would be greatly appreciated.
> >
> > I agree, clearer and more detailed documentation would be useful.
> >
> > Cheers,
> > Craig.
> > _______________________________________________
> > torqueusers mailing list
> > torqueusers at supercluster.org <mailto:torqueusers at supercluster.org>
> > http://www.supercluster.org/mailman/listinfo/torqueusers
--
David Beer | Software Engineer
Adaptive Computing