[torqueusers] TORQUE 4.0 and hwloc

Gus Correa gus at ldeo.columbia.edu
Wed Apr 4 09:50:43 MDT 2012


Hi David

Not to hijack Steven's thread ...
... but just taking a quick ride on it ... :)

Does the hwloc 1.1 requirement apply only to Torque 4.0?
How about the older Torque series [2.X.Y, 3.X.Y]
that use cpuset?
[I am in the process of building 2.4.16 with cpuset.]

Thank you,
Gus Correa

On 04/04/2012 10:59 AM, David Beer wrote:
> Steven,
>
> I was supposed to add that note and I forgot - my mistake and thanks for
> catching it. I have now added:
>
> *** For admins that use cpusets in any form ***
> hwloc version 1.1 or greater is now required for building TORQUE with
> cpusets, as pbs_mom now uses the hwloc API to create the cpusets
> instead of creating them manually.
>
> to README.building_40.
>
> As far as checking for the existence of the library, this does happen at
> configure time once the configure script determines that the user is
> going to be using cpusets in any way, which a few different configure
> options can trigger.
>
> David
>
> On Tue, Apr 3, 2012 at 8:15 PM, DuChene, StevenX A
> <stevenx.a.duchene at intel.com> wrote:
>
>     I installed the hwloc-1.4.1 and hwloc-devel-1.4.1 rpms on the server
>     where I am building torque-4.X, but looking through the output
>     from the configure script during the build, I do not see anywhere
>     that the existence of any hwloc stuff is checked. In fact, in
>     grepping through the output from the whole torque rpm build process,
>     I do not see ANY mention of hwloc at all.
>
>     I see the compile-time variables HWLOC_CFLAGS and HWLOC_LIBS mentioned
>     in the --help output from configure, but according to the description
>     text these are just supposed to override the pkg-config results.
>     However, I do not see any evidence that the pkg-config system is
>     being queried at all for the existence of hwloc on the build server.
>
>     Is there some step I am missing?
>
>     I thought someone mentioned that there would be better documentation
>     of the hwloc business in the torque-4.0.1 release?
>
>     If so, where is it?
>
>     --
>     Steven DuChene
>
>     From: torqueusers-bounces at supercluster.org
>     [mailto:torqueusers-bounces at supercluster.org] On Behalf Of David Beer
>     Sent: Monday, March 19, 2012 8:54 AM
>     To: Torque Users Mailing List
>     Subject: Re: [torqueusers] TORQUE 4.0 Officially Announced
>
>     Steve,
>
>     Hwloc is now required for running cpusets in TORQUE, and it helps
>     out a lot both in immediate use and in groundwork for future
>     features.
>
>     Immediately, hwloc gives you a better cpuset because it gives you the
>     next physical core instead of the next core by index. For example, many
>     eight-core systems have processors 0, 2, 4, and 6 next to each other and
>     processors 1, 3, 5, and 7 next to each other. If you're running a
>     pre-4.0 TORQUE and you have two jobs on the node, each with 4
>     cores, job 1 will have 0-3 and job 2 will have 4-7. In TORQUE 4.0,
>     job 1 will have 0, 2, 4, and 6, and job 2 will have 1, 3, 5, and 7.
>     This should help speed up processing times for jobs (NOTE: only if
>     you have this kind of system and a comparable job layout; I'm not
>     promising a general speed-up to everyone using cpusets). This should
>     also allow us to properly handle hyperthreading for anyone who has
>     it turned on and wishes to use it.
>
>     The last immediate feature applies if you have SMT (simultaneous
>     multi-threading) hardware. The mom config variable $use_smt was
>     added. By default, the use of SMT is enabled, but you can tell your
>     pbs_mom to ignore the hardware threads (not place them in the cpuset)
>     by adding
>
>     $use_smt false
>
>     to your mom config file.
>
>     For the future, the hwloc threads make it really easy for us to
>     handle hardware-specific requests. One of the coming features for
>     TORQUE is to allow requests roughly similar to:
>
>     socket=2:numa=2 --with-hyperthreads
>
>     which would say to spread the job over 2 sockets, and across the 2
>     NUMA nodes on each socket. This is a feature we plan to add to
>     improve support for Magny-Cours and Opteron type processors that
>     have multiple sockets and/or multiple NUMA nodes on the processor
>     chip. Using hwloc means we don't have to parse system files
>     and map the indices to the sockets and/or NUMA nodes ourselves; we
>     can simply use hwloc functions like
>     hwloc_get_next_obj_inside_cpuset_by_type() that let us
>     just move on to the next physical core or virtual core, or skip to
>     the next socket or NUMA node as the case may be.
>
>     David
>
>     On Mon, Mar 19, 2012 at 8:47 AM, DuChene, StevenX A
>     <stevenx.a.duchene at intel.com> wrote:
>
>     Also, a better (more complete) explanation of what features are
>     enabled when hwloc is used would be helpful.
>
>     BTW, I built torque on my server without hwloc installed and then
>     installed the resulting mom packages on my nodes. The mom daemons in
>     that case did seem to start up just fine.
>     --
>     Steven DuChene
>
>
>     -----Original Message-----
>     From: torqueusers-bounces at supercluster.org
>     [mailto:torqueusers-bounces at supercluster.org] On Behalf Of Craig West
>     Sent: Sunday, March 18, 2012 10:40 PM
>     To: Torque Users mailing list; Torque Developers mailing list
>     Subject: Re: [torqueusers] TORQUE 4.0 Officially Announced
>
>
>     Hi Steven,
>
>     I have just begun testing Torque 4.0, as hwloc has been a long-awaited
>     feature for me.
>
>      > It is unclear from this announcement text where hwloc has to be
>      > installed.
>      > Is it just on the server or on the nodes only?
>
>     It needs to be available on the BUILD server and the nodes. I tried to
>     run pbs_mom on a node without the hwloc I had installed and it failed.
>
>     Note: I am running hwloc 1.4 from a directory in /usr/local.
>     This was not automatically found by the TORQUE configure script, but you
>     can specify the location using HWLOC_CFLAGS & HWLOC_LIBS.
>     The build embeds the locations that you specify in pbs_mom (and other
>     files), but it seems you can set the LD_LIBRARY_PATH variable if hwloc
>     is not in the same location on the compute nodes as on the BUILD server.
>     For simplicity, installing it in the same location on both makes sense.
>
>      > More documentation about this would be greatly appreciated.
>
>     I agree, clearer and more detailed documentation would be useful.
>
>     Cheers,
>     Craig.
>     _______________________________________________
>     torqueusers mailing list
>     torqueusers at supercluster.org
>     http://www.supercluster.org/mailman/listinfo/torqueusers
>
>     --
>     David Beer | Software Engineer
>     Adaptive Computing
>
>
> --
> David Beer | Software Engineer
> Adaptive Computing
>
>
>
> _______________________________________________
> torqueusers mailing list
> torqueusers at supercluster.org
> http://www.supercluster.org/mailman/listinfo/torqueusers
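
The core placement example in David's March 19 message (index order before 4.0, physical order in 4.0) can be sketched in a few lines of Python. This is only an illustration, not TORQUE or hwloc code: the function names are made up, and it assumes that "next to each other" means "on the same socket" for the eight-core layout he describes.

```python
# Hypothetical layout from the example: logical processors grouped by the
# socket they sit on. Even indices share one socket, odd indices the other.
SOCKETS = [[0, 2, 4, 6], [1, 3, 5, 7]]


def assign_by_index(num_jobs, cores_per_job):
    """Pre-4.0 style: hand out processors in plain index order."""
    all_cores = sorted(core for socket in SOCKETS for core in socket)
    return [
        all_cores[job * cores_per_job:(job + 1) * cores_per_job]
        for job in range(num_jobs)
    ]


def assign_by_topology(num_jobs, cores_per_job):
    """4.0 style: walk processors in physical order, socket by socket."""
    physical_order = [core for socket in SOCKETS for core in socket]
    return [
        physical_order[job * cores_per_job:(job + 1) * cores_per_job]
        for job in range(num_jobs)
    ]


def sockets_touched(cores):
    """Count how many sockets a job's cpuset spans."""
    return sum(1 for socket in SOCKETS if set(socket) & set(cores))


if __name__ == "__main__":
    for name, assign in (("index", assign_by_index),
                         ("topology", assign_by_topology)):
        for job, cores in enumerate(assign(2, 4), start=1):
            print(name, "job", job, cores, "sockets:", sockets_touched(cores))
```

With two 4-core jobs, the index-order assignment spreads each job across both sockets, while the topology-order assignment keeps each job on one socket, which is the case where David expects a speed-up.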

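The effect of the $use_smt setting described above can be pictured with a similar sketch. The node layout and numbering here are assumptions, and so is the reading that "ignore" means keeping one hardware thread per core; the thread does not spell out which thread pbs_mom keeps.

```python
# Hypothetical node: 4 physical cores, each exposing 2 hardware threads.
# The numbering (core i owns processors i and i + 4) is an assumption.
CORES = {0: [0, 4], 1: [1, 5], 2: [2, 6], 3: [3, 7]}


def cpuset_members(use_smt=True):
    """Return the processor indices a cpuset would contain.

    use_smt=True  -> every hardware thread is eligible (the stated default).
    use_smt=False -> only the first hardware thread of each core is kept
                     (assumed interpretation of "$use_smt false").
    """
    members = []
    for core_id in sorted(CORES):
        threads = CORES[core_id]
        members.extend(threads if use_smt else threads[:1])
    return sorted(members)


if __name__ == "__main__":
    print("use_smt true: ", cpuset_members(True))
    print("use_smt false:", cpuset_members(False))
```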

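For an hwloc installed outside the default search path, as in Craig's note about HWLOC_CFLAGS and HWLOC_LIBS, the two values usually follow the ordinary compiler-flag pattern. The sketch below only builds the strings. The prefix is hypothetical, the `include` and `lib` subdirectories are assumptions (some systems use `lib64`), and no configure options are filled in because the thread does not name the ones that trigger the hwloc check.

```python
import posixpath


def hwloc_build_env(prefix):
    """Build HWLOC_CFLAGS / HWLOC_LIBS values for an hwloc under `prefix`.

    Assumes headers live in <prefix>/include and libraries in <prefix>/lib.
    """
    include_dir = posixpath.join(prefix, "include")
    lib_dir = posixpath.join(prefix, "lib")
    return {
        "HWLOC_CFLAGS": "-I" + include_dir,
        "HWLOC_LIBS": "-L" + lib_dir + " -lhwloc",
    }


def configure_command(prefix, configure_args=()):
    """Return a shell command line that passes both variables to configure."""
    env = hwloc_build_env(prefix)
    assignments = ['{}="{}"'.format(k, v) for k, v in sorted(env.items())]
    return " ".join(assignments + ["./configure"] + list(configure_args))


if __name__ == "__main__":
    # "/usr/local/hwloc" is a placeholder, not the path used in the thread.
    print(configure_command("/usr/local/hwloc"))
```

If the compute nodes keep hwloc in a different directory than the build server, Craig's LD_LIBRARY_PATH suggestion applies at run time; the variables above only affect the build.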
