[torqueusers] RE: [Ganglia-developers] ANNOUNCE: Public release of Ganglia Job Monarch v0.1.0

Bernard Li bli at bcgsc.ca
Fri Mar 10 22:45:13 MST 2006

Hi Ramon:
Have you thought of supporting other workload management systems such as Sun Grid Engine (SGE)?


From: ganglia-developers-admin at lists.sourceforge.net on behalf of Ramon Bastiaans
Sent: Fri 10/03/2006 09:33
To: torqueusers at supercluster.org; Ganglia General
Cc: Ganglia Developers
Subject: [Ganglia-developers] ANNOUNCE: Public release of Ganglia Job Monarch v0.1.0

This is the first public open source release of:

    "Ganglia Job Monarch", the Job Monitoring and Archiving tool, an
add-on to Ganglia.


This release is:  ganglia_jobmonarch-0.1.0

It is available here:  

See the INSTALL file on how to set it up.


    Job Monarch is a set of tools to monitor and optionally archive
(batch)job information.

    It is an add-on for the Ganglia monitoring system and plugs in to an
existing Ganglia setup.
    To view an operational setup with Job Monarch, have a look here:

    Job Monarch stands for 'Job Monitoring and Archiving' tool and
consists of three (3) components:

    * jobmond

        The Job Monitoring Daemon.
        Gathers PBS/Torque batch statistics on jobs/nodes and submits
them into
        Ganglia's XML stream.

        Through this daemon, users are able to view the PBS/Torque batch
system and the
        jobs/nodes that are in it (whether running or queued).

    * jobarchived (optional)

        The Job Archiving Daemon.

        Listens to Ganglia's XML stream and archives the job and node
statistics.
        It stores the job statistics in a PostgreSQL database and the
node statistics
        in RRD files.
        Through this daemon, users are able to look up an old/finished job
        and view all its statistics.

        Optional: You can choose to run this daemon if your users have
a use for it.
        It can be a heavy application to run, and not everyone may
have a need for it.

        - Multithreaded:    Will not miss any data regardless of (slow)
        - Staged writing:    Spread load over bigger time periods
        - High precision RRDs:    Allow for zooming in on old periods with
high precision
        - Time period RRDs:    Allow for a smaller number of files while
still keeping the advantage of low disk usage
    * web

        The Job Monarch web interface.

        This interfaces with the jobmond data and (optionally) the
jobarchived and presents the
        data and graphs.

        It does this in a layout/setup similar to Ganglia itself, so the
navigation and usage are intuitive.

        - Graphical usage:    Displays graphical cluster overview so you
can see the cluster (job) state
                    in one view/image and additional pie chart with
relevant information on your
                    current view
        - Filters:        Ability to filter output to limit information
displayed (useful for those
                    clusters with 500+ jobs). This also filters the
graphical overview image output
                    and pie chart so you only see the data relevant to
the filter
        - Archive:        When enabling jobarchived, users can go back
as far as recorded in the database
                    or archived RRDs to find out what happened to a
crashed or old job
        - Zoom ability:        Users can zoom into a time period as small
as the smallest grain of the RRDs
                    (typically up to 10 seconds) when a jobarchived is

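As an illustration of how a job statistic could be pushed into Ganglia, here is a minimal sketch that builds a command line for Ganglia's `gmetric` tool. This is not taken from jobmond and may differ from how it actually submits data; the metric name `EXAMPLE_JOB_1234`, the value layout and the helper `build_gmetric_command` are hypothetical. The sketch only builds the argument list and does not execute it.

```python
import shlex


def build_gmetric_command(name, value, metric_type="string", units="", dmax=0):
    """Build (but do not run) a gmetric command line for one metric.

    The name/value layout is a made-up example, not jobmond's real format.
    """
    cmd = ["gmetric", "--name", name, "--value", str(value), "--type", metric_type]
    if units:
        cmd += ["--units", units]
    if dmax:
        # dmax: lifetime of the metric in seconds, so data about
        # finished jobs eventually expires from the XML stream.
        cmd += ["--dmax", str(dmax)]
    return cmd


# Hypothetical job record encoded as a single string metric.
cmd = build_gmetric_command(
    "EXAMPLE_JOB_1234",
    "owner=alice status=R nodes=node01",
    dmax=60,
)
print(shlex.join(cmd))
```

To actually submit the metric, the list could be passed to `subprocess.run(cmd, check=True)` on a host where `gmetric` is installed and configured.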

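Both jobarchived and the web interface consume Ganglia's XML stream. The following is a minimal sketch of reading and parsing such a stream, assuming a gmond that serves its XML dump on TCP port 8649 (the usual default) at `localhost`. The function names and the sample document are illustrative only and are not part of Job Monarch.

```python
import socket
import xml.etree.ElementTree as ET


def read_ganglia_xml(host="localhost", port=8649, timeout=5.0):
    """Connect to a Ganglia XML port and return the full dump as bytes."""
    chunks = []
    with socket.create_connection((host, port), timeout=timeout) as sock:
        while True:
            data = sock.recv(65536)
            if not data:  # the daemon closes the connection after the dump
                break
            chunks.append(data)
    return b"".join(chunks)


def metrics_by_host(xml_data):
    """Map each HOST name to a dict of its METRIC names and values."""
    root = ET.fromstring(xml_data)
    result = {}
    for host in root.iter("HOST"):
        result[host.get("NAME")] = {
            metric.get("NAME"): metric.get("VAL") for metric in host.iter("METRIC")
        }
    return result


# Small hand-written sample in the shape of a Ganglia XML dump.
SAMPLE = """<GANGLIA_XML VERSION="3.0.0" SOURCE="gmond">
<CLUSTER NAME="example">
<HOST NAME="node01" IP="10.0.0.1">
<METRIC NAME="load_one" VAL="0.42" TYPE="float" UNITS=""/>
</HOST>
</CLUSTER>
</GANGLIA_XML>"""

print(metrics_by_host(SAMPLE))
```

Against a live daemon, `metrics_by_host(read_ganglia_xml())` would return the same structure for every host currently reported.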
You can view an operational Ganglia Job Monarch setup here:


Any information/suggestions/hatemail/bugreports/whatever to:

    Ramon Bastiaans
    <bastiaans ( a t ) sara ( d o t ) nl>

