[torqueusers] ANNOUNCE: Public release of Ganglia Job Monarch v0.1.0
Bas van der Vlies
basv at sara.nl
Sat Mar 11 06:30:40 MST 2006
This software package replaces ganglia_pbs
Begin forwarded message:
> From: Ramon Bastiaans <bastiaans at sara.nl>
> Date: March 10, 2006 6:33:10 PM GMT+01:00
> This is the first public open source release of
> "Ganglia Job Monarch", the Job Monitoring and Archiving tool, which
> is an add-on to Ganglia.
> This release is: ganglia_jobmonarch-0.1.0
> It is available here: ftp://ftp.sara.nl/pub/outgoing/
> See the INSTALL file on how to set it up.
> Job Monarch is a set of tools to monitor and optionally archive
> (batch) job information.
> It is an add-on for the Ganglia monitoring system and plugs in to
> an existing Ganglia setup.
> To view an operational setup with Job Monarch, have a look here:
> Job Monarch stands for 'Job Monitoring and Archiving' tool and
> consists of three (3) components:
> * jobmond
> The Job Monitoring Daemon.
> Gathers PBS/Torque batch statistics on jobs/nodes and submits
> them into Ganglia's XML stream.
> Through this daemon, users are able to view the PBS/Torque batch
> system and the jobs/nodes that are in it (whether running or
> queued).
> * jobarchived (optional)
> The Job Archiving Daemon.
> Listens to Ganglia's XML stream and archives the job and
> node statistics.
> It stores the job statistics in a PostgreSQL database and
> the node statistics in RRD files.
> Through this daemon, users are able to look up an old/finished
> job and view all its statistics.
> Optional: use this daemon only if your users have a need for
> it, as it can be a heavy application to run.
> - Multithreaded: Will not miss any data regardless of
> (slow) storage
> - Staged writing: Spreads load over longer time periods
> - High precision RRDs: Allow zooming in on old periods
> with high precision
> - Time-period RRDs: Allow for a smaller number of files
> while still keeping the advantage of small disk space
> * web
> The Job Monarch web interface.
> It interfaces with the jobmond data and (optionally) the
> jobarchived data, and presents the data and graphs.
> It uses a layout/setup similar to Ganglia itself, so
> navigation and usage are intuitive.
> - Graphical usage: Displays a graphical cluster overview so
> you can see the cluster (job) state in one view/image,
> plus an additional pie chart with information relevant
> to your current view
> - Filters: Ability to filter output to limit the
> information displayed (useful for clusters with
> 500+ jobs). The filter also applies to the graphical
> overview images and the pie chart, so you only see
> the data relevant to the filter
> - Archive: When jobarchived is enabled, users can go
> back as far as recorded in the database or archived
> RRDs to find out what happened to a crashed or old job
> - Zoom ability: Users can zoom into a time period as
> small as the smallest grain of the RRDs (typically
> down to 10 seconds) when a jobarchived is present
> You can view an operational Ganglia Job Monarch setup here: http://
> Please send any information/suggestions/hate mail/bug reports/whatever to:
> Ramon Bastiaans
> <bastiaans ( a t ) sara ( d o t ) nl>
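As an illustration of the "Ganglia XML stream" that jobmond submits into and jobarchived listens to, here is a minimal Python sketch of reading that stream and picking out job-related metrics. It is not Job Monarch's actual code. It assumes a gmond that dumps its XML on a TCP port (8649 is the usual gmond default) and the standard HOST/METRIC elements with NAME and VAL attributes. The metric prefix "EXAMPLE-JOB-", the sample value format, and both function names are hypothetical and chosen only for this example.

```python
import socket
import xml.etree.ElementTree as ET

# Hand-written sample in the shape of gmond's XML output.
# The "EXAMPLE-JOB-" metric name and its value are hypothetical.
SAMPLE_XML = """<GANGLIA_XML VERSION="3.0.1" SOURCE="gmond">
<CLUSTER NAME="example" LOCALTIME="0" OWNER="" LATLONG="" URL="">
<HOST NAME="node001" IP="10.0.0.1" REPORTED="0" TN="0" TMAX="20" DMAX="0">
<METRIC NAME="load_one" VAL="0.50" TYPE="float" UNITS="" SOURCE="gmond"/>
<METRIC NAME="EXAMPLE-JOB-1234" VAL="owner=alice status=R" TYPE="string" UNITS="" SOURCE="gmetric"/>
</HOST>
</CLUSTER>
</GANGLIA_XML>"""


def read_gmond_xml(host="localhost", port=8649):
    """Read the full XML dump from a gmond TCP port and return it as bytes.

    gmond writes its XML and then closes the connection, so we read
    until recv() returns an empty result.
    """
    chunks = []
    with socket.create_connection((host, port), timeout=10) as sock:
        while True:
            data = sock.recv(4096)
            if not data:
                break
            chunks.append(data)
    return b"".join(chunks)


def extract_job_metrics(xml_data, prefix="EXAMPLE-JOB-"):
    """Return {host name: {metric name: value}} for metrics matching prefix.

    xml_data may be str or bytes. The prefix is a hypothetical naming
    convention, not necessarily the one Job Monarch uses.
    """
    root = ET.fromstring(xml_data)
    found = {}
    for host in root.iter("HOST"):
        for metric in host.iter("METRIC"):
            name = metric.get("NAME", "")
            if name.startswith(prefix):
                found.setdefault(host.get("NAME"), {})[name] = metric.get("VAL")
    return found


if __name__ == "__main__":
    # Parse the built-in sample; swap in read_gmond_xml() for a live gmond.
    print(extract_job_metrics(SAMPLE_XML))
```

Run as a script, this parses the built-in sample only. Against a live setup, the result of read_gmond_xml() can be passed straight to extract_job_metrics().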
Bas van der Vlies
basv at sara.nl