[torqueusers] Fwd: qtop: yet another tool to struggle with torque and PBS family systems

Fotis Georgatos fotis at cern.ch
Fri Oct 15 11:56:42 MDT 2010


Hi,

thank you to all the people who have provided feedback since the first release
of qtop. A new release is now out, with an rpm package available,
and several pending requests have been addressed:

* fixed visual output for sites with 1000s of cores/jobs
* highlighting of queues in the header (coloring of queue names, like jobs)
* added command-line options for most of the tunable parameters
* Running and Queued jobs are now reported separately
* a default /etc/qtop.conf is provided
* an example is provided on how to convert the output of qtop into a colored web page
   (and if you see funny output rendering on your smartphone, well, that's a WebKit bug ;-)
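
For readers curious what such a conversion involves, here is a minimal sketch in Python. It is not the example shipped with qtop; it only assumes that the colored output uses the standard ANSI SGR foreground codes 30-37 plus reset (0). The function name ansi_to_html and the color table are made up for illustration.

```python
import html
import re

# Assumption: only the eight basic ANSI foreground colors are in use.
ANSI_COLORS = {30: "black", 31: "red", 32: "green", 33: "olive",
               34: "blue", 35: "purple", 36: "teal", 37: "gray"}
SGR = re.compile(r"\x1b\[([0-9;]*)m")

def ansi_to_html(text):
    """Turn ANSI-colored text into a <pre> block with colored <span>s."""
    out = []
    open_span = False
    pos = 0
    for match in SGR.finditer(text):
        # Copy the plain text before this escape sequence, HTML-escaped.
        out.append(html.escape(text[pos:match.start()]))
        pos = match.end()
        codes = [int(c) for c in match.group(1).split(";") if c] or [0]
        for code in codes:
            if code == 0 and open_span:
                out.append("</span>")
                open_span = False
            elif code in ANSI_COLORS:
                if open_span:
                    out.append("</span>")
                out.append('<span style="color:%s">' % ANSI_COLORS[code])
                open_span = True
            # Any other code (bold, background, ...) is ignored here.
    out.append(html.escape(text[pos:]))
    if open_span:
        out.append("</span>")
    return "<pre>" + "".join(out) + "</pre>"
```

Feeding it the captured terminal output and writing the result into an HTML file would give a static colored page; bold and background colors would need extra handling.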

As before, point yourselves to this page:
https://twiki.cscs.ch/twiki/bin/view/DECH/QTOP

cheers,
Fotis

ps.
I'm using the list as a communication channel out of convenience; thanks for that.

-------- Original Message --------
Subject: qtop: yet another tool to struggle with torque and PBS family systems
Date: Wed, 01 Sep 2010 01:20:29 +0200
From: Fotis Georgatos <fotis at cern.ch>
Organization: CERN
To: torqueusers at supercluster.org
CC: Fotis Georgatos <fotis at cern.ch>


Hi,

it is not uncommon for shepherds of PBS-family based clusters to wander around
in the system, trying to understand where users' jobs and site resources graze.
Or, you may just try to understand if you are being hit by something like a
bug. (*)

Fortunately, you are not alone in this world, and others have the same troubles
as you do ;-).
Here is a script I wrote to try to get better control over any torque or pbs
instance:
https://twiki.cscs.ch/twiki/bin/view/DECH/QTOP
It provides a brief summary of your PBS and node status, along with a job matrix.
In case you ask, the CPUids are the ones reported by the command pbsnodes -a,
so you may wish to check that its output looks reasonable before trying qtop.
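
One way to do that check, sketched in Python under loud assumptions: the SAMPLE text below is invented and only imitates the usual layout (a node name, then indented "key = value" lines, with jobs listed as coreid/jobid); the helper names are hypothetical and this is not how qtop itself reads the data.

```python
# Invented sample imitating pbsnodes-style output; not real cluster data.
SAMPLE = """\
node01
     state = free
     np = 4
     jobs = 0/101.server, 1/102.server

node02
     state = down
     np = 4
"""

def parse_pbsnodes(text):
    """Return {node_name: {attribute: value}} from pbsnodes-style text."""
    nodes = {}
    current = None
    for line in text.splitlines():
        if not line.strip():
            current = None          # a blank line ends the node record
            continue
        if not line[0].isspace():
            current = nodes.setdefault(line.strip(), {})
        elif current is not None and " = " in line:
            key, value = line.strip().split(" = ", 1)
            current[key] = value
    return nodes

def core_ids(attrs):
    """Core ids taken from a 'jobs = 0/101.server, ...' attribute."""
    jobs = attrs.get("jobs", "")
    return [int(item.split("/", 1)[0]) for item in jobs.split(",") if "/" in item]

def suspicious(nodes):
    """Names of nodes whose core ids fall outside 0..np-1."""
    bad = []
    for name, attrs in nodes.items():
        np_count = int(attrs.get("np", 0))
        if any(cid < 0 or cid >= np_count for cid in core_ids(attrs)):
            bad.append(name)
    return bad
```

On a real cluster the text could come from subprocess.check_output(["pbsnodes", "-a"], text=True); an empty list from suspicious() only means this one check passed, nothing more.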

It can be particularly useful if you assign colors to your scheduler's policy
groups, so that you can visually check whether your policy is honored; be
prepared for surprises.
In general, I hope it helps you keep the entropy of a torque system low
by giving a fast overview of what is going on.

enjoy,
Fotis

(*)
http://www.supercluster.org/pipermail/torqueusers/2010-August/011198.html


-- 
echo "sysadmin know better bash than english" | sed s/min/mins/ \
	| sed 's/better bash/bash better/' # Yelling in a CERN forum

