[Mauiusers] qtop: yet another tool to struggle with torque and PBS family systems

Fotis Georgatos fotis at cern.ch
Sat Oct 16 02:52:40 MDT 2010


Hi,

since maui was THE reason I started writing the qtop shell scripts
some years ago, I am forwarding the message appended below here, too.

btw.
If any of you is aware of any kind of hard scheduling guarantees,
with maui or otherwise, it would be very interesting to hear your feedback.
What I am specifically looking for are computer science clauses of the type:
"given such and such algorithm in the scheduler, max queue delay is so much"
i.e. the scheduling building block for Urgent Computing applications.
If any of you has hands-on experience with the matter, please get in touch.
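
To make the kind of clause I mean concrete, here is a toy sketch in Python.
It assumes a much simpler scheduler than maui: strict first-come-first-served,
no backfill, no priorities or reservations, every job using exactly one slot,
and a walltime limit W that is enforced on every job. Under those assumptions
a job with q jobs ahead of it in the queue, on m slots, starts no later than
(q div m + 1) * W. The function names and the numbers are made up for
illustration; nothing here describes what maui actually guarantees.

```python
import heapq


def fcfs_wait_bound(position, slots, walltime_limit):
    """Upper bound on the start time of the job at 0-based queue `position`.

    Assumes strict FCFS, one slot per job, and every job (running or queued)
    finishing within `walltime_limit` of "now" or of its own start.
    """
    return (position // slots + 1) * walltime_limit


def simulate_fcfs(remaining, queued_runtimes):
    """Start times of the queued jobs under strict FCFS.

    `remaining` holds the time left on each slot's running job (0 = free slot);
    `queued_runtimes` holds the actual runtimes of the queued jobs, in order.
    """
    free_at = list(remaining)
    heapq.heapify(free_at)
    starts = []
    for runtime in queued_runtimes:
        start = heapq.heappop(free_at)      # earliest slot to become free
        starts.append(start)
        heapq.heappush(free_at, start + runtime)
    return starts


# Toy numbers: 2 slots, walltime limit of 10 time units.
WALLTIME = 10
running = [10, 4]
queue = [10, 10, 3, 10]

starts = simulate_fcfs(running, queue)
bounds = [fcfs_wait_bound(q, len(running), WALLTIME) for q in range(len(queue))]
for q, (start, bound) in enumerate(zip(starts, bounds)):
    print(f"position {q}: starts at {start}, bound {bound}")
```

The simulation is only a sanity check of the bound on one example; what I am
after is the same sort of statement for realistic policies.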

thank you in advance for any answer,

Fotis

-------- Original Message --------
Subject: Fwd: qtop: yet another tool to struggle with torque and PBS family systems
Date: Fri, 15 Oct 2010 20:56:42 +0300
From: Fotis Georgatos <fotis at cern.ch>
Organization: CERN
To: Torque Users Mailing List <torqueusers at supercluster.org>


Hi,

thank you to all the people who have provided feedback since the first release
of qtop; a new release is now out, with an rpm package available,
and multiple pending requests have been addressed:

* fixed visual output for sites with 1000s of cores/jobs
* highlighting of queues in the header (coloring of queue names, like jobs)
* added command-line options for most of the tunable parameters
* Running and Queued jobs are now reported separately
* a default /etc/qtop.conf is provided
* an example is provided on how to convert the output of qtop into a colored
  web page (and if you see funny output rendering on your smartphone, well,
  that's a WebKit bug ;-)

As before, head over to this page:
https://twiki.cscs.ch/twiki/bin/view/DECH/QTOP

cheers,
Fotis

ps.
I'm using the list communication channel out of convenience, and thanks for that.

-------- Original Message --------
Subject: qtop: yet another tool to struggle with torque and PBS family systems
Date: Wed, 01 Sep 2010 01:20:29 +0200
From: Fotis Georgatos <fotis at cern.ch>
Organization: CERN
To: torqueusers at supercluster.org
CC: Fotis Georgatos <fotis at cern.ch>


Hi,

it is not uncommon for shepherds of PBS-family based clusters to wander around
in the system, trying to understand where users' jobs and site resources graze.
Or, you may just try to understand if you are being hit by something like a
bug. (*)

Fortunately, you are not alone in this world, and others have the same
troubles as you do ;-).
Here is a script I wrote to try to get better control over any torque or pbs
instance:
https://twiki.cscs.ch/twiki/bin/view/DECH/QTOP
It provides a brief summary of your PBS and node status, along with a job
matrix. In case you ask, the CPU ids are the ones reported by the command
pbsnodes -a, so you may wish to check that its output looks reasonable
before trying qtop.
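
As a rough illustration of how a job matrix can be derived from that output,
here is a minimal Python sketch. It is not qtop's code. The sample text is
made up and assumes the usual pbsnodes -a layout: node name at column zero,
indented "key = value" lines, and a jobs line of comma-separated
"cpuid/jobid" entries. The names parse_pbsnodes and job_matrix are
hypothetical.

```python
# Made-up sample in the assumed pbsnodes -a layout.
SAMPLE = """\
node01
     state = job-exclusive
     np = 2
     ntype = cluster
     jobs = 0/101.server, 1/102.server

node02
     state = free
     np = 2
     ntype = cluster
     jobs = 0/103.server

node03
     state = down
     np = 2
     ntype = cluster
"""


def parse_pbsnodes(text):
    """Return {node_name: {"state": str, "np": int, "jobs": {cpuid: jobid}}}."""
    nodes = {}
    current = None
    for line in text.splitlines():
        if not line.strip():
            continue                      # blank line between node records
        if not line[0].isspace():
            # An unindented line starts a new node record.
            current = {"state": "unknown", "np": 0, "jobs": {}}
            nodes[line.strip()] = current
        elif current is not None and " = " in line:
            key, value = line.strip().split(" = ", 1)
            if key == "state":
                current["state"] = value
            elif key == "np":
                current["np"] = int(value)
            elif key == "jobs":
                # Assumes plain "cpuid/jobid" entries (no cpuid ranges).
                for entry in value.split(","):
                    cpu, job_id = entry.strip().split("/", 1)
                    current["jobs"][int(cpu)] = job_id
    return nodes


def job_matrix(nodes):
    """One row per node: (name, state, [jobid or "-" for each CPU id])."""
    rows = []
    for name in sorted(nodes):
        node = nodes[name]
        cells = [node["jobs"].get(cpu, "-") for cpu in range(node["np"])]
        rows.append((name, node["state"], cells))
    return rows


for name, state, cells in job_matrix(parse_pbsnodes(SAMPLE)):
    print(f"{name:8} {state:14} {' '.join(cells)}")
```

If the matrix built this way looks wrong for your site, the pbsnodes -a output
itself is the first thing to inspect.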

It can be particularly useful if you assign colors to your scheduler's policy
groups, so that you can visually check if your policy is honored; be prepared
for surprises.
Generally, I hope it can help you to keep the entropy of a torque system low,
by giving a fast overview of what is going on.
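
To show the idea behind the coloring, here is a small Python sketch. It is
not how qtop or /etc/qtop.conf does it: the group names, the color table and
the function names are all hypothetical, and the job list is invented. It
wraps each job label in an ANSI color chosen by group, and computes the share
of running cores per group so that it can be compared with the intended
policy.

```python
from collections import Counter

# Standard ANSI SGR foreground color escape sequences.
ANSI = {"red": "\033[31m", "green": "\033[32m", "reset": "\033[0m"}

# Hypothetical mapping from policy group to color.
GROUP_COLORS = {"atlas": "red", "cms": "green"}


def colorize(label, group):
    """Wrap `label` in the group's color; unknown groups stay uncolored."""
    color = GROUP_COLORS.get(group)
    if color is None:
        return label
    return f"{ANSI[color]}{label}{ANSI['reset']}"


def core_share(running):
    """Fraction of running single-core jobs per group."""
    counts = Counter(group for _job_id, group in running)
    total = sum(counts.values())
    return {group: n / total for group, n in counts.items()}


# Invented example: (job id, policy group) for each occupied core.
running = [("101", "atlas"), ("102", "atlas"), ("103", "cms"), ("104", "atlas")]

print(" ".join(colorize(job_id, group) for job_id, group in running))
for group, share in sorted(core_share(running).items()):
    print(f"{group}: {share:.0%} of running cores")
```

A large gap between these shares and the configured targets is the kind of
surprise meant above.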

enjoy,
Fotis

(*)
http://www.supercluster.org/pipermail/torqueusers/2010-August/011198.html


