[torqueusers] qtop: yet another tool to struggle with torque and PBS family systems

Fotis Georgatos fotis at cern.ch
Tue Aug 31 17:20:29 MDT 2010


Hi,

it is not uncommon for shepherds of PBS-family clusters to wander around
the system, trying to understand where users' jobs and site resources graze.
Or you may simply be trying to work out whether you are being hit by
something like a bug. (*)

Fortunately, you are not alone in this world, and others have the same
troubles as you do ;-).
Here is a script I wrote to try to get better control over any torque or pbs
instance:
https://twiki.cscs.ch/twiki/bin/view/DECH/QTOP
It provides a brief summary of your PBS and node status, along with a job matrix.
In case you ask, the CPU ids are the ones reported by the command pbsnodes -a,
so you may wish to check that its output seems reasonable before trying qtop.
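
To illustrate what "CPU ids as reported by pbsnodes -a" means in practice, here is a minimal Python sketch that reads text in that style and builds a small per-node job matrix. It is not qtop's own code. The sample text, node names and job ids are made up, and the sketch assumes the older "slot/jobid" form of the jobs line (for example "0/1001.server"); other versions may print it differently, so compare against your own pbsnodes -a output first.

```python
# Hypothetical sample in the style of `pbsnodes -a` output.
# Node names, job ids and the exact "jobs = slot/jobid" layout are assumptions.
SAMPLE = """\
wn001
     state = job-exclusive
     np = 2
     jobs = 0/1001.server, 1/1002.server

wn002
     state = free
     np = 2
     jobs = 1/1003.server

wn003
     state = down
     np = 2
"""


def parse_pbsnodes(text):
    """Return {node: {"state": str, "np": int, "jobs": {cpu_id: job_id}}}."""
    nodes = {}
    current = None
    for line in text.splitlines():
        if not line.strip():
            current = None          # a blank line ends a node record
            continue
        if not line[0].isspace():
            current = line.strip()  # an unindented line starts a new node
            nodes[current] = {"state": "?", "np": 0, "jobs": {}}
            continue
        if current is None:
            continue
        key, sep, value = line.strip().partition(" = ")
        if not sep:
            continue
        if key == "state":
            nodes[current]["state"] = value
        elif key == "np":
            nodes[current]["np"] = int(value)
        elif key == "jobs":
            # Each entry is assumed to look like "<cpu id>/<job id>".
            for item in value.split(","):
                cpu, _, job_id = item.strip().partition("/")
                nodes[current]["jobs"][int(cpu)] = job_id
    return nodes


def job_matrix(nodes):
    """One text row per node: '#' for a busy CPU id, '.' for an idle one."""
    rows = []
    for name in sorted(nodes):
        info = nodes[name]
        cells = "".join(
            "#" if cpu in info["jobs"] else "." for cpu in range(info["np"])
        )
        rows.append(f"{name} {info['state']:<14} {cells}")
    return rows


if __name__ == "__main__":
    for row in job_matrix(parse_pbsnodes(SAMPLE)):
        print(row)
```

If the CPU ids in your real output do not run from 0 to np-1, or the jobs line has another shape, the matrix will look wrong, which is exactly the kind of sanity check suggested above.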

It can be particularly useful if you assign colors to your scheduler's policy
groups, so that you can visually check if your policy is honored; be prepared
for surprises.
Generally, I hope it can help you to keep the entropy of a torque system low,
by giving a fast overview of what is going on.
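
As a rough idea of what coloring by policy group can look like, here is a small Python sketch that prints one matrix row with an ANSI color per group. The group names, the color choices and the helper names colorize and render_row are all hypothetical; this is not how qtop itself is configured.

```python
# Hypothetical mapping from policy group to ANSI color code (assumed names).
GROUP_COLORS = {
    "atlas": "\033[31m",  # red
    "cms": "\033[32m",    # green
    "ops": "\033[34m",    # blue
}
RESET = "\033[0m"


def colorize(symbol, group):
    """Wrap symbol in the group's color; unknown groups stay uncolored."""
    color = GROUP_COLORS.get(group)
    return f"{color}{symbol}{RESET}" if color else symbol


def render_row(node, slot_groups):
    """slot_groups lists the owning group per CPU id, or None if idle."""
    cells = "".join(
        colorize(group[0].upper(), group) if group else "."
        for group in slot_groups
    )
    return f"{node} {cells}"


if __name__ == "__main__":
    print(render_row("wn001", ["atlas", None, "cms"]))
```

With one color per group, a node that is supposed to be reserved for one group but shows another group's color stands out at a glance.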

enjoy,
Fotis

(*)
http://www.supercluster.org/pipermail/torqueusers/2010-August/011198.html


-- 
echo "sysadmin know better bash than english" | sed s/min/mins/ \
	| sed 's/better bash/bash better/' # Yelling in a CERN forum

