[torquedev] job arrays?

Lennart Karlsson Lennart.Karlsson at nsc.liu.se
Fri Apr 7 17:42:08 MDT 2006


Andy,

You wrote:
> If I have a job that I want to run with 500 parameters, but I have 100 
> computers and 20 other users with limits of 20 nodes per person.  So I 
> submit my job array of 500 jobs, and they start when and where they can 
> within the constraints of the scheduler - to the scheduler it looks like 
> 500 jobs.  qsub, qstat, qdel, etc. , though, treat it as one job by 
> default, so qdel'ing it kills all of them.  There would be an option to 
> qstat to get details out of a job array.
> 
> My weak understanding of mpiexec is that it doesn't do this.
> 
> Does that make sense?  I am struggling with it myself, so any dialog would 
> be appreciated.


Within HPC4U, an EC-funded research project (http://www.hpc4u.org) that aims
for an SLA-defined, high level of fault tolerance in cluster and grid job runs,
I have written a small tool (te, for "test engine") that does something similar.

The tool is set up with a number of test cases, each test case containing
a job script, a binary to run, some input data and a verification script.
The verification script checks the output data to determine whether the test
has run to completion, and may also check whether the output looks "good" in
some predefined way, such as having a certain file size and having answers
within a certain value range.
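
As a rough illustration of such a verification step, here is a minimal
Python sketch. It is not taken from te; the function name, the thresholds
and the assumption that the output is whitespace-separated text are all
hypothetical.

```python
from pathlib import Path


def verify_output(path, min_size, low, high):
    """Return True if the output file looks "good".

    Hypothetical checks: the file exists, is at least min_size bytes,
    and every number found in it lies in the range [low, high].
    """
    out = Path(path)
    if not out.is_file() or out.stat().st_size < min_size:
        return False
    values = []
    for token in out.read_text().split():
        try:
            values.append(float(token))
        except ValueError:
            continue  # ignore non-numeric tokens such as labels
    # An output with no numbers at all is treated as a failed run.
    return bool(values) and all(low <= v <= high for v in values)
```

A real verification script would of course know the format of its own
output data; the point is only that "completed" and "looks good" can be
reduced to a yes/no answer per test case.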

To run a test means to run a test case, check for completion, save the
results and verify the results.

The similarity to your problem comes at the aggregate level. In the test tool
you can make a combined test case, containing several simple test cases,
and handle the aggregate test case in the same way as a simple one.

An aggregated test case is saved within the test tool as a file,
containing the names (actually directory names within a certain
directory) of all the test cases contained within it, which makes
it straightforward to implement the same operators for an aggregated test
case as you have for a simple test case.
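
The idea can be sketched in a few lines of Python. This is not the te
source; the function names and the exact on-disk layout (a directory for a
simple test case, a plain file with one name per line for an aggregate)
are assumptions made for the example, as is the support for nested
aggregates.

```python
from pathlib import Path


def expand(root, name):
    """Return the simple test cases that `name` stands for.

    Assumed layout (hypothetical): a simple test case is a directory
    root/name; an aggregated test case is a plain file root/name that
    lists one test case name per line. Aggregates may nest.
    """
    entry = Path(root) / name
    if entry.is_dir():
        return [name]
    cases = []
    for line in entry.read_text().splitlines():
        member = line.strip()
        if member:  # skip blank lines
            cases.extend(expand(root, member))
    return cases


def apply_to_all(root, name, operation):
    """Run `operation` on every simple test case behind `name`."""
    return {case: operation(case) for case in expand(root, name)}
```

With this shape, an operator written for a simple test case works
unchanged on an aggregate: it is just called once per member. The sketch
does not guard against an aggregate that lists itself.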

Of course everything is done on a meta-level. You write commands like

	te start grandtestcase		# Start a test
	te status grandtestcase		# See if it is finished
	te save grandtestcase		# Save results, so it can be verified
	te verify grandtestcase		# Verify results
	te stop grandtestcase		# Terminate the test prematurely
	te delete grandtestcase		# Remove all traces of the test
	te show grandtestcase		# Show definition of test case

and the tool runs 'qstat', 'showq', and other such commands to implement this,
if Torque and Maui are used. (Within HPC4U, CCS is mainly used as the scheduler
and queueing system.)

This description is just meant to give you an idea of a different way to
solve your problem. I am not selling or promoting the tool itself; in fact,
it is not packaged in a way suited for distribution outside the project.

In a way it would be better to have job aggregation integrated in the
queueing system, as you wish, but perhaps this makes the queueing
system unnecessarily complex; Unix is all about making lean, single-minded
programs interoperate, isn't it? :-) And perhaps you would always like
to be able to add something more, like a verification step for your jobs
that ensures, at the very least, that you do not have empty output
files after an aggregated run. So I propose that you look into writing
a tool along the same lines as my test engine.

-- Lennart Karlsson <Lennart.Karlsson at nsc.liu.se>
   National Supercomputer Centre in Linkoping, Sweden
   http://www.nsc.liu.se



