[torqueusers] Same job on several nodes

Vincent LIARD vincent.liard at scilab.org
Fri Jan 29 05:24:48 MST 2010


I am building a heterogeneous cluster in order to compare the
performance of the same program on various hardware architectures. For
this purpose, I was advised to use TORQUE.

Thus, I would like to run the very same job on every node of my
cluster. So far, I've considered the '-t 1-n' and '-l nodes=n' qsub
options, but neither appears to fit my needs.

Indeed, on the one hand, '-l nodes=n' reserves n nodes but won't spread
the sequential job across them, and, on the other hand, '-t 1-n' will
spawn n jobs but won't necessarily place them on n different nodes. So
what I want is a mix of both options: n jobs running on n different
nodes.

Do you know of a way to do this? Of course, I could iterate over the
node hostnames and submit one job pinned to each node... But I would
rather not resort to that if there is a more straightforward way.
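
To make that fallback concrete, here is a minimal Python sketch of what
iterating over the hostnames could look like. It only builds the qsub
command lines and does not submit anything. The hostnames and the
script name bench.sh are made up for illustration, and it assumes that
'-l nodes=<hostname>' is accepted as a request for that specific node.

```python
import shlex


def build_qsub_commands(hostnames, script="bench.sh"):
    """Return one qsub command line per host, each pinned to that host.

    Assumption: '-l nodes=<hostname>' requests the named node.
    'bench.sh' is a hypothetical job script name.
    """
    commands = []
    for host in hostnames:
        args = ["qsub", "-l", f"nodes={host}", script]
        # Quote each argument so unusual hostnames stay shell-safe.
        commands.append(" ".join(shlex.quote(a) for a in args))
    return commands


if __name__ == "__main__":
    # Hypothetical node names; in practice they would come from the
    # cluster's node list.
    for line in build_qsub_commands(["node01", "node02", "node03"]):
        print(line)
```

The printed lines could then be reviewed before being run by hand or
piped to a shell.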

Moreover, once I get past this first step, I will be interested in
gathering performance measurements (CPU load, RAM used, job duration...)
from all job executions to compare the results. Is there an easy way to
do so? Can Moab help in this regard?
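
For illustration, one possible starting point might be the
resources_used.* fields that 'qstat -f' reports for a job. The Python
sketch below parses such text into a dictionary. The sample text is
invented, not real output from my cluster, and it assumes the fields
appear as 'resources_used.<name> = <value>' lines. It would cover CPU
time, memory and walltime, but not CPU load as such.

```python
def parse_resources_used(qstat_full_output):
    """Collect 'resources_used.<name> = <value>' lines into a dict."""
    prefix = "resources_used."
    used = {}
    for line in qstat_full_output.splitlines():
        key, sep, value = line.strip().partition(" = ")
        if sep and key.startswith(prefix):
            used[key[len(prefix):]] = value.strip()
    return used


def hms_to_seconds(hms):
    """Convert an 'HH:MM:SS' string to a number of seconds."""
    hours, minutes, seconds = (int(part) for part in hms.split(":"))
    return hours * 3600 + minutes * 60 + seconds


# Invented sample in the style of 'qstat -f' output (an assumption).
SAMPLE = """\
Job Id: 42.headnode
    Job_Name = bench.sh
    resources_used.cput = 00:01:30
    resources_used.mem = 20480kb
    resources_used.walltime = 00:02:05
"""

if __name__ == "__main__":
    used = parse_resources_used(SAMPLE)
    print(used)
    print(hms_to_seconds(used["walltime"]), "seconds of walltime")
```

Running this once per job and collecting the dictionaries by hostname
would give a simple table to compare across nodes.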

I hope my questions are not trivial, but I have not managed to find an
answer by myself so far.

Thanks in advance,

Vincent LIARD
Development Engineer
Consortium Scilab
Domaine de Voluceau
Rocquencourt - B.P. 105
78153 Le Chesnay Cédex
