[torqueusers] Same job on several nodes
"Mgr. Šimon Tóth"
SimonT at mail.muni.cz
Thu Feb 4 11:24:22 MST 2010
>>> I am building a heterogeneous cluster so as to compare performance of
>>> the same program on various hardware architectures. For this purpose, I
>>> was advised to use torque.
>>> Thus, I would like to execute the very same job on all nodes of
>>> my cluster. So far, I've considered the '-t 1-n' and '-l nodes=n' qsub
>>> options, but neither appears to fit my need.
>>> Indeed, on the one hand, '-l nodes=n' reserves n nodes but won't spread
>>> the sequential job, and, on the other hand, '-t 1-n' will spawn n jobs
>>> but won't necessarily attach them to n different nodes. So, what I want
>>> is some kind of mix of both options: n jobs run on n different nodes.
>>> Do you know of a way to do this? Of course, I could iterate over the
>>> node hostnames and attach one job to each node, but I would rather not
>>> resort to that if there is a more straightforward way.
>> I suppose the best way to do this is to add corresponding properties to
>> nodes (describing the architecture) and simply generate the necessary
>> amount of jobs with -l nodes=1:property.
> I'm not sure I understand correctly. By "generate the necessary amount of
> jobs", do you mean doing so with that many individual calls to qsub? And
> using "property" to attach each one to the desired node, I guess.
> Am I correct?
> As a first try, I did:
> for n in `cat nodes`; do (echo hostname | qsub -l nodes=$n); done;
> Is this more or less what you are talking about?
Well, I would do:
for i in `cat architectures`; do
    echo hostname | qsub -l nodes=1:$i
done
If you just have one computer for each architecture, then there really
isn't much point in using a batch system. What I meant was to tag the
nodes (set the properties attribute) with the features they provide (OS,
architecture, etc.) and then submit jobs requesting these features.
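
To make the tagging approach a bit more concrete, here is a rough sketch, not a
tested recipe. It assumes the nodes have already been given properties (the
node name node01 and the property x86_64 in the comments are made up), that a
file named architectures lists one property per line, and that job.sh is your
job script. The QSUB variable and the submit_per_arch function are names I
introduced for illustration; by default the script only prints the qsub
commands it would run.

```shell
#!/bin/sh
# Assumption: properties were set beforehand, e.g. with
#   qmgr -c "set node node01 properties = x86_64"
# (node01 and x86_64 are hypothetical names).

# Dry run by default: prints the commands. Set QSUB=qsub to really submit.
QSUB=${QSUB:-echo qsub}

# submit_per_arch FILE SCRIPT
# Submits SCRIPT once per property listed in FILE (one property per line),
# requesting a single node carrying that property.
submit_per_arch() {
    while IFS= read -r arch; do
        # Skip empty lines in the property list.
        [ -n "$arch" ] || continue
        # Redirect stdin so the command cannot consume the property list.
        $QSUB -l "nodes=1:$arch" "$2" < /dev/null
    done < "$1"
}

# Example call (commented out so that sourcing this file has no side effects):
# submit_per_arch architectures job.sh
```

With a property list containing x86_64 and ppc64, the dry run prints one
"qsub -l nodes=1:<property> job.sh" line per property, which you can inspect
before switching QSUB to the real command.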
Mgr. Šimon Tóth