[torqueusers] Same job on several nodes

"Mgr. Šimon Tóth" SimonT at mail.muni.cz
Fri Jan 29 15:28:56 MST 2010


> I am building a heterogeneous cluster so as to compare performance of
> the same program on various hardware architectures. For this purpose, I
> was advised to use torque.
> 
> Thus, I am looking to execute the very same job on all nodes of
> my cluster. So far, I've considered the '-t 1-n' and '-l nodes=n' qsub
> options, but neither appears to fit my needs.
> 
> Indeed, on the one hand, '-l nodes=n' reserves n nodes but won't spread
> the sequential job, and, on the other hand, '-t 1-n' will spawn n jobs
> but won't necessarily attach them to n different nodes. So, what I want
> is some kind of mix of both options: n jobs run on n different nodes.
> 
> Do you know of a way to do this? Of course, I could iterate over the
> node hostnames and attach that many jobs to each node... But I would
> rather not resort to that if there is a more straightforward way.

I suppose the best way to do this is to add a property describing the
architecture to each node, and then simply generate the necessary number
of jobs, each submitted with -l nodes=1:property.
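
As a rough sketch of that idea, the script below only builds the qsub
command lines, one per property, and prints them instead of submitting
anything. The property names (opteron, xeon, ppc), the script name
bench.sh and the bench_ job-name prefix are made up for illustration;
the properties have to match whatever is set on the nodes (for example
a line such as "node01 np=4 opteron" in server_priv/nodes). Note that if
several nodes share a property, the job may land on any one of them, so
use one distinct property per node if every node must run the job.

```python
import shlex


def build_qsub_commands(script, properties):
    """Return one qsub command line per node property.

    Each command requests a single node carrying the given property,
    so each job is confined to nodes of that architecture.
    """
    commands = []
    for prop in properties:
        args = [
            "qsub",
            "-N", f"bench_{prop}",      # job name, hypothetical prefix
            "-l", f"nodes=1:{prop}",    # one node with this property
            script,
        ]
        # Quote each argument so the line is safe to paste into a shell.
        commands.append(" ".join(shlex.quote(a) for a in args))
    return commands


if __name__ == "__main__":
    # Hypothetical property names; replace with the ones on your nodes.
    for cmd in build_qsub_commands("bench.sh", ["opteron", "xeon", "ppc"]):
        print(cmd)
```

Printing the commands first makes it easy to check them by eye before
piping them to a shell.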

> Moreover, provided I overcome the first step, I will be interested in
> gathering performance measures (CPU load, RAM used, job duration...)
> from all job executions to compare the results. Is there an easy way to
> do so? Can Moab help in this regard?

Torque records CPU time, walltime (run time), memory and virtual memory
usage for each job. Check the accounting logs the server keeps under
server_priv/accounting.
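
A minimal sketch of pulling those numbers out of the accounting records
follows. It assumes the usual layout of a job-end record, that is
"date time;E;jobid;key=value key=value ...", with the usage stored in
the resources_used.* keys. The sample line is invented for illustration,
and the simple whitespace split would break on values that contain
spaces, so treat it as a starting point only.

```python
def hms_to_seconds(text):
    """Convert 'HH:MM:SS' to seconds; return None if the field is missing."""
    if text is None:
        return None
    hours, minutes, seconds = (int(part) for part in text.split(":"))
    return hours * 3600 + minutes * 60 + seconds


def parse_end_record(line):
    """Parse one accounting line; return a dict for 'E' records, else None."""
    parts = line.rstrip("\n").split(";", 3)
    if len(parts) != 4 or parts[1] != "E":
        return None  # not a job-end record
    _timestamp, _record_type, job_id, message = parts
    fields = {}
    for token in message.split():
        key, sep, value = token.partition("=")
        if sep:
            fields[key] = value
    return {
        "job_id": job_id,
        "exec_host": fields.get("exec_host"),
        "cput_s": hms_to_seconds(fields.get("resources_used.cput")),
        "walltime_s": hms_to_seconds(fields.get("resources_used.walltime")),
        "mem": fields.get("resources_used.mem"),
        "vmem": fields.get("resources_used.vmem"),
    }


if __name__ == "__main__":
    # Invented sample record, for illustration only.
    sample = (
        "01/29/2010 15:00:00;E;42.server;user=alice jobname=bench_xeon "
        "exec_host=node02/0 resources_used.cput=00:01:30 "
        "resources_used.mem=2048kb resources_used.vmem=4096kb "
        "resources_used.walltime=00:02:00"
    )
    print(parse_end_record(sample))
```

Grouping the parsed records by exec_host (or by job name) then gives a
per-architecture comparison.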

-- 
Mgr. Šimon Tóth

