[torqueusers] Considerations for Clusters Running Lots of Small Jobs

Remo Sanges rsanges at tigem.it
Fri Dec 5 05:57:30 MST 2008


Hi Joshua,

I'm speaking from a bioinformatician-user point of view, so I have no 
suggestions about the configuration of the cluster, but I can say that 
your problem is indeed very common in our field.
From my perspective, you can simply submit a series of job arrays.
TORQUE can actually do that:
http://www.clusterresources.com/torquedocs21/2.1jobsubmission.shtml
There is a note on that page saying that job arrays are still under 
development, so if you don't feel comfortable with that, a similar 
solution would be to adapt the script that does the analysis so that 
it runs a chunk of N analyses (where N is a number of your choice, 
picked on the basis of the time each single analysis takes). This is 
probably a quick and dirty solution, but it is the one I use, given 
that I started to use TORQUE before job arrays were implemented, and 
I am perfectly fine with it.
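
To make the chunking idea concrete, here is a minimal Python sketch. It is 
not the script I actually use, just an illustration under assumptions: the 
input file names, the ./analyze.sh command, and the timing figures (about 
3 seconds per analysis, about 5 minutes per job) are all made up, and the 
function names are hypothetical.

```python
import math
import shlex


def choose_chunk_size(seconds_per_task, target_job_seconds):
    """Return N, the number of analyses to bundle into one job."""
    if seconds_per_task <= 0 or target_job_seconds <= 0:
        raise ValueError("durations must be positive")
    # Always bundle at least one analysis, even if a single one is slow.
    return max(1, math.floor(target_job_seconds / seconds_per_task))


def chunked(items, n):
    """Split items into consecutive lists of at most n elements."""
    return [items[i:i + n] for i in range(0, len(items), n)]


def job_script(command, inputs):
    """Build a shell script that runs `command` once per input, in sequence."""
    lines = ["#!/bin/sh"]
    lines += [f"{command} {shlex.quote(path)}" for path in inputs]
    return "\n".join(lines) + "\n"


# Hypothetical workload: 1000 inputs, ~3 s per analysis, ~300 s per job.
inputs = [f"sample_{i:04d}.fa" for i in range(1000)]
n = choose_chunk_size(seconds_per_task=3, target_job_seconds=300)
scripts = [job_script("./analyze.sh", chunk) for chunk in chunked(inputs, n)]
```

Each entry in scripts is then one job instead of N, so the scheduler sees 
far fewer, longer jobs. Every script can be written to a file and 
submitted with qsub in the usual way.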

Cheers

Remo

Joshua Bernstein wrote:
> Hi All,
>
>     The TORQUE documentation contains a nice explanation for running 
> TORQUE on a large cluster. But are these ideas also pertinent to a 
> very small cluster, say four nodes, running many thousands of 
> short-lived jobs? It's very common in the BioIT space to have a 
> comparatively small cluster, but with many thousands of jobs 
> lasting only a few seconds. Does anybody have any guidance on 
> configuration or even source-level changes for a high-throughput, 
> small cluster with short-lived jobs? Or would we expect the same 
> changes for a large cluster to also be applicable to this configuration?
>
> -Joshua Bernstein
> Software Engineer
> Penguin Computing


