[torqueusers] building a department GPU cluster
hector.ohhm at gmail.com
Sat Jan 19 07:22:02 MST 2013
I'm Hector Oliver.
I can suggest some configurations, but the most important question is which GPUs you plan to use.
> 1) is the above HW suitable for a small (2 to 4/6 GPUs) GPU cluster?
Which GPUs are you thinking of using?
If you use Tesla Fermi cards, do not use more than 2 per workstation.
If you use Kepler cards, you can use more than 2 per workstation.
I suggest about 4 GB of memory per CPU core.
> 2) is torque suitable (or what should we use?) as a queuing and resource
> management system? We would like the cluster to be usable by many users
> at once in a way that no user has to worry about resources, just like we
> do on the CPU cluster with SGE.
Torque is nice, but the latest releases
> 3) What distribution of linux would be more appropriate?
Whichever you prefer (CentOS, etc.).
> 4) necessary stack of sw? (cuda, torque, hadoop?, other?)
Best regards.
On Thu, Jan 17, 2013 at 8:44 AM, Roberto Nunnari
<roberto.nunnari at supsi.ch> wrote:
> Hi all.
> I'm writing to you to ask for advice or a hint in the right direction.
> In our department, more and more researchers ask us (IT administrators)
> to assemble (or to buy) GPGPU powered workstations to do parallel
> As I already manage a small CPU cluster (resources managed using SGE),
> with my boss we talked about building a new GPU cluster. The problem is
> that I have no experience at all with GPU clusters.
> Apart from the already running GPU workstations, we already have some
> new HW that looks promising to me as a starting point for temporary
> building and testing a GPU cluster.
> - 1x Dell PowerEdge R720
> - 1x Dell PowerEdge C410x
> - 1x NVIDIA M2090 PCIe x16
> - 1x NVIDIA iPASS Cable Kit
> I'd be grateful if you could kindly give me some advice and/or a hint in
> the right direction.
> In particular, I'm interested in your opinion on:
> 1) is the above HW suitable for a small (2 to 4/6 GPUs) GPU cluster?
> 2) is torque suitable (or what should we use?) as a queuing and resource
> management system? We would like the cluster to be usable by many users
> at once in a way that no user has to worry about resources, just like we
> do on the CPU cluster with SGE.
> 3) What distribution of linux would be more appropriate?
> 4) necessary stack of sw? (cuda, torque, hadoop?, other?)
> Thank you very much for your valuable insight!
> Best regards.