[torquedev] [Bug 95] Support for GPUs

bugzilla-daemon at supercluster.org
Thu Nov 4 10:43:59 MDT 2010


--- Comment #6 from Simon Toth <SimonT at mail.muni.cz> 2010-11-04 10:43:59 MDT ---
(In reply to comment #5)
> >Counted resources are supported by Bug 67, which ensures correct assignment
> >of jobs requesting GPUs.
> By elevating the GPU to the same level as ppn, the GPU is now a counted
> resource. Moreover, we can now create a node spec that specifies how many
> processors and GPUs a job needs. For example:
> qsub -l nodes=hostA:ppn=2:gpu=1 <job.sh>
> This will allocate two np and one gpu on hostA. We can do multiple node
> assignments as well.
> qsub -l nodes=2:ppn=2:gpu=1+2:ppn=2:gpu=2,mem=4Gb <job.sh>
> This requests two nodes with two np and one gpu each, plus two more
> nodes with two np and two gpus each.
> The configuration and syntax fit easily in the current TORQUE build. It is also
> generic as to what a gpu is. 
> Later we can add the syntax to qsub to support exclusive access and other
> features of gpus. We could also add an auto-detect feature that would populate
> each host with the number of gpus available plus report statistics in pbsnodes
> for the gpus.
> Another advantage of this syntax is that it fits easily into the existing tm
> interface. MPI would need few changes, if any, to manage gpus on multiple
> MOMs.

Uhm. OK, I'm really sorry, but I really don't understand this post. Yes, that is
how my patch works. I know that. :-) I wrote it myself, so it would be kind of
weird if I didn't. :-)
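
To make the counting in the quoted examples concrete, here is a rough Python sketch that reads the value of "-l nodes=" and adds up what it asks for. It is not code from the patch or from TORQUE: the function names parse_nodes_spec and totals are made up for illustration, a missing ppn is assumed to mean 1 and a missing gpu 0, and any other field (node properties and so on) is simply ignored.

```python
def parse_nodes_spec(spec):
    """Split a '-l nodes=' value into one dict per '+'-separated chunk.

    Hypothetical helper for illustration only, not part of TORQUE.
    """
    chunks = []
    for part in spec.split("+"):
        fields = part.split(":")
        head = fields[0]
        # A leading integer is a node count; anything else is taken as a host name.
        if head.isdigit():
            chunk = {"count": int(head), "host": None}
        else:
            chunk = {"count": 1, "host": head}
        chunk["ppn"] = 1  # assumed default when ppn is not given
        chunk["gpu"] = 0  # assumed default when gpu is not given
        for field in fields[1:]:
            key, _, value = field.partition("=")
            if key in ("ppn", "gpu"):
                chunk[key] = int(value)
            # Other fields (e.g. node properties) are ignored in this sketch.
        chunks.append(chunk)
    return chunks


def totals(chunks):
    """Sum nodes, processors and GPUs over all chunks."""
    return {
        "nodes": sum(c["count"] for c in chunks),
        "procs": sum(c["count"] * c["ppn"] for c in chunks),
        "gpus": sum(c["count"] * c["gpu"] for c in chunks),
    }


if __name__ == "__main__":
    # The two node specs from the quoted comment (the ',mem=4Gb' part is a
    # separate resource and is not part of the nodes value).
    print(totals(parse_nodes_spec("hostA:ppn=2:gpu=1")))
    print(totals(parse_nodes_spec("2:ppn=2:gpu=1+2:ppn=2:gpu=2")))
```

Under these assumptions the first spec counts as one node, two processors and one GPU, and the second as four nodes, eight processors and six GPUs.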
