[torquedev] [Bug 95] Support for GPUs

bugzilla-daemon at supercluster.org bugzilla-daemon at supercluster.org
Thu Nov 18 09:19:28 MST 2010


dbeer at adaptivecomputing.com changed:

           What    |Removed                     |Added
         AssignedTo|glen.beane at gmail.com     |dbeer at adaptivecomputing.com

--- Comment #21 from dbeer at adaptivecomputing.com 2010-11-18 09:19:27 MST ---
> Looking at the code in 2.5-fixes, how will the program actually know which
> cards are allocated? Will the ids match the devices?

Our first-pass idea is to do what TORQUE does with cores (ppn, virtual
processors, or whatever they should be called). An admin is allowed to
overload cores if desired: a 4-core machine can be set to ppn=8 or
anything else. There is also no guarantee that a job assigned host/0
(theoretically the 0th core) will actually execute on the 0th core.

We can imagine that some sites will want to overload their GPUs, just as
some sites do with CPUs, so our initial approach is to handle GPUs exactly
the same way cores are handled by default. It is up to the user to guarantee
that the job actually executes on the GPU(s) assigned to it, by reading the
file $PBS_GPUFILE. Eventually, we will add options to lock GPUs to their jobs
(like cpusets) and to autodetect the number and types of GPUs on each system,
but TORQUE cannot do either at this point.
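
As a rough illustration of what "reading the file $PBS_GPUFILE" could look
like from inside a job, here is a minimal Python sketch. It is not part of
TORQUE. It assumes the file holds one entry per line in the form
<host>-gpu<index>, which is not confirmed anywhere in this comment, and the
helper names parse_gpu_file and assigned_gpus are hypothetical.

```python
import os
import socket


def parse_gpu_file(text, hostname):
    """Return the sorted GPU indices listed for `hostname`.

    ASSUMPTION: each line looks like '<host>-gpu<index>'. This layout is
    a guess for illustration; check the file your TORQUE build writes.
    """
    short_name = hostname.split(".")[0]
    indices = set()
    for line in text.splitlines():
        entry = line.strip()
        if not entry:
            continue
        # Split on the last '-gpu' so hyphenated host names still work.
        host, sep, index = entry.rpartition("-gpu")
        if not sep or not index.isdigit():
            continue  # skip lines that do not match the assumed layout
        # Compare short host names so FQDN and short forms both match.
        if host.split(".")[0] == short_name:
            indices.add(int(index))
    return sorted(indices)


def assigned_gpus(environ=os.environ, hostname=None):
    """Read the file named by $PBS_GPUFILE; return [] if it is unset."""
    path = environ.get("PBS_GPUFILE")
    if not path:
        return []
    with open(path) as handle:
        return parse_gpu_file(handle.read(), hostname or socket.gethostname())


if __name__ == "__main__":
    print(assigned_gpus())
```

A job script could then restrict itself to the returned indices, for
example by exporting them through CUDA_VISIBLE_DEVICES on NVIDIA hardware.
Nothing here enforces the assignment; as described above, that is left to
the user.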
