[torqueusers] [Patch] GPUs by the way of GRES

Sean Reilly sean.reilly at ersa.edu.au
Thu Apr 19 23:54:34 MDT 2012

Hi Folks

  We have just begun testing this patch. Great work by Jonathan Michalon!
  So far it is doing everything we need.

  Roland, since you still need to lock the assigned GPUs down to a particular
user and job ID on the backend nodes (we do this with cuda_wrapper), there is
no real need for Maui to specify the particular GPU, e.g. gpu/2.

  We use both the Torque request  #PBS -l gpus=1
  and the Maui request            #PBS -W x=GRES:gpu@1

  The Maui side plus the patch takes care of the number of GPUs being available.
  Torque gives you the environment variable  PBS_RESOURCE_GRES=gpus=1
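
  As a minimal sketch of how a prologue or job script might read that value:
the function name requested_gpus is hypothetical, and the code assumes the
variable holds a single name=count pair exactly as quoted above. Other formats
are not handled.

```python
import os


def requested_gpus(environ=os.environ):
    """Return the GPU count found in PBS_RESOURCE_GRES, or 0 if not set."""
    value = environ.get("PBS_RESOURCE_GRES", "")
    # Assumption: the value looks like "gpus=1", a single name=count pair.
    name, sep, count = value.partition("=")
    if not sep or name.strip() != "gpus":
        return 0
    return int(count)
```

  Passing the environment as an argument keeps the function easy to check
outside a running job, e.g. requested_gpus({"PBS_RESOURCE_GRES": "gpus=1"}).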

    The prologue script is responsible for assigning an available GPU to this
user and job ID.
    When the job finishes or is killed, the epilogue releases the GPU back into
the pool.

    These two scripts should know which GPUs are available and in use at any
time. Maui has ensured that enough GPUs should be available. *If not, the
prologue and epilogue can send the admin an error email so it can be checked.*
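
    The bookkeeping those two scripts share could look roughly like the sketch
below. It is only an illustration under stated assumptions, not the scripts
described here: the state file path, the JSON layout, the function names and
the total_gpus parameter are all hypothetical. A file lock keeps concurrent
prologue and epilogue runs on the same node from corrupting the pool.

```python
import fcntl
import json
import os

# Hypothetical per-node state file; the path is an assumption.
STATE_FILE = "/var/spool/torque/gpu_pool.json"


def _update(state_file, mutate):
    """Run mutate(state) under an exclusive file lock and save the result."""
    fd = os.open(state_file, os.O_RDWR | os.O_CREAT, 0o600)
    with os.fdopen(fd, "r+") as fh:
        fcntl.flock(fh, fcntl.LOCK_EX)  # released when the file is closed
        raw = fh.read()
        # state maps device index (as a string) to the owning job and user
        state = json.loads(raw) if raw.strip() else {}
        result = mutate(state)  # if this raises, the file is left untouched
        fh.seek(0)
        fh.truncate()
        json.dump(state, fh)
        fh.flush()
        return result


def assign_gpus(job_id, user, count, total_gpus, state_file=STATE_FILE):
    """Prologue side: reserve `count` free devices for this job."""
    def mutate(state):
        free = [i for i in range(total_gpus) if str(i) not in state]
        if len(free) < count:
            # This is the point where an error email to the admin would fit.
            raise RuntimeError("not enough free GPUs for job %s" % job_id)
        chosen = free[:count]
        for i in chosen:
            state[str(i)] = {"job": job_id, "user": user}
        return chosen
    return _update(state_file, mutate)


def release_gpus(job_id, state_file=STATE_FILE):
    """Epilogue side: return every device held by this job to the pool."""
    def mutate(state):
        owned = sorted(int(i) for i, v in state.items() if v["job"] == job_id)
        for i in owned:
            del state[str(i)]
        return owned
    return _update(state_file, mutate)
```

    A prologue would call assign_gpus with the job ID and user name it
receives from Torque (passed as its first two arguments), and the epilogue
would call release_gpus with the same job ID.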

    It's early days for us, but so far so good.

    But yes, it would be nice if Maui could tell the backend nodes the number
of GPUs assigned (and possibly the device number), eliminating the need for the
extra #PBS -l gpus=1 setting. It is not a show stopper, though.


