[torqueusers] [Patch] GPUs by the way of GRES
sean.reilly at ersa.edu.au
Thu Apr 19 23:54:34 MDT 2012
We have just begun testing with this patch. Great work by Jonathan Michalon!
So far it is doing everything we need.
Roland, since you still need to lock the assigned GPUs down to a particular
user and job ID on the backend nodes (we do this with cuda_wrapper),
there is no real need for Maui to specify the particular GPU, e.g. gpu/2.
We use both the Torque #PBS -l gpus=1
and the Maui #PBS -W x=GRES:gpu@1
The Maui side plus the patch takes care of the number of GPUs being available.
Torque gives you the environment variable PBS_RESOURCE_GRES=gpus=1
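
As a rough illustration of how a script could read the requested GPU count
from that variable, here is a minimal Python sketch. It assumes the value
looks like the "gpus=1" shown above; the function name requested_gpus is
hypothetical and not part of Torque or Maui.

```python
import os
import re


def requested_gpus(environ=os.environ):
    """Return the GPU count found in PBS_RESOURCE_GRES, or 0 if absent.

    Assumes the value contains a token of the form "gpus=<n>", as in the
    example above; any other format is treated as "no GPUs requested".
    """
    value = environ.get("PBS_RESOURCE_GRES", "")
    match = re.search(r"\bgpus=(\d+)", value)
    return int(match.group(1)) if match else 0
```

Passing the environment as a parameter keeps the helper easy to try outside
a real job, e.g. requested_gpus({"PBS_RESOURCE_GRES": "gpus=1"}).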
The prologue script is responsible for assigning an available GPU to this
user and job ID.
When the job finishes or is killed, the epilogue releases the GPU back into
the pool.
These two scripts should know which GPUs are available and which are in use
at any time. Maui has already ensured that the requested number should be
available. *If it is not, the prologue and epilogue can send the admin an
error email so it can be checked.*
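
The bookkeeping described above could be sketched roughly as follows. This
is not our actual prologue/epilogue, only a minimal Python illustration
under stated assumptions: the state file path, the GPU count per node, and
the names locked_state, acquire and release are all hypothetical. It keeps a
small JSON file mapping each device number to its current owner and guards
it with an exclusive file lock so that concurrent prologues do not hand out
the same GPU twice.

```python
import fcntl
import json
import os
from contextlib import contextmanager

# Hypothetical defaults, not Torque settings: adjust per node.
STATE_FILE = "/var/spool/torque/gpu_pool.json"
NUM_GPUS = 4


@contextmanager
def locked_state(path, num_gpus):
    """Yield the device->owner mapping while holding an exclusive lock."""
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)
    with os.fdopen(fd, "r+") as fh:
        fcntl.flock(fh, fcntl.LOCK_EX)  # released when the file is closed
        raw = fh.read()
        if raw:
            state = json.loads(raw)
        else:
            # First use: every device starts out free (owner is None).
            state = {str(i): None for i in range(num_gpus)}
        yield state
        # Only reached if the caller did not raise: write the state back.
        fh.seek(0)
        fh.truncate()
        json.dump(state, fh)


def acquire(job_id, user, count, path=STATE_FILE, num_gpus=NUM_GPUS):
    """Prologue side: reserve `count` free GPUs for this job and user."""
    with locked_state(path, num_gpus) as state:
        free = sorted((d for d, owner in state.items() if owner is None),
                      key=int)
        if len(free) < count:
            # The scheduler should have prevented this; the caller can
            # catch the error and mail the admin.
            raise RuntimeError("not enough free GPUs for job %s" % job_id)
        chosen = free[:count]
        for dev in chosen:
            state[dev] = {"job": job_id, "user": user}
        return [int(dev) for dev in chosen]


def release(job_id, path=STATE_FILE, num_gpus=NUM_GPUS):
    """Epilogue side: return every GPU held by this job to the pool."""
    with locked_state(path, num_gpus) as state:
        held = sorted((d for d, owner in state.items()
                       if owner is not None and owner["job"] == job_id),
                      key=int)
        for dev in held:
            state[dev] = None
        return [int(dev) for dev in held]
```

Torque passes the job ID and the user name to the prologue and epilogue as
their first two arguments, so a prologue could call acquire() with those
values and the epilogue could call release() with the job ID. How the chosen
device numbers are then enforced for the user (cuda_wrapper in our case) is
outside this sketch.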
It's early days for us, but so far so good.
But yes, it would be nice if Maui could tell the backend nodes the
number of GPUs assigned (and possibly the device number), eliminating the need
for the extra #PBS -l gpus=1 setting. But not a show stopper.