[torqueusers] [Patch] GPUs by the way of GRES

rf at q-leap.de
Mon Mar 5 11:06:33 MST 2012

>>>>> "Jonathan" == Jonathan Michalon <jonathan.michalon at etu.unistra.fr> writes:

Hi Jonathan,

while your patch adds functionality to count allocated GPUs as a GRES,
it lacks the important ability to tell the job which GPUs are available
to it. If the latest torque 2.5.x is built with GPU support, you can
specify a nodes spec like "-l nodes=1:gpus=1" and, within the running
job, check $PBS_GPUFILE to see which GPUs you have been allocated. The
problem is that a job with a "-l nodes=1:gpus=1" specification won't be
started by maui, even with your patch. On the other hand, using your
"-W x=GRES:gpu@1" spec (without a "-l nodes=1:gpus=1" spec) makes the
job run, but the job has no idea which GPU to use. Is there an easy way
to extend your patch so that maui will run a job with the "-l
nodes=1:gpus=1" spec?
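
To illustrate what a job loses when it only gets the GRES count, here is
a rough sketch of how a job script might read its allocation when torque
does provide the GPU file. It rests on assumptions not stated above: that
the file is exposed as PBS_GPUFILE, that each line looks like
"<hostname>-gpu<index>", and that the host name in the file matches the
output of hostname. The helper name gpu_indices is made up for this
example.

```shell
#!/bin/sh
# Sketch: derive CUDA_VISIBLE_DEVICES from the GPU file torque gives a job.
# Assumption: each line of the file has the form "<hostname>-gpu<index>".

# gpu_indices FILE [HOST]
# Print the GPU indices listed for HOST (default: this host), comma-separated.
gpu_indices() {
    file=$1
    host=${2:-$(hostname)}
    awk -v h="$host" '
        # Keep lines that start with "<host>-gpu", then strip that prefix.
        index($0, h "-gpu") == 1 {
            sub(/^.*-gpu/, "")
            out = out sep $0
            sep = ","
        }
        END { print out }
    ' "$file"
}

# Only act when torque actually provided a readable GPU file, i.e. for a
# job submitted with something like "-l nodes=1:gpus=1".
if [ -n "${PBS_GPUFILE:-}" ] && [ -r "$PBS_GPUFILE" ]; then
    CUDA_VISIBLE_DEVICES=$(gpu_indices "$PBS_GPUFILE")
    export CUDA_VISIBLE_DEVICES
fi
```

A job submitted only with the GRES request never enters the final "if"
branch, because no GPU file is set for it. That is the gap described
above.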



    Jonathan> Hi Maui folks, GPUs in Maui are a long standing
    Jonathan> problem. Last year a patch was sent by Mariusz Mamoński
    Jonathan> [1], which works based on GRES parameters.  I've just made
    Jonathan> GPUs kind of working, by enhancing that patch. Please find
    Jonathan> attached the resulting patch, which works well for Maui
    Jonathan> 3.3.1.  It defines a special GRES named "gpu" which works
    Jonathan> as expected on my test cases.

    Jonathan> Note that GRES behaviour seems somewhat inconsistent, as
    Jonathan> they are sometimes described as consumable. This patch
    Jonathan> removes that behaviour, for the needs of GPUs.

    Jonathan> To use the patch: get the sources of maui-3.3.1 and patch
    Jonathan> them: patch -p1 < ../Patch-for-gpu-GRES.patch then compile
    Jonathan> as usual.

    Jonathan> You have to configure the GPUs in maui.cfg:
    Jonathan> NODECFG[nodename] GRES=gpu:2

    Jonathan> Then when queuing jobs you can request GPUs with (Torque
    Jonathan> syntax): qsub -W x=GRES:gpu@1

    Jonathan> I hope this helps, please test this and enhance to your
    Jonathan> needs!

    Jonathan> [1]
    Jonathan> http://www.supercluster.org/pipermail/mauiusers/2011-April/004622.html

    Jonathan> Regards,

    Jonathan> PS. This is the second attempt to send the mail…

    Jonathan> -- Jonathan Michalon IT student in Strasbourg
