[torqueusers] [torquedev] GPU Hardware
fotis at cern.ch
Thu Nov 4 06:42:42 MDT 2010
If someone can provide the commands/APIs for reporting which user owns
which GPU (just as pbsnodes -a does for CPUs), I would be glad to try to
write the hooks for it. (Others have already asked for this.)
I guess this is part of the same discussion as GPU resource allocation:
whoever keeps the allocation state machine should report its status too.
I have heard lots of corridor discussions on the subject so far...
This thread suggests there is a need for a Special Interest Group or list.
If someone creates such a GPU-allocation SIG/list, please count me in.
(OTOH, people may wish to keep the discussion here for now.)
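
To make the reporting idea concrete, here is a rough sketch of what such a
hook could look like if allocation were done purely through device-file
ownership, as in the scheme quoted below. It is not a Torque API: the
function name gpu_owners and the default glob /dev/nvidia[0-9]* are
assumptions for illustration, and a real allocator would report from its own
state rather than from file ownership.

```python
import glob
import os
import pwd
import stat


def gpu_owners(pattern="/dev/nvidia[0-9]*"):
    """Map each device file matching `pattern` to (owner, octal mode).

    Hypothetical helper: assumes GPUs are handed out by changing the owner
    of the device file, so the file owner is the user holding the GPU.
    """
    report = {}
    for path in sorted(glob.glob(pattern)):
        st = os.stat(path)
        try:
            owner = pwd.getpwuid(st.st_uid).pw_name
        except KeyError:
            # uid has no passwd entry; fall back to the numeric id
            owner = str(st.st_uid)
        report[path] = (owner, oct(stat.S_IMODE(st.st_mode)))
    return report


if __name__ == "__main__":
    for path, (owner, mode) in gpu_owners().items():
        print(f"{path}  owner={owner}  mode={mode}")
```

Run on a node, this prints one line per matching device file. It cannot tell
two jobs of the same user apart, which is exactly the clash raised in the
quoted thread.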
On 04/11/2010 12:39, "Mgr. Šimon Tóth" wrote:
>>>> Several cards in one machine. Users should be able to
>>>> select the number of cards (or even a specific type of
>>>> card), and Torque needs to make sure that each job gets
>>>> the cards it requested, dedicated to that job.
>>> I guess the simplest way to do this would be for Torque
>>> to set all the GPU cards to root only access on startup
>>> (if no jobs are running) and then set file permissions
>>> appropriately per job.
>>> The main issue there will be if a user starts two jobs
>>> on the same box then there will be the possibility of
>>> clashes over which GPUs it can access.
>> We are using a group for access to the gpu devices (using pro/epilogue).
>> It's not great and we're looking forward to doing something better...
>> (dual socket quad core, dual fermi-gpu nodes)
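
The permission-based approach described above could be sketched roughly as
follows, as a pair of helpers called from prologue and epilogue scripts. The
names assign_gpu and release_gpu are hypothetical, the choice of mode 0600 is
an assumption, and the driver may need access to further device nodes that
this sketch ignores.

```python
import os
import stat


def assign_gpu(device, uid, gid=-1):
    """Prologue side: hand `device` to `uid` for the duration of a job.

    Hypothetical helper. A gid of -1 leaves the group unchanged.
    """
    os.chown(device, uid, gid)
    os.chmod(device, stat.S_IRUSR | stat.S_IWUSR)  # 0600: owner only


def release_gpu(device):
    """Epilogue side: return `device` to root-only access (needs root)."""
    os.chown(device, 0, 0)
    os.chmod(device, stat.S_IRUSR | stat.S_IWUSR)
```

Because ownership is per user rather than per job, two jobs from the same
user on one node can still reach each other's cards, so this only addresses
isolation between different users.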
> I inquired about this on stackoverflow:
echo "sysadmin know better bash than english" | sed s/min/mins/ \
| sed 's/better bash/bash better/' # Yelling in a CERN forum