[torqueusers] configuring node to use subset of available physical GPUs

Lev Givon lev at columbia.edu
Mon Mar 31 14:42:24 MDT 2014


Received from David Beer on Mon, Mar 31, 2014 at 04:34:25PM EDT:
> On Mon, Mar 31, 2014 at 12:47 PM, Lev Givon <lev at columbia.edu> wrote:
> 
> > If I configure a compute node (in its server_priv/nodes file) to use X
> > number of GPUs where X < N and N = total number of physical GPUs in the
> > system, are the first X physical GPUs in the system always the ones that are
> > allocated to jobs that require GPUs? In other words, does the above
> > configuration guarantee that torque will never allocate the remaining N-X
> > GPUs to jobs?
> >
> > I'm using torque 4.5.0pre1 on Ubuntu 13.10 with the built-in scheduler.
>
> Let's say you have 4 gpus but only want 2 to be used for jobs:
> 
> 1. Make sure you aren't allowing it to auto-detect gpus. (This happens when
> you configure the moms to report on each gpu, via the -nvidia configure
> options.)
> 2. In the nodes file, add gpus=2 to the line with the node.
> 
> This doesn't guarantee that a job is unable to access the other gpus on the
> system, but it guarantees that TORQUE will only tell the scheduler about 2
> gpus, so more than 2 should never be scheduled at a time.
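
For illustration, step 2 above might look like the nodes-file line in the sketch below. The hostname "node01" and the np=8 value are hypothetical; only the gpus=2 attribute comes from the advice above. The small helper just reads back the configured count, which is the number TORQUE would report to the scheduler; it is a sketch, not part of TORQUE itself.

```python
import re

# Hypothetical server_priv/nodes entry: "node01" and np=8 are made up
# for illustration; gpus=2 is the setting described in step 2.
NODES_LINE = "node01 np=8 gpus=2"


def configured_gpus(line):
    """Return the gpus=<count> value from a nodes-file line, or 0 if absent."""
    match = re.search(r"(?:^|\s)gpus=(\d+)(?=\s|$)", line)
    return int(match.group(1)) if match else 0


print(configured_gpus(NODES_LINE))
```

As noted above, this count only limits how many gpus are scheduled at once; it does not by itself stop a job from accessing the other gpus on the node.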

Is there any way to prevent torque from ever touching a specific GPU (or GPUs)
on a system? The motivation for the question is to set aside those GPUs for
non-torque-related use by potentially more than one simultaneous user and have
torque use the remaining GPUs exclusively for submitted jobs.
-- 
Lev Givon
Bionet Group
http://www.columbia.edu/~lev/
http://lebedov.github.io/


