[torqueusers] Torque for CUDA devices -- does Torque set CUDA_VISIBLE_DEVICES?

Bill Wichser bill at Princeton.EDU
Thu Jun 13 06:50:10 MDT 2013


It doesn't.  The only provision is $PBS_GPUFILE, which points to a file 
listing the GPUs assigned to the job.  We have been instructing users to 
read that file and set CUDA_VISIBLE_DEVICES themselves.
(http://www.princeton.edu/researchcomputing/computational-hardware/tiger/tutorials/)
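
For illustration, here is a minimal sketch of reading that file and setting
the visible devices. It assumes each line of the file has the form
<hostname>-gpu<N> (for example node01-gpu0), and that the hostname in the
file matches what socket.gethostname() returns; both are assumptions to
check on your own cluster. The function names are made up for this example
and are not part of Torque.

```python
import os
import re
import socket


def gpu_ids_for_host(lines, host):
    """Return the sorted GPU indices listed for `host`.

    Assumes entries look like '<hostname>-gpu<N>'; lines that do not
    match that pattern are ignored.
    """
    ids = []
    for line in lines:
        match = re.fullmatch(r"(.+)-gpu(\d+)", line.strip())
        if match and match.group(1) == host:
            ids.append(int(match.group(2)))
    return sorted(ids)


def cuda_visible_devices(lines, host):
    """Build a comma-separated value suitable for CUDA_VISIBLE_DEVICES."""
    return ",".join(str(i) for i in gpu_ids_for_host(lines, host))


if __name__ == "__main__":
    gpufile = os.environ.get("PBS_GPUFILE")
    if gpufile:
        with open(gpufile) as fh:
            value = cuda_visible_devices(fh, socket.gethostname())
        # Must be set before the CUDA runtime is initialised in this process.
        os.environ["CUDA_VISIBLE_DEVICES"] = value
```

If the file uses fully qualified names and the node reports a short name (or
the other way round), the hostname comparison would need adjusting.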

This works fine for single nodes.  We are still unclear about what to do 
when users want, say, 6 GPUs: 4 on one node and 2 on another.
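
To make the multi-node case concrete, the sketch below only groups the GPU
file entries by host, so you can see which devices belong to which node. It
again assumes the <hostname>-gpu<N> line format, and gpus_by_host is a
hypothetical helper name. It does not address how processes are started on
the second node or how each of them gets its own CUDA_VISIBLE_DEVICES.

```python
import re
from collections import defaultdict


def gpus_by_host(lines):
    """Group '<hostname>-gpu<N>' entries into {hostname: [N, ...]}."""
    groups = defaultdict(list)
    for line in lines:
        match = re.fullmatch(r"(.+)-gpu(\d+)", line.strip())
        if match:
            groups[match.group(1)].append(int(match.group(2)))
    return {host: sorted(ids) for host, ids in groups.items()}
```

Each process would then still need to look up the entry for the node it is
actually running on.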

Bill

On 03/18/2013 07:53 AM, Jan-Philip Gehrcke wrote:
> Hello,
>
> I am planning to use Torque for managing a small cluster of nodes
> containing CUDA-capable devices. I have read
>
> http://docs.adaptivecomputing.com/torque/4-0-2/Content/topics/3-nodes/NVIDIAGPGPUs.htm
>
> and
>
> http://docs.adaptivecomputing.com/torque/4-0-2/Content/topics/3-nodes/schedulingGPUs.htm
>
> However, I am still wondering how Torque makes sure that the job runs on
> the CUDA device that it should run on. In other words: does Torque set
> the CUDA_VISIBLE_DEVICES environment variable before executing the
> actual job program? If it does not, what is the best way for the job
> program to retrieve the GPU ID assigned by Torque?
>
> Thanks,
>
> Jan-Philip Gehrcke
> _______________________________________________
> torqueusers mailing list
> torqueusers at supercluster.org
> http://www.supercluster.org/mailman/listinfo/torqueusers
>

