Set the environment variable CUDA_DEVICE_ORDER before starting your CUDA application:
export CUDA_DEVICE_ORDER=PCI_BUS_ID
CUDA will then number the GPUs by PCI bus ID instead of using its default fastest-first ordering.
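As a minimal sketch of how this might look in a shell session: the variable has to be in the environment of the process before CUDA initializes, so it helps to confirm that child processes actually inherit it. The `sh -c` check below is only an illustration and stands in for launching a real CUDA program.

```shell
# Ask CUDA to enumerate GPUs by PCI bus ID for everything started from this shell.
export CUDA_DEVICE_ORDER=PCI_BUS_ID

# Illustrative check: a child process (standing in for a CUDA application)
# should see the exported value.
sh -c 'echo "child sees CUDA_DEVICE_ORDER=$CUDA_DEVICE_ORDER"'
```

To make the setting persistent, the export line can go in a shell startup file such as ~/.bashrc, assuming bash is the login shell.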