allocating shared memory

CUDA supports dynamic shared memory allocation. Declare the shared array inside the kernel with the extern __shared__ qualifier:

__global__ void Kernel(const int count)
{
    extern __shared__ int a[];
}

Then pass the number of bytes required as the third argument of the kernel launch:

Kernel<<< gridDim, blockDim, a_size >>>(count);

This lets the shared array be sized at run time rather than at compile time. Be …
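As a minimal sketch of the pattern above, the kernel below reverses a small array using dynamically sized shared memory. The kernel name Reverse, the array contents, and the launch geometry are illustrative choices, not from the post; the key points are the extern __shared__ declaration and the count * sizeof(int) third launch argument computed at run time.

```cuda
#include <cstdio>

// Reverse an array in place via dynamically allocated shared memory.
__global__ void Reverse(int *d, const int count)
{
    extern __shared__ int a[];   // sized by the launch's third argument
    int t = threadIdx.x;
    a[t] = d[t];
    __syncthreads();             // all elements staged before reading back
    d[t] = a[count - 1 - t];
}

int main()
{
    const int count = 8;
    int h[count];
    for (int i = 0; i < count; ++i) h[i] = i;

    int *d;
    cudaMalloc(&d, count * sizeof(int));
    cudaMemcpy(d, h, count * sizeof(int), cudaMemcpyHostToDevice);

    // Third launch parameter: shared-memory bytes, chosen at run time.
    Reverse<<<1, count, count * sizeof(int)>>>(d, count);

    cudaMemcpy(h, d, count * sizeof(int), cudaMemcpyDeviceToHost);
    cudaFree(d);

    for (int i = 0; i < count; ++i) printf("%d ", h[i]);
    return 0;
}
```

One thread per element with the whole array in a single block keeps the sketch simple; a general version would tile the input across blocks.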