In CUDA, what is memory coalescing, and how is it achieved?
It’s likely that this information applies only to compute capabality 1.x, or cuda 2.0. More recent architectures and cuda 3.0 have more sophisticated global memory access and in fact “coalesced global loads” are not even profiled for these chips. Also, this logic can be applied to shared memory to avoid bank conflicts. A coalesced memory … Read more