What is the purpose of using multiple “arch” flags in Nvidia’s NVCC compiler?
Roughly speaking, the code compilation flow goes like this:

CUDA C/C++ device code source -> PTX -> SASS

The virtual architecture (e.g. compute_20, whatever is specified by -arch compute…) determines what type of PTX code will be generated. The additional switches (e.g. -code sm_21) determine what type of SASS code will be generated. SASS is …
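
To make the pairing of virtual and real architectures concrete, here is a minimal sketch that assembles an nvcc command line with several `-gencode arch=compute_XX,code=sm_YY` entries, one per SASS target, plus an optional entry that embeds PTX (`code=compute_XX`) so the driver can JIT-compile it for GPUs not listed explicitly. The helper name `gencode_flags`, its `ptx_fallback` parameter, and the file names `app.cu`/`app` are made up for illustration; only the `-gencode` flag syntax itself comes from nvcc. The compute_20/sm_21 values mirror the ones in the text above, and I believe recent toolkits have dropped those Fermi-era targets, so substitute architectures that your toolkit supports.

```python
def gencode_flags(targets, ptx_fallback=None):
    """Build nvcc -gencode flags (hypothetical helper, not part of any CUDA tool).

    targets: iterable of (virtual, real) architecture numbers, e.g. (20, 21)
             meaning PTX for compute_20 compiled to SASS for sm_21.
    ptx_fallback: if given, also embed PTX for that virtual architecture
                  so it can be JIT-compiled at load time.
    """
    flags = []
    for virtual, real in targets:
        # One SASS image per (virtual, real) pair.
        flags += ["-gencode", f"arch=compute_{virtual},code=sm_{real}"]
    if ptx_fallback is not None:
        # code=compute_XX keeps the PTX itself in the binary.
        flags += ["-gencode",
                  f"arch=compute_{ptx_fallback},code=compute_{ptx_fallback}"]
    return flags


# Assumed file names; the command is only printed, never executed.
cmd = ["nvcc", *gencode_flags([(20, 20), (20, 21)], ptx_fallback=20),
       "-o", "app", "app.cu"]
print(" ".join(cmd))
```

Each extra `-gencode` entry adds another device-code image to the same fat binary, which is the practical reason for passing multiple arch flags: the runtime picks the SASS that matches the GPU it finds, and falls back to the embedded PTX otherwise.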