nvcc - w3toppers.com

How to disable specific nvcc compiler warnings

It is actually possible to disable specific warnings on the device with NVCC. It took me ages to figure out how to do it. You need to use the -Xcudafe flag combined with a token listed on this page. For example, to disable the “controlling expression is constant” warning, pass the following to NVCC: -Xcudafe …

Why are NVIDIA Pascal GPUs slow at running CUDA kernels when using cudaMallocManaged?

Under CUDA 8 with Pascal GPUs, managed memory data migration under a unified memory (UM) regime will generally occur differently than on previous architectures, and you are experiencing the effects of this. (Also see the note at the end about the updated CUDA 9 behavior on Windows.) With previous architectures (e.g. Maxwell), managed allocations used by a …

CUDA: How to use -arch and -code and SM vs COMPUTE

Some related questions/answers are here and here. I am still not sure how to properly specify the architectures for code generation when building with nvcc. A complete description is somewhat complicated, but there are intended to be relatively simple, easy-to-remember canonical usages. Compile for the architecture (both virtual and real) that represents the GPUs you …

Why does nvcc fail to compile a CUDA file with boost::spirit?

nvcc sometimes has trouble compiling complex template code such as is found in Boost, even if the code is only used in __host__ functions. When a file’s extension is .cpp, nvcc performs no parsing itself and instead forwards the code to the host compiler, which is why you observe different behavior depending on the file …

CUDA and nvcc: using the preprocessor to choose between float or double

It seems you might be conflating two things – how to differentiate between the host and device compilation trajectories when nvcc is processing CUDA code, and how to differentiate between CUDA and non-CUDA code. There is a subtle difference between the two. __CUDA_ARCH__ answers the first question, and __CUDACC__ answers the second. Consider the following …
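As a rough illustration of the distinction (this is not the code elided from the excerpt above), the sketch below shows one way the two macros might be combined with a precision switch. The names `real_t`, `USE_DOUBLE`, `HOST_DEVICE`, `compiled_by_nvcc`, `in_device_pass`, and `scale` are hypothetical and chosen only for this example.

```cpp
#include <cassert>

// Hypothetical switch (an assumption, not from the post):
// build with -DUSE_DOUBLE to select double precision.
#ifdef USE_DOUBLE
typedef double real_t;
#else
typedef float real_t;
#endif

// __CUDACC__ is defined when nvcc is compiling a CUDA source file.
// A plain host compiler does not know the CUDA qualifiers, so hide them.
#ifdef __CUDACC__
#define HOST_DEVICE __host__ __device__
#else
#define HOST_DEVICE
#endif

// CUDA vs. non-CUDA code: true only when nvcc compiles this as CUDA source.
inline bool compiled_by_nvcc() {
#ifdef __CUDACC__
    return true;
#else
    return false;
#endif
}

// Host vs. device trajectory: __CUDA_ARCH__ is defined only during
// nvcc's device compilation pass.
HOST_DEVICE inline bool in_device_pass() {
#ifdef __CUDA_ARCH__
    return true;
#else
    return false;
#endif
}

// Example function usable from both host and device code.
HOST_DEVICE inline real_t scale(real_t x) {
    return x * static_cast<real_t>(2);
}
```

Built with an ordinary host compiler such as g++, both helpers return false and `real_t` is `float` unless `USE_DOUBLE` is defined. Built by nvcc as a `.cu` file, `compiled_by_nvcc()` should return true, while `in_device_pass()` returns true only in the code generated for the device.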

Linking error: DSO missing from command line

Hopefully this will be of help to those, like me, who are new to Linux and don’t find anything related to Linux to be particularly obvious. As noted by talonmies, I am not able to link indirectly and as such need to specify any additional libraries required by the libraries I am using. That is …

What is the purpose of using multiple “arch” flags in Nvidia’s NVCC compiler?

Roughly speaking, the code compilation flow goes like this: CUDA C/C++ device code source --> PTX --> SASS. The virtual architecture (e.g. compute_20, whatever is specified by -arch compute…) determines what type of PTX code will be generated. The additional switches (e.g. -code sm_21) determine what type of SASS code will be generated. SASS is …

What is option -O3 for g++ and nvcc?

-O3 selects optimization level 3, which is basically a shortcut for several other options related to speed optimization (see the links below). I can’t find any documentation on it. … it is one of the best known options: https://gcc.gnu.org/onlinedocs/gcc/Optimize-Options.html http://docs.nvidia.com/cuda/cuda-compiler-driver-nvcc/#options-for-altering-compiler-linker-behavior