cudaMemcpy segmentation fault

I believe I know what the problem is, but to confirm it, it would be useful to see the code that you are using to set up the Grid_dev classes on the device.

When a class or other data structure is to be used on the device, and that class has pointers in it which refer to other objects or buffers in memory (presumably in device memory, for a class that will be used on the device), then the process of making this top-level class usable on the device becomes more complicated.

Suppose I have a class like this:

class myclass{
  public:
    int myval;
    int *myptr;
};

I could instantiate the above class on the host, and then malloc an array of int and assign that pointer to myptr, and everything would be fine. To make this class usable on the device and the device only, the process could be similar. I could:

1. cudaMalloc a pointer to device memory that will hold myclass
2. (optionally) copy an instantiated object of myclass on the host to the device pointer from step 1 using cudaMemcpy
3. on the device, use malloc or new to allocate device storage for myptr
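
The three steps above might look like the following minimal sketch. The kernel names, the driver function run_device_only_example, and the buffer size are assumptions made for illustration; they are not taken from the question. It also assumes a CUDA-capable device and the default device heap size.

```cuda
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

// Same layout as the example class; a struct keeps the members public.
struct myclass {
  int  myval;
  int *myptr;
};

// Step 3: allocate myptr from the device heap, inside device code.
__global__ void alloc_and_fill(myclass *obj, int n) {
  obj->myptr = static_cast<int *>(malloc(n * sizeof(int)));
  if (obj->myptr == nullptr) return;  // device heap exhausted
  for (int i = 0; i < n; ++i) obj->myptr[i] = obj->myval + i;
}

// A later kernel can still use the buffer; host code never touches it.
__global__ void sum_and_free(myclass *obj, int n, int *result) {
  if (obj->myptr == nullptr) { *result = -1; return; }
  int sum = 0;
  for (int i = 0; i < n; ++i) sum += obj->myptr[i];
  *result = sum;
  free(obj->myptr);  // must be freed in device code, not with cudaFree
  obj->myptr = nullptr;
}

// Hypothetical driver: returns the sum computed on the device, or -1 on error.
int run_device_only_example() {
  const int n = 4;
  myclass host_obj = {10, nullptr};
  myclass *mydevobj = nullptr;
  int *d_result = nullptr;
  int h_result = -1;

  // Step 1: device storage for the object itself.
  if (cudaMalloc((void **)&mydevobj, sizeof(myclass)) != cudaSuccess) return -1;
  if (cudaMalloc((void **)&d_result, sizeof(int)) != cudaSuccess) return -1;
  // Step 2: copy the host instance to the device.
  if (cudaMemcpy(mydevobj, &host_obj, sizeof(myclass),
                 cudaMemcpyHostToDevice) != cudaSuccess) return -1;

  alloc_and_fill<<<1, 1>>>(mydevobj, n);
  sum_and_free<<<1, 1>>>(mydevobj, n, d_result);
  if (cudaDeviceSynchronize() != cudaSuccess) return -1;

  if (cudaMemcpy(&h_result, d_result, sizeof(int),
                 cudaMemcpyDeviceToHost) != cudaSuccess) h_result = -1;
  cudaFree(d_result);
  cudaFree(mydevobj);
  return h_result;
}

int main() {
  // With a working device this sums 10 + 11 + 12 + 13.
  printf("device-side sum = %d\n", run_device_only_example());
  return 0;
}
```

The buffer behind myptr comes from the device heap, so it cannot be passed to cudaMemcpy or cudaFree from the host; only the small result value is copied back here.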

The above sequence is fine if I never want to access the storage allocated for myptr on the host. But if I do want that storage to be visible from the host, I need a different sequence:

1. cudaMalloc a pointer to device memory that will hold myclass, let’s call this mydevobj
2. (optionally) copy an instantiated object of myclass on the host to the device pointer mydevobj from step 1 using cudaMemcpy
3. Create a separate int pointer on the host, let’s call it myhostptr
4. cudaMalloc int storage on the device for myhostptr
5. cudaMemcpy the pointer value of myhostptr from the host to the device pointer &(mydevobj->myptr)

After that, you can cudaMemcpy data to or from the storage that the embedded pointer myptr refers to, by passing myhostptr (which holds the same device address, obtained from cudaMalloc) as the device-side argument.

Note that in step 5, because I am taking the address of this pointer location, this cudaMemcpy operation only requires the mydevobj pointer on the host, which is valid in a cudaMemcpy operation (only).
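
As a rough sketch of the five-step sequence, something like the following should work. The CHECK macro, the scale kernel, the function run_host_visible_example, and the sample values are hypothetical names and data chosen for illustration, and a CUDA-capable device is assumed.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

struct myclass {
  int  myval;
  int *myptr;
};

// Hypothetical error-check helper: bail out of the calling function with -1.
#define CHECK(call)                                              \
  do {                                                           \
    cudaError_t err_ = (call);                                   \
    if (err_ != cudaSuccess) {                                   \
      fprintf(stderr, "CUDA error: %s\n", cudaGetErrorString(err_)); \
      return -1;                                                 \
    }                                                            \
  } while (0)

// Device code may dereference obj and obj->myptr freely.
__global__ void scale(myclass *obj, int n) {
  for (int i = 0; i < n; ++i) obj->myptr[i] *= obj->myval;
}

// Returns the sum of the scaled data, or -1 on error.
int run_host_visible_example() {
  const int n = 4;
  int host_data[n] = {1, 2, 3, 4};
  myclass host_obj = {3, nullptr};

  // Step 1: device storage for the object.
  myclass *mydevobj = nullptr;
  CHECK(cudaMalloc((void **)&mydevobj, sizeof(myclass)));
  // Step 2: copy the host instance (myptr is still null at this point).
  CHECK(cudaMemcpy(mydevobj, &host_obj, sizeof(myclass), cudaMemcpyHostToDevice));
  // Steps 3 and 4: a host-side pointer variable holding a device allocation.
  int *myhostptr = nullptr;
  CHECK(cudaMalloc((void **)&myhostptr, n * sizeof(int)));
  // Step 5: copy the pointer VALUE into the embedded member on the device.
  // &(mydevobj->myptr) only computes an address; nothing is read on the host.
  CHECK(cudaMemcpy(&(mydevobj->myptr), &myhostptr, sizeof(int *),
                   cudaMemcpyHostToDevice));

  // Data transfers use myhostptr, never mydevobj->myptr.
  CHECK(cudaMemcpy(myhostptr, host_data, n * sizeof(int), cudaMemcpyHostToDevice));
  scale<<<1, 1>>>(mydevobj, n);
  CHECK(cudaGetLastError());
  CHECK(cudaDeviceSynchronize());
  CHECK(cudaMemcpy(host_data, myhostptr, n * sizeof(int), cudaMemcpyDeviceToHost));

  CHECK(cudaFree(myhostptr));
  CHECK(cudaFree(mydevobj));
  return host_data[0] + host_data[1] + host_data[2] + host_data[3];
}

int main() {
  // With a working device the data becomes {3, 6, 9, 12}.
  printf("sum after scaling = %d\n", run_host_visible_example());
  return 0;
}
```

Keeping myhostptr around on the host is the design choice that matters here: it is the only handle host code can legally use for later copies and for cudaFree.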

The value of the device pointer myptr will then be properly set up to do the operations you are trying to do. If you then want to cudaMemcpy data between the host and the storage behind myptr, you use the pointer myhostptr in any cudaMemcpy calls, not mydevobj->myptr. If we tried to use mydevobj->myptr, it would require dereferencing mydevobj on the host to retrieve the pointer that is stored in myptr, and then using that pointer as the copy to/from location. This is not acceptable in host code, because mydevobj refers to device memory. If you try to do it, you will get a seg fault. (Note that by way of analogy, my mydevobj is like your Grid_dev and my myptr is like your cdata.)
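
To make the contrast concrete, here is a hedged sketch of copying data back. The helper fetch_embedded_ptr is a hypothetical addition, not part of the steps above: it shows one way to recover the pointer value if myhostptr was not kept, by copying it out of the device object with cudaMemcpy instead of dereferencing mydevobj on the host. The CHECK macro and run_copy_back_example are likewise assumed names.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

struct myclass {
  int  myval;
  int *myptr;
};

// Hypothetical error-check helper: bail out of the calling function with -1.
#define CHECK(call)                                              \
  do {                                                           \
    cudaError_t err_ = (call);                                   \
    if (err_ != cudaSuccess) {                                   \
      fprintf(stderr, "CUDA error: %s\n", cudaGetErrorString(err_)); \
      return -1;                                                 \
    }                                                            \
  } while (0)

// Hypothetical helper: fetch the pointer value stored in the device object.
// &(mydevobj->myptr) is only an address computation, so the host never
// reads device memory directly.
cudaError_t fetch_embedded_ptr(const myclass *mydevobj, int **out) {
  return cudaMemcpy(out, &(mydevobj->myptr), sizeof(int *),
                    cudaMemcpyDeviceToHost);
}

// Returns 0 if the round trip and the pointer recovery both succeed.
int run_copy_back_example() {
  const int n = 4;
  const int src[n] = {5, 6, 7, 8};
  int dst[n] = {0, 0, 0, 0};

  myclass *mydevobj = nullptr;
  int *myhostptr = nullptr;
  CHECK(cudaMalloc((void **)&mydevobj, sizeof(myclass)));
  CHECK(cudaMalloc((void **)&myhostptr, n * sizeof(int)));
  CHECK(cudaMemcpy(&(mydevobj->myptr), &myhostptr, sizeof(int *),
                   cudaMemcpyHostToDevice));
  CHECK(cudaMemcpy(myhostptr, src, n * sizeof(int), cudaMemcpyHostToDevice));

  // Wrong: evaluating mydevobj->myptr reads device memory from host code.
  // cudaMemcpy(dst, mydevobj->myptr, n * sizeof(int), cudaMemcpyDeviceToHost);

  // Right: use the host-side copy of the device pointer.
  CHECK(cudaMemcpy(dst, myhostptr, n * sizeof(int), cudaMemcpyDeviceToHost));

  // If myhostptr had been lost, its value can be copied back out instead.
  int *recovered = nullptr;
  CHECK(fetch_embedded_ptr(mydevobj, &recovered));

  bool ok = (recovered == myhostptr);
  for (int i = 0; i < n; ++i) ok = ok && (dst[i] == src[i]);

  CHECK(cudaFree(myhostptr));
  CHECK(cudaFree(mydevobj));
  return ok ? 0 : 1;
}

int main() {
  printf("copy-back example %s\n",
         run_copy_back_example() == 0 ? "succeeded" : "failed");
  return 0;
}
```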

Overall it is a concept that requires some careful thought the first time you run into it, and so questions like this come up with some frequency on Stack Overflow. You may want to study some of those questions to see code examples (since you haven’t provided your code that sets up Grid_dev).
