Problem description
I've searched all over for some insight on how exactly to use classes with CUDA, and while there is a general consensus that it can be done and apparently is being done by people, I've had a hard time finding out how to actually do it.
I have a class which implements a basic bitset with operator overloading and the like. I need to be able to instantiate objects of this class on both the host and the device, copy between the two, etc. Do I define this class in a .cu file? If so, how do I use it in my host-side C++ code? The functions of the class do not need to access special CUDA variables like threadIdx; the class just needs to be usable on both the host and the device side.
Thanks for any help, and if I'm approaching this in completely the wrong way, I'd love to hear alternatives.
Define the class in a header that you #include, just like in C++.
Any method that must be called from device code should be defined with both the __device__ and __host__ qualifiers, including the constructor and destructor if you plan to use new/delete on the device (note that new/delete require CUDA 4.0 and a GPU of compute capability 2.0 or higher).
You probably want to define a macro like this:
#ifdef __CUDACC__
#define CUDA_CALLABLE_MEMBER __host__ __device__
#else
#define CUDA_CALLABLE_MEMBER
#endif
Then use this macro on your member functions:
class Foo {
public:
    CUDA_CALLABLE_MEMBER Foo() {}
    CUDA_CALLABLE_MEMBER ~Foo() {}
    CUDA_CALLABLE_MEMBER void aMethod() {}
};
The reason for this is that only the CUDA compiler knows __device__ and __host__; your host C++ compiler will raise an error if it sees them.
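Applied to the bitset from the question, the macro approach might look like the sketch below. The class name Bitset64, its 64-bit storage, and the chosen operators are assumptions made for illustration, not the asker's actual class. The header should compile unchanged with a plain C++ compiler (the macro expands to nothing) and with NVCC (every member becomes __host__ __device__).

```cpp
#include <cassert>
#include <cstdint>

#ifdef __CUDACC__
#define CUDA_CALLABLE_MEMBER __host__ __device__
#else
#define CUDA_CALLABLE_MEMBER
#endif

// Hypothetical 64-bit bitset; every member is callable from host and device code.
class Bitset64 {
public:
    CUDA_CALLABLE_MEMBER Bitset64() : bits_(0) {}
    CUDA_CALLABLE_MEMBER explicit Bitset64(std::uint64_t bits) : bits_(bits) {}

    // Turn on the bit at position pos (0..63).
    CUDA_CALLABLE_MEMBER void set(unsigned pos) {
        assert(pos < 64);
        bits_ |= (std::uint64_t(1) << pos);
    }

    // Report whether the bit at position pos (0..63) is on.
    CUDA_CALLABLE_MEMBER bool test(unsigned pos) const {
        assert(pos < 64);
        return ((bits_ >> pos) & 1u) != 0;
    }

    // Bitwise union and intersection return new objects by value.
    CUDA_CALLABLE_MEMBER Bitset64 operator|(const Bitset64& o) const {
        return Bitset64(bits_ | o.bits_);
    }
    CUDA_CALLABLE_MEMBER Bitset64 operator&(const Bitset64& o) const {
        return Bitset64(bits_ & o.bits_);
    }
    CUDA_CALLABLE_MEMBER bool operator==(const Bitset64& o) const {
        return bits_ == o.bits_;
    }

private:
    std::uint64_t bits_;  // plain integer storage, so the object can be copied byte for byte
};
```

Because the only data member is a plain integer, an object of this type can be passed to a kernel by value or copied with cudaMemcpy without any special handling.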
Edit: Note that __CUDACC__ is defined by NVCC when it is compiling CUDA files. This happens either when compiling a .cu file with NVCC or when compiling any file with the command-line option -x cu.
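The same __CUDACC__ check can also guard code that only makes sense under NVCC, such as a kernel and the copies between host and device. The sketch below is one possible arrangement under stated assumptions: the names Flags, setBitKernel, and setBit are hypothetical, error checking on the CUDA runtime calls is omitted for brevity, and the type is kept trivially copyable so that cudaMemcpy can move it as raw bytes.

```cpp
#include <cassert>
#include <cstdint>

#ifdef __CUDACC__
#include <cuda_runtime.h>
#define CUDA_CALLABLE_MEMBER __host__ __device__
#else
#define CUDA_CALLABLE_MEMBER
#endif

// Hypothetical trivially copyable type with members usable on host and device.
struct Flags {
    std::uint32_t bits;

    CUDA_CALLABLE_MEMBER void set(unsigned pos) {
        assert(pos < 32);
        bits |= (1u << pos);
    }
    CUDA_CALLABLE_MEMBER bool test(unsigned pos) const {
        assert(pos < 32);
        return ((bits >> pos) & 1u) != 0;
    }
};

#ifdef __CUDACC__
// Only NVCC compiles this kernel; a host compiler never sees it.
__global__ void setBitKernel(Flags* f, unsigned pos) { f->set(pos); }
#endif

// Sets one bit: on the GPU when built with NVCC, on the CPU otherwise.
inline Flags setBit(Flags in, unsigned pos) {
#ifdef __CUDACC__
    Flags* d = nullptr;
    cudaMalloc(reinterpret_cast<void**>(&d), sizeof(Flags));    // return codes ignored in this sketch
    cudaMemcpy(d, &in, sizeof(Flags), cudaMemcpyHostToDevice);  // host object -> device copy
    setBitKernel<<<1, 1>>>(d, pos);                             // member function runs on the device
    cudaMemcpy(&in, d, sizeof(Flags), cudaMemcpyDeviceToHost);  // device copy -> host object
    cudaFree(d);
#else
    in.set(pos);  // same member function, compiled by the host compiler
#endif
    return in;
}
```

Saved as a .cu file, or compiled with -x cu, this takes the GPU path; compiled by an ordinary C++ compiler, it takes the host path with the same member functions.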