Copyright notice: This is an original article by the blogger. Please do not repost it without the author's permission. https://blog.csdn.net/heiheiya https://blog.csdn.net/heiheiya/article/details/82014822
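To find out how much memory a device has and which features it supports, you need to query the device. cudaGetDeviceCount() returns the number of CUDA devices, so you can iterate over them and query each one with cudaGetDeviceProperties(). The CUDA runtime fills in a structure of type cudaDeviceProp, which contains the device's properties.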
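struct cudaDeviceProp {
    char name[256];                 // ASCII string identifying the device
    size_t totalGlobalMem;          // Global memory on the device, in bytes
    size_t sharedMemPerBlock;       // Shared memory available per block, in bytes
    int regsPerBlock;               // 32-bit registers available per block
    int warpSize;                   // Number of threads in a warp
    size_t memPitch;                // Maximum pitch allowed for memory copies, in bytes
    int maxThreadsPerBlock;         // Maximum number of threads per block
    int maxThreadsDim[3];           // Maximum size of each dimension of a block
    int maxGridSize[3];             // Maximum size of each dimension of a grid
    size_t totalConstMem;           // Constant memory available, in bytes
    int major;                      // Major compute capability number
    int minor;                      // Minor compute capability number
    int clockRate;                  // Clock frequency, in kilohertz
    size_t textureAlignment;        // Alignment requirement for textures
    int deviceOverlap;              // Nonzero if the device can overlap a memory copy with kernel execution
    int multiProcessorCount;        // Number of multiprocessors on the device
    int kernelExecTimeoutEnabled;   // Nonzero if kernels are subject to a run-time limit
    int integrated;                 // Nonzero if the device is an integrated GPU
    int canMapHostMemory;           // Nonzero if the device can map host memory into the CUDA address space
    int computeMode;                // Compute mode of the device (default, exclusive, or prohibited)
    int maxTexture1D;               // Maximum size of 1D textures
    int maxTexture2D[2];            // Maximum dimensions of 2D textures
    int maxTexture3D[3];            // Maximum dimensions of 3D textures
    int maxTexture2DArray[3];       // Maximum dimensions of 2D texture arrays
    int concurrentKernels;          // Nonzero if the device can run multiple kernels concurrently
};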
The following code iterates over the devices and prints some of these properties:
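#include <stdio.h>
#include <cuda_runtime.h>
int main(void)
{
    cudaDeviceProp prop;
    int count = 0;
    // Ask the runtime how many CUDA-capable devices are present.
    cudaError_t err = cudaGetDeviceCount(&count);
    if (err != cudaSuccess) {
        fprintf(stderr, "cudaGetDeviceCount failed: %s\n", cudaGetErrorString(err));
        return 1;
    }
    for (int i = 0; i < count; i++)
    {
        // Fill prop with the properties of device i.
        err = cudaGetDeviceProperties(&prop, i);
        if (err != cudaSuccess) {
            fprintf(stderr, "cudaGetDeviceProperties failed: %s\n", cudaGetErrorString(err));
            continue;
        }
        printf("------------Device %d--------------\n", i);
        printf("Name: %s\n", prop.name);
        printf("Compute capability: %d.%d\n", prop.major, prop.minor);
        printf("Clock rate: %d kHz\n", prop.clockRate);
        printf("Total global memory: %zu bytes\n", prop.totalGlobalMem);
        printf("Total constant memory: %zu bytes\n", prop.totalConstMem);
        printf("Shared memory per block: %zu bytes\n", prop.sharedMemPerBlock);
        printf("Registers per block: %d\n", prop.regsPerBlock);
        printf("Threads in warp: %d\n", prop.warpSize);
        printf("Max threads per block: %d\n", prop.maxThreadsPerBlock);
    }
    return 0;
}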
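Once the properties have been read, a common next step is to decide whether a device is suitable for a given job. The sketch below is one possible way to do that on the host side. It is not part of the CUDA runtime: DeviceSummary, meetsCapability and pickDevice are hypothetical names introduced only for this illustration, and DeviceSummary is assumed to be filled by copying major, minor and totalGlobalMem out of each cudaDeviceProp.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical plain-value copy of the few cudaDeviceProp fields used here.
struct DeviceSummary {
    int major;                   // copied from cudaDeviceProp::major
    int minor;                   // copied from cudaDeviceProp::minor
    std::size_t totalGlobalMem;  // copied from cudaDeviceProp::totalGlobalMem, in bytes
};

// Returns true when the device's compute capability is at least reqMajor.reqMinor.
bool meetsCapability(const DeviceSummary& d, int reqMajor, int reqMinor) {
    assert(reqMajor >= 0 && reqMinor >= 0);
    if (d.major != reqMajor) {
        return d.major > reqMajor;
    }
    return d.minor >= reqMinor;
}

// Returns the index of the first device that satisfies both the compute
// capability and the global memory requirement, or -1 if none does.
int pickDevice(const DeviceSummary* devices, int count,
               int reqMajor, int reqMinor, std::size_t minGlobalMem) {
    for (int i = 0; i < count; ++i) {
        if (meetsCapability(devices[i], reqMajor, reqMinor) &&
            devices[i].totalGlobalMem >= minGlobalMem) {
            return i;
        }
    }
    return -1;
}
```

In a real program the index returned by pickDevice would typically be passed to cudaSetDevice() before any memory is allocated or any kernel is launched.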