CUDA kernel launch parameters explained: <<<Dg, Db, Ns, S>>>

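A kernel launch takes four execution configuration parameters: the grid dimensions, the block dimensions, the number of bytes of shared memory dynamically allocated per block, and the CUDA stream.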

kernel<<<dim_grid, dim_block, num_bytes_in_SharedMem, stream>>>

The following is taken from the CUDA C Programming Guide for CUDA 8.0.

http://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html

It is covered in section B.21, Execution Configuration:

The execution configuration is specified by inserting an expression of the form <<<Dg, Db, Ns, S>>> between the function name and the parenthesized argument list, where:

  • Dg is of type dim3 (see dim3) and specifies the dimension and size of the grid, such that Dg.x * Dg.y * Dg.z equals the number of blocks being launched;

  • Db is of type dim3 (see dim3) and specifies the dimension and size of each block, such that Db.x * Db.y * Db.z equals the number of threads per block;

  • Ns is of type size_t and specifies the number of bytes in shared memory that is dynamically allocated per block for this call in addition to the statically allocated memory; this dynamically allocated memory is used by any of the variables declared as an external array as mentioned in __shared__; Ns is an optional argument which defaults to 0;

  • S is of type cudaStream_t and specifies the associated stream; S is an optional argument which defaults to 0.
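
To make the four parameters concrete, here is a minimal sketch of a launch that sets all of them. The kernel reverse_in_block, the helper run_reverse_demo, and the sizes (1024 elements, 256 threads per block) are made up for illustration and do not come from the guide. The sketch assumes the element count is an exact multiple of the block size.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical kernel: each block stages its elements in dynamically sized
// shared memory, then writes them back in reverse order within the block.
// Assumes n is a multiple of blockDim.x.
__global__ void reverse_in_block(const float* in, float* out, int n)
{
    extern __shared__ float tile[];  // byte size is given by Ns at launch
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) tile[threadIdx.x] = in[i];
    __syncthreads();                 // reached by every thread in the block
    if (i < n) out[i] = tile[blockDim.x - 1 - threadIdx.x];
}

// Hypothetical helper: runs the kernel once and checks the result on the host.
bool run_reverse_demo()
{
    const int n = 1024;
    const int threads = 256;
    const size_t bytes = n * sizeof(float);

    static float h_in[n], h_out[n];
    for (int i = 0; i < n; ++i) h_in[i] = (float)i;

    float *d_in = nullptr, *d_out = nullptr;
    cudaStream_t stream;
    if (cudaStreamCreate(&stream) != cudaSuccess) return false;
    if (cudaMalloc(&d_in, bytes) != cudaSuccess) return false;
    if (cudaMalloc(&d_out, bytes) != cudaSuccess) return false;

    cudaMemcpyAsync(d_in, h_in, bytes, cudaMemcpyHostToDevice, stream);

    dim3 Dg(n / threads);                 // grid: 4 blocks (Dg.y = Dg.z = 1)
    dim3 Db(threads);                     // block: 256 threads (Db.y = Db.z = 1)
    size_t Ns = threads * sizeof(float);  // dynamic shared memory per block
    reverse_in_block<<<Dg, Db, Ns, stream>>>(d_in, d_out, n);
    bool ok = (cudaGetLastError() == cudaSuccess);

    cudaMemcpyAsync(h_out, d_out, bytes, cudaMemcpyDeviceToHost, stream);
    ok = ok && (cudaStreamSynchronize(stream) == cudaSuccess);

    // Element t of block b should now hold element (threads - 1 - t) of block b.
    for (int i = 0; ok && i < n; ++i) {
        int b = i / threads, t = i % threads;
        if (h_out[i] != h_in[b * threads + (threads - 1 - t)]) ok = false;
    }

    cudaFree(d_in);
    cudaFree(d_out);
    cudaStreamDestroy(stream);
    return ok;
}

int main()
{
    printf("%s\n", run_reverse_demo() ? "ok" : "mismatch or CUDA error");
    return 0;
}
```

Because Ns and S are optional and default to 0, a kernel that uses no dynamic shared memory and runs on the default stream can be launched with just <<<Dg, Db>>>.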


Reposted from blog.csdn.net/u010454261/article/details/78287472