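The following is taken from the CUDA C Programming Guide.
Note: when the function completes, the result of the operation is stored at the location pointed to by the first argument, and the return value is the old value (the value held there before the operation).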
B.12.1.1. atomicAdd()
int atomicAdd(int* address, int val);
unsigned int atomicAdd(unsigned int* address, unsigned int val);
unsigned long long int atomicAdd(unsigned long long int* address, unsigned long long int val);
float atomicAdd(float* address, float val);
double atomicAdd(double* address, double val);
reads the 32-bit or 64-bit word old located at the address address in global or shared memory, computes (old + val), and stores the result back to memory at the same address. These three operations are performed in one atomic transaction. The function returns old.
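As a minimal sketch of the read, add, and store behaviour described above, the kernel below has every thread add 1 to a shared counter and keep the value that atomicAdd() returns. The names claimSlots and runClaimSlots are hypothetical and chosen only for this illustration. Because the function returns old, each thread should receive a distinct value in 0..n-1, and the counter should end at n.

```cuda
#include <cuda_runtime.h>

// Each thread adds 1 to *counter and records the value it saw before its add.
__global__ void claimSlots(int* counter, int* oldValues, int n)
{
    int tid = blockIdx.x * blockDim.x + threadIdx.x;
    if (tid < n) {
        // atomicAdd returns the value stored at counter BEFORE this add.
        oldValues[tid] = atomicAdd(counter, 1);
    }
}

// Hypothetical host helper (not part of CUDA): launches n threads, copies the
// per-thread old values into hostOld, and returns the final counter value,
// or -1 if a CUDA call fails.
int runClaimSlots(int n, int* hostOld)
{
    int* dCounter = nullptr;
    int* dOld = nullptr;
    int result = -1;

    if (cudaMalloc(&dCounter, sizeof(int)) != cudaSuccess) return -1;
    if (cudaMalloc(&dOld, n * sizeof(int)) != cudaSuccess) {
        cudaFree(dCounter);
        return -1;
    }
    cudaMemset(dCounter, 0, sizeof(int));  // counter starts at 0

    int threads = 128;
    int blocks = (n + threads - 1) / threads;
    claimSlots<<<blocks, threads>>>(dCounter, dOld, n);

    // cudaMemcpy waits for the kernel to finish before copying.
    if (cudaMemcpy(&result, dCounter, sizeof(int),
                   cudaMemcpyDeviceToHost) != cudaSuccess ||
        cudaMemcpy(hostOld, dOld, n * sizeof(int),
                   cudaMemcpyDeviceToHost) != cudaSuccess) {
        result = -1;
    }

    cudaFree(dCounter);
    cudaFree(dOld);
    return result;
}
```

Calling runClaimSlots(1000, buffer) from host code should return 1000 and fill buffer with the values 0 to 999 in some order. The order is not deterministic, since it depends on how the hardware schedules the threads.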
The 32-bit floating-point version of atomicAdd() is only supported by devices of compute capability 2.x and higher.
The 64-bit floating-point version of atomicAdd() is only supported by devices of compute capability 6.x and higher.
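On devices below compute capability 6.x, a double-precision atomic add is commonly emulated with a compare-and-swap loop built on atomicCAS(). The sketch below follows that pattern. The names atomicAddDouble, sumOnes and runSumOnes are hypothetical: the first is named so that it does not clash with the built-in overload, and the other two exist only to exercise it.

```cuda
#include <cuda_runtime.h>

// Hypothetical name, chosen to avoid clashing with the built-in
// double overload of atomicAdd on newer devices.
__device__ double atomicAddDouble(double* address, double val)
{
    unsigned long long int* addressAsUll = (unsigned long long int*)address;
    unsigned long long int old = *addressAsUll;
    unsigned long long int assumed;
    do {
        assumed = old;
        // Store (assumed + val) only if memory still holds 'assumed';
        // atomicCAS returns the value that was actually in memory.
        old = atomicCAS(addressAsUll, assumed,
                        __double_as_longlong(val + __longlong_as_double(assumed)));
        // Compare as integers so that a NaN cannot keep the loop spinning
        // (NaN != NaN would always be true in a floating-point comparison).
    } while (assumed != old);
    return __longlong_as_double(old);  // like atomicAdd, return the old value
}

// Every thread adds 1.0 to the same accumulator.
__global__ void sumOnes(double* total, int n)
{
    int tid = blockIdx.x * blockDim.x + threadIdx.x;
    if (tid < n) {
        atomicAddDouble(total, 1.0);
    }
}

// Hypothetical host helper: returns the accumulated sum, or -1.0 on error.
double runSumOnes(int n)
{
    double* dTotal = nullptr;
    double result = -1.0;
    if (cudaMalloc(&dTotal, sizeof(double)) != cudaSuccess) return -1.0;
    cudaMemset(dTotal, 0, sizeof(double));  // all-zero bytes represent 0.0

    int threads = 128;
    int blocks = (n + threads - 1) / threads;
    sumOnes<<<blocks, threads>>>(dTotal, n);

    if (cudaMemcpy(&result, dTotal, sizeof(double),
                   cudaMemcpyDeviceToHost) != cudaSuccess) {
        result = -1.0;
    }
    cudaFree(dTotal);
    return result;
}
```

If a retry loop like this is used, each failed comparison means that another thread updated the value in between, so the add is recomputed from the fresh value. With heavy contention on a single address this can be noticeably slower than the native instruction available on 6.x and higher.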