Built-in Functions for Memory Model Aware Atomic Operations

The following built-in functions approximately match the requirements of the C++11 memory model. They are all identified by the prefix '__atomic', and most are overloaded so that they can be used with multiple types.

These functions are intended to replace the legacy "__sync" built-in functions. The main difference is that the requested memory order is a parameter of the function. New code should always use the "__atomic" built-ins rather than the "__sync" built-ins.

Note that the "__atomic" built-ins assume that the program conforms to the C++11 memory model. In particular, they assume that the program is free of data races. See the C++11 standard for details.

The "__atomic" built-ins can be used with any integral scalar or pointer type that is 1, 2, 4, or 8 bytes in length. If the architecture supports '__int128', 16-byte integral types are also allowed.

The four non-arithmetic functions (load, store, exchange, and compare_exchange) also have a generic version. The generic version works on any data type. If the size of the data type makes it possible, it uses the lock-free built-in function; otherwise, an external call is left to be resolved at run time. This external call has the same format with an additional "size_t" parameter, inserted as the first parameter, that gives the size of the object being pointed to. All objects must be the same size.

An atomic operation can specify one of six memory orders. They map to the C++11 memory orders of the same names; see the C++11 standard or the GCC wiki on atomic synchronization for detailed definitions. Individual targets may also support additional memory orders for specific architectures. See the target documentation for details.

An atomic operation can both constrain code motion and be mapped to hardware instructions (for example, fences) used for synchronization between threads. The extent to which this happens is controlled by the memory order. The memory orders are listed here in approximately ascending order of strength. The description of each memory order only roughly illustrates its effect and is not a specification; see the C++11 memory model for the exact semantics.

Memory order Explanation
__ATOMIC_RELAXED Implies no inter-thread ordering constraints.
__ATOMIC_CONSUME Currently implemented using the stronger __ATOMIC_ACQUIRE memory order because of a deficiency in the C++11 semantics for memory_order_consume.
__ATOMIC_ACQUIRE Creates an inter-thread happens-before constraint from the release (or stronger) store to this acquire load. Can prevent code from being hoisted to before the operation.
__ATOMIC_RELEASE Creates an inter-thread happens-before constraint to acquire (or stronger) loads that read from this release store. Can prevent code from being sunk to after the operation.
__ATOMIC_ACQ_REL Combines the effects of __ATOMIC_ACQUIRE and __ATOMIC_RELEASE.
__ATOMIC_SEQ_CST Enforces a total ordering with all other __ATOMIC_SEQ_CST operations.
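
As a rough illustration of how an acquire load pairs with a release store, here is a minimal sketch of message passing between two threads. The names payload, ready, producer, consumer, and run_message_passing are hypothetical and exist only for this example; it assumes POSIX threads are available, as in the examples at the end of this article.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

static int payload;   /* ordinary, non-atomic data */
static int ready;     /* flag, accessed only through __atomic built-ins */

static void *producer(void *arg)
{
    (void)arg;
    payload = 42;                                   /* plain store */
    __atomic_store_n(&ready, 1, __ATOMIC_RELEASE);  /* publish the payload */
    return NULL;
}

static void *consumer(void *arg)
{
    /* The ACQUIRE load pairs with the RELEASE store in producer(). */
    while (!__atomic_load_n(&ready, __ATOMIC_ACQUIRE))
        ;
    *(int *)arg = payload;   /* ordered after the flag was observed */
    return NULL;
}

/* Hypothetical driver: returns the payload value the consumer observed. */
int run_message_passing(void)
{
    pthread_t prod, cons;
    int seen = 0;
    int rc;

    payload = 0;
    __atomic_store_n(&ready, 0, __ATOMIC_RELAXED);

    rc = pthread_create(&cons, NULL, consumer, &seen);
    assert(rc == 0);
    rc = pthread_create(&prod, NULL, producer, NULL);
    assert(rc == 0);

    pthread_join(prod, NULL);
    pthread_join(cons, NULL);
    return seen;
}
```

If both operations on ready were __ATOMIC_RELAXED instead, the flag itself would still be read and written atomically, but nothing would order the plain accesses to payload around it.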

Note that in the C++11 memory model, fences (for example, '__atomic_thread_fence') take effect in combination with other atomic operations on specific memory locations (for example, atomic loads); operations on specific memory locations do not necessarily affect other operations in the same way.

Target architectures are encouraged to provide their own patterns for each of the atomic built-in functions. If no target pattern is provided, the original non-memory-model "__sync" atomic built-in functions are used, together with any synchronization fences needed around them, to achieve the correct behavior. In this case, execution is subject to the same restrictions as those built-in functions.

If there is no pattern or mechanism to provide a lock-free instruction sequence, a call is made to an external routine with the same parameters, to be resolved at run time.

When implementing patterns for these built-in functions, the memory order parameter can be ignored as long as the pattern implements the most restrictive __ATOMIC_SEQ_CST memory order. Any of the other memory orders execute correctly with this memory order, but they may not execute as efficiently as they could with an implementation that takes advantage of the more relaxed requirements.

Note that the C++11 standard allows the memory order parameter to be determined at run time rather than at compile time. These built-in functions map any run-time value to __ATOMIC_SEQ_CST rather than invoke a runtime library call or inline a switch statement. This is standard-compliant, safe, and the simplest approach for now. The memory order parameter is a signed int, but only the lower 16 bits are reserved for the memory order. The remainder of the int is reserved for target use and should be 0. Using the predefined atomic values ensures proper usage.

type __atomic_load_n (type *ptr, int memorder)

This built-in function implements an atomic load operation. It returns the contents of *ptr. The valid memory order variants are __ATOMIC_RELAXED, __ATOMIC_SEQ_CST, __ATOMIC_ACQUIRE, and __ATOMIC_CONSUME.

void __atomic_load (type *ptr, type *ret, int memorder)

This is the generic version of an atomic load. It returns the contents of *ptr in *ret.

void __atomic_store_n (type *ptr, type val, int memorder)

This built-in function implements an atomic store operation. It writes val into *ptr. The valid memory order variants are __ATOMIC_RELAXED, __ATOMIC_SEQ_CST, and __ATOMIC_RELEASE.

void __atomic_store (type *ptr, type *val, int memorder)

This is the generic version of an atomic store. It stores the value of *val into *ptr.
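
To show how the generic forms take pointers for both the object and the value, here is a small sketch that stores and loads a whole structure in one atomic step. The type struct pair and the functions pair_publish and pair_snapshot are hypothetical. The sketch assumes a 4-byte int and forces 8-byte alignment so that, on common 64-bit targets, the compiler can use a single lock-free 8-byte access instead of an external call.

```c
#include <assert.h>

/* Hypothetical 8-byte record, aligned so one 8-byte access can cover it. */
struct pair {
    _Alignas(8) int x;
    int y;
};
static_assert(sizeof(struct pair) == 8, "struct pair is expected to be 8 bytes");

static struct pair shared;

/* Replace both fields of the shared record atomically. */
void pair_publish(int x, int y)
{
    struct pair tmp = { x, y };
    __atomic_store(&shared, &tmp, __ATOMIC_RELEASE);
}

/* Read both fields as one consistent snapshot. */
struct pair pair_snapshot(void)
{
    struct pair out;
    __atomic_load(&shared, &out, __ATOMIC_ACQUIRE);
    return out;
}
```

A reader calling pair_snapshot never sees x from one pair_publish call combined with y from another, which two separate plain int accesses could not guarantee.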

type __atomic_exchange_n (type *ptr, type val, int memorder)

This built-in function implements an atomic exchange operation. It writes val into *ptr and returns the previous contents of *ptr. The valid memory order variants are __ATOMIC_RELAXED, __ATOMIC_SEQ_CST, __ATOMIC_ACQUIRE, __ATOMIC_RELEASE, and __ATOMIC_ACQ_REL.
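
One common use of an atomic exchange is to take a value and reset its slot in a single step. The sketch below is only an illustration; the one-slot mailbox and the functions mailbox_post and mailbox_take are hypothetical names, and 0 is assumed to mean "empty".

```c
#include <assert.h>

static int mailbox;   /* 0 means "empty" */

/* Post a nonzero message, overwriting any message not yet taken. */
void mailbox_post(int msg)
{
    assert(msg != 0);
    __atomic_store_n(&mailbox, msg, __ATOMIC_RELEASE);
}

/* Take the pending message (or 0 if none) and leave the mailbox empty. */
int mailbox_take(void)
{
    return __atomic_exchange_n(&mailbox, 0, __ATOMIC_ACQ_REL);
}
```

Because the read and the reset happen as one operation, two threads calling mailbox_take concurrently cannot both receive the same message.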

void __atomic_exchange (type *ptr, type *val, type *ret, int memorder)

This is the generic version of an atomic exchange. It stores the contents of *val into *ptr. The original value of *ptr is copied into *ret.

bool __atomic_compare_exchange_n (type *ptr, type *expected, type desired, bool weak, int success_memorder, int failure_memorder)

This built-in function implements an atomic compare and exchange operation. It compares the contents of *ptr with the contents of *expected. If they are equal, the operation is a read-modify-write operation that writes desired into *ptr. If they are not equal, the operation is a read, and the current contents of *ptr are written into *expected. weak is true for the weak compare_exchange, which may fail spuriously, and false for the strong variation, which never fails spuriously. Many targets only offer the strong variation and ignore the parameter. When in doubt, use the strong variation.

If desired is written into *ptr, true is returned and memory is affected according to the memory order specified by success_memorder. There are no restrictions on the memory order that can be used here.

Otherwise, false is returned and memory is affected according to failure_memorder. This memory order cannot be __ATOMIC_RELEASE or __ATOMIC_ACQ_REL. It also cannot be a stronger order than that specified by success_memorder.
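
A typical way to use compare and exchange is a retry loop: read the current value, compute the desired value, and try to install it, repeating if another thread changed the location in the meantime. The following sketch of an "atomic maximum" is an illustration under that pattern; atomic_fetch_max is a hypothetical helper, not a GCC built-in.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Atomically raise *target to at least candidate.
   Returns the value *target held before this call took effect. */
int atomic_fetch_max(int *target, int candidate)
{
    assert(target != NULL);
    int observed = __atomic_load_n(target, __ATOMIC_RELAXED);

    while (observed < candidate &&
           !__atomic_compare_exchange_n(target, &observed, candidate,
                                        true,  /* weak: a spurious failure just retries */
                                        __ATOMIC_ACQ_REL, __ATOMIC_RELAXED)) {
        /* On failure, observed now holds the current *target;
           the loop condition re-tests it against candidate. */
    }
    return observed;
}
```

The weak variation is acceptable here because the loop retries anyway. The failure order is __ATOMIC_RELAXED, which is allowed because it is not stronger than the success order.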

bool __atomic_compare_exchange (type *ptr, type *expected, type *desired, bool weak, int success_memorder, int failure_memorder)

This built-in function implements the generic version of atomic compare and exchange. It is virtually identical to __atomic_compare_exchange_n, except that the desired value is also passed as a pointer.
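
The generic form makes it possible to compare and replace a small structure as a unit. The sketch below updates a value together with a version counter; struct versioned and versioned_set are hypothetical names, and the example assumes a 4-byte int and unsigned so that the record is 8 bytes with no padding (padding bytes would take part in the comparison).

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical 8-byte record: a value plus a version that changes on every update. */
struct versioned {
    _Alignas(8) unsigned version;
    int value;
};
static_assert(sizeof(struct versioned) == 8, "expected an 8-byte record");

/* Install new_value and bump the version in one atomic step.
   Returns the version number that was installed. */
unsigned versioned_set(struct versioned *slot, int new_value)
{
    struct versioned expected, desired;

    __atomic_load(slot, &expected, __ATOMIC_RELAXED);
    do {
        desired.version = expected.version + 1;
        desired.value = new_value;
        /* On failure, expected is refreshed with the current *slot. */
    } while (!__atomic_compare_exchange(slot, &expected, &desired,
                                        false,  /* strong variation */
                                        __ATOMIC_ACQ_REL, __ATOMIC_RELAXED));
    return desired.version;
}
```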

type __atomic_add_fetch (type *ptr, type val, int memorder) 
type __atomic_sub_fetch (type *ptr, type val, int memorder) 
type __atomic_and_fetch (type *ptr, type val, int memorder)
type __atomic_xor_fetch (type *ptr, type val, int memorder)
type __atomic_or_fetch (type *ptr, type val, int memorder)
type __atomic_nand_fetch (type *ptr, type val, int memorder)

These built-in functions perform the operation suggested by the name and return the result of the operation. Operations on pointer arguments are performed as if the operands were of type uintptr_t; that is, they are not scaled by the size of the type to which the pointer points.

{ *ptr op= val; return *ptr; }
{ *ptr = ~(*ptr & val); return *ptr; } // nand

The object pointed to by the first argument must be of integer or pointer type. It must not be of Boolean type. All memory orders are valid.
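
Here is a short sketch of two uses of the op_fetch family. The reference-counting helpers rely on the returned new value, and cursor_advance illustrates that pointer operands are not scaled, so the element size has to be supplied by hand. All names (struct object, object_ref, object_unref, cursor_advance) are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

struct object { int refcount; };   /* hypothetical reference-counted object */

/* Take an additional reference; the caller must already hold one. */
void object_ref(struct object *o)
{
    int now = __atomic_add_fetch(&o->refcount, 1, __ATOMIC_RELAXED);
    assert(now > 1);
    (void)now;
}

/* Drop a reference; returns true when the last reference was dropped. */
bool object_unref(struct object *o)
{
    return __atomic_sub_fetch(&o->refcount, 1, __ATOMIC_ACQ_REL) == 0;
}

/* Advance an int cursor by one element and return the new position.
   The addend is in bytes because pointer operands are treated like uintptr_t. */
int *cursor_advance(int **cursor)
{
    return __atomic_add_fetch(cursor, sizeof(int), __ATOMIC_RELAXED);
}
```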

type __atomic_fetch_add (type *ptr, type val, int memorder) 
type __atomic_fetch_sub (type *ptr, type val, int memorder)
type __atomic_fetch_and (type *ptr, type val, int memorder)
type __atomic_fetch_xor (type *ptr, type val, int memorder)
type __atomic_fetch_or (type *ptr, type val, int memorder)
type __atomic_fetch_nand (type *ptr, type val, int memorder)

These built-in functions perform the operation suggested by the name and return the value that was previously in *ptr. Operations on pointer arguments are performed as if the operands were of type uintptr_t; that is, they are not scaled by the size of the type to which the pointer points.

{ tmp = *ptr; *ptr op= val; return tmp; }
{ tmp = *ptr; *ptr = ~(*ptr & val); return tmp; } // nand

The same constraints on the arguments apply as for the corresponding __atomic_op_fetch built-in functions. All memory orders are valid.
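
Returning the previous value is useful when the caller needs to know what the location held before the update. In this sketch, flag_claim uses the old value to tell whether the current call was the one that set a bit, and nand_old_value simply exposes the result of __atomic_fetch_nand. Both function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Set bit in *flags; returns true only if this call changed it from 0 to 1. */
bool flag_claim(unsigned *flags, unsigned bit)
{
    assert(bit != 0);
    unsigned prev = __atomic_fetch_or(flags, bit, __ATOMIC_ACQ_REL);
    return (prev & bit) == 0;
}

/* Replace *word with ~(*word & mask) and return the old *word. */
unsigned nand_old_value(unsigned *word, unsigned mask)
{
    return __atomic_fetch_nand(word, mask, __ATOMIC_RELAXED);
}
```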

bool __atomic_test_and_set (void *ptr, int memorder)

This built-in function performs an atomic test-and-set operation on the byte at *ptr. The byte is set to an implementation-defined nonzero "set" value, and the return value is true if and only if the previous contents were "set". It should only be used for operands of type bool or char. For other types, only part of the value may be set. All memory orders are valid.

void __atomic_clear (bool *ptr, int memorder)

This built-in function performs an atomic clear operation on *ptr. After the operation, *ptr contains 0. It should only be used for operands of type bool or char, in conjunction with __atomic_test_and_set. For other types, the value may be only partially cleared. If the type is not bool, prefer __atomic_store. The valid memory order variants are __ATOMIC_RELAXED, __ATOMIC_SEQ_CST, and __ATOMIC_RELEASE.
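
__atomic_test_and_set and __atomic_clear together are enough to build a very simple spinlock. The following is a minimal sketch, not a production lock (it has no back-off and no fairness); lock_flag, spin_trylock, spin_lock, and spin_unlock are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>

static bool lock_flag;   /* false: unlocked, "set": locked */

/* Try once; returns true if the lock was acquired. */
bool spin_trylock(void)
{
    return !__atomic_test_and_set(&lock_flag, __ATOMIC_ACQUIRE);
}

/* Busy-wait until the lock is acquired. */
void spin_lock(void)
{
    while (__atomic_test_and_set(&lock_flag, __ATOMIC_ACQUIRE))
        ;   /* previous contents were "set": someone else holds the lock */
}

/* Release the lock; it must currently be held. */
void spin_unlock(void)
{
    assert(__atomic_load_n(&lock_flag, __ATOMIC_RELAXED));
    __atomic_clear(&lock_flag, __ATOMIC_RELEASE);
}
```

Acquiring with __ATOMIC_ACQUIRE and releasing with __ATOMIC_RELEASE keeps the accesses made inside the critical section from moving outside of it.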

void __atomic_thread_fence (int memorder)

This built-in function acts as a synchronization fence between threads, based on the specified memory order. All memory orders are valid.
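
A fence can supply the ordering when the atomic operations themselves are relaxed. The sketch below is one way to write the publish/consume pattern with explicit fences; message, message_ready, publish, and try_consume are hypothetical names. It is meant to be called from two different threads, although each function can also be exercised from a single thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

static int message;         /* ordinary data */
static int message_ready;   /* flag, accessed with relaxed atomics */

void publish(int value)
{
    message = value;
    /* Order the plain store above before the relaxed flag store below. */
    __atomic_thread_fence(__ATOMIC_RELEASE);
    __atomic_store_n(&message_ready, 1, __ATOMIC_RELAXED);
}

/* Returns true and fills *out if a message has been published. */
bool try_consume(int *out)
{
    assert(out != NULL);
    if (!__atomic_load_n(&message_ready, __ATOMIC_RELAXED))
        return false;
    /* Order the relaxed flag load above before the plain load below. */
    __atomic_thread_fence(__ATOMIC_ACQUIRE);
    *out = message;
    return true;
}
```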

void __atomic_signal_fence (int memorder)

This built-in function acts as a synchronization fence between a thread and signal handlers based in the same thread. All memory orders are valid.
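
The signal fence only has to restrain the compiler, because a signal handler runs on the same thread as the code it interrupts. The following sketch shows the intended pairing; the names sig_payload, sig_ready, sig_seen, on_signal, and run_signal_demo are hypothetical, and the signal is delivered synchronously with raise() to keep the example deterministic.

```c
#include <assert.h>
#include <signal.h>

static int sig_payload;                   /* ordinary data */
static int sig_ready;                     /* flag, accessed with relaxed atomics */
static volatile sig_atomic_t sig_seen;    /* what the handler observed */

static void on_signal(int signo)
{
    (void)signo;
    if (__atomic_load_n(&sig_ready, __ATOMIC_RELAXED)) {
        /* Keep the compiler from moving the payload read above the flag read. */
        __atomic_signal_fence(__ATOMIC_ACQUIRE);
        sig_seen = sig_payload;
    }
}

/* Installs the handler, publishes a payload, raises the signal,
   and returns the value the handler saw. */
int run_signal_demo(void)
{
    void (*prev)(int) = signal(SIGINT, on_signal);
    assert(prev != SIG_ERR);

    sig_payload = 7;
    /* Keep the compiler from moving the payload write below the flag write. */
    __atomic_signal_fence(__ATOMIC_RELEASE);
    __atomic_store_n(&sig_ready, 1, __ATOMIC_RELAXED);

    raise(SIGINT);            /* the handler runs before raise() returns */
    signal(SIGINT, prev);     /* restore the previous disposition */
    return sig_seen;
}
```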

bool __atomic_always_lock_free (size_t size, void *ptr)

This built-in function returns true if objects of size bytes always generate lock-free atomic instructions for the target architecture. size must resolve to a compile-time constant, and the result also resolves to a compile-time constant. ptr is an optional pointer to the object that may be used to determine alignment. A value of 0 indicates that typical alignment should be used. The compiler may also ignore this parameter.

if (__atomic_always_lock_free (sizeof (long long), 0))

bool __atomic_is_lock_free (size_t size, void *ptr)

This built-in function returns true if objects of size bytes always generate lock-free atomic instructions for the target architecture. If the built-in function is not known to be lock-free, a call is made to a runtime routine named __atomic_is_lock_free. ptr is an optional pointer to the object that may be used to determine alignment. A value of 0 indicates that typical alignment should be used. The compiler may also ignore this parameter.
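
The two queries can be combined to choose an implementation strategy. The sketch below is only an illustration; llong_atomic_strategy is a hypothetical helper. On targets where the answer is not known at compile time, the second query becomes a call to the runtime routine, which may require linking against the atomic support library (an assumption about the build setup, for example -latomic with GCC).

```c
#include <assert.h>
#include <stdbool.h>

/* Describe how atomic operations on long long would be carried out. */
const char *llong_atomic_strategy(void)
{
    bool always = __atomic_always_lock_free(sizeof(long long), 0);
    bool now = __atomic_is_lock_free(sizeof(long long), 0);

    /* If the compile-time answer is yes, the run-time answer must agree. */
    assert(!always || now);

    if (always)
        return "lock-free, known at compile time";
    if (now)
        return "lock-free, determined at run time";
    return "not lock-free";
}
```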

The following examples illustrate how some of these built-in functions are used.

Use of __atomic_load_n and __atomic_store_n

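#include <pthread.h>
#include <stdio.h>
#include <unistd.h>

/* Relaxed order: the access itself is atomic, but nothing else is ordered. */
#define ATOMIC_GET(x) __atomic_load_n(&(x), __ATOMIC_RELAXED)
#define ATOMIC_SET(x, y) __atomic_store_n(&(x), (y), __ATOMIC_RELAXED)

/* Reader: repeatedly loads the shared counter and prints it. Never returns. */
void *thread1(void *a)
{
	int j;
	while(1)
	{
		j=ATOMIC_GET(*(int*)a);
		printf("thread1 %d\n",j);
	}
}

/* Writer: repeatedly increments a local counter and stores it. Never returns. */
void *thread2(void *a)
{
	int k=0;
	while(1)
	{
		k++;
		ATOMIC_SET(*(int*)a,k);
		printf("thread2 %d\n",k);
	}
}

int main(void)
{
	int i=0;
	pthread_t id1,id2;	/* one handle per thread */
	pthread_attr_t attr;
	pthread_attr_init(&attr);
	if(pthread_create(&id1,&attr,thread1,(void*)&i))
	{
		printf("wrong\n");
	}
	if(pthread_create(&id2,&attr,thread2,(void*)&i))
	{
		printf("wrong\n");
	}
	/* Keep the process (and the shared variable i) alive. */
	while(1)
	{
		sleep(20);
		printf("main thread\n");
	}
}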

Use of __atomic_add_fetch and __atomic_fetch_add

#include <pthread.h>
#include <stdio.h>
#include <unistd.h>

/* add_fetch returns the new value; fetch_add returns the old value. */
#define ATOMIC_PRE_INC(x) __atomic_add_fetch(&(x), 1, __ATOMIC_RELAXED)
#define ATOMIC_POST_INC(x) __atomic_fetch_add(&(x), 1, __ATOMIC_RELAXED)

/* Increments the counter both ways and prints the current value
   next to the value each built-in returned. Never returns. */
void *thread1(void *a)
{
	int j,k;
	while(1)
	{
		j=ATOMIC_PRE_INC(*(int*)a);
		printf("PRE_INC %d,%d\n",*(int*)a,j);
		k=ATOMIC_POST_INC(*(int*)a);
		printf("POST_INC %d,%d\n",*(int*)a,k);
		sleep(5);
	}
}

int main(void)
{
	int i=0;
	pthread_t id;
	pthread_attr_t attr;
	pthread_attr_init(&attr);
	if(pthread_create(&id,&attr,thread1,(void*)&i))
	{
		printf("wrong\n");
	}
	/* Keep the process (and the shared variable i) alive. */
	while(1)
	{
		sleep(20);
		printf("main thread\n");
	}
}

The output shows that __atomic_add_fetch returns the new value (*ptr + val), whereas __atomic_fetch_add returns the value that *ptr held before the addition.

Use of __atomic_exchange_n

#include <pthread.h>
#include <stdio.h>
#include <unistd.h>

#define ATOMIC_CLEAR(x) __atomic_store_n(&(x), 0, __ATOMIC_RELAXED)
#define ATOMIC_XCHG(x, y) __atomic_exchange_n(&(x), (y), __ATOMIC_RELAXED)

/* Clears element 0, exchanges 20 into element 1, and prints all three
   elements; element 2 is left untouched. Never returns. */
void *thread1(void *a)
{
	while(1)
	{
		ATOMIC_CLEAR(*((int*)a));
		ATOMIC_XCHG(*((int*)a+1),20);
		printf("%d %d,%d\n",*((int*)a),*((int*)a+1),*((int*)a+2));
		sleep(5);
	}
}

int main(void)
{
	int i[3]={5,6,7};
	printf("%d %d,%d\n",*(i+0),*(i+1),*(i+2));	/* values before the thread starts */
	pthread_t id;
	pthread_attr_t attr;
	pthread_attr_init(&attr);
	if(pthread_create(&id,&attr,thread1,(void*)i))
	{
		printf("wrong\n");
	}
	/* Keep the process (and the array i) alive. */
	while(1)
	{
		sleep(20);
		printf("main thread\n");
	}
}

 

 


Origin blog.csdn.net/zhang14916/article/details/102840354