Notes on thread-safe object lifetime management

Writing a thread-safe class is not difficult: it is enough to protect the internal state with synchronization primitives. The birth and death of the object, however, cannot be protected by a mutex that the object itself owns. Avoiding the race conditions that can arise while an object is being destroyed is a basic problem in C++ multithreaded programming, and it can be solved cleanly with shared_ptr and weak_ptr from the Boost library (both are also part of the standard library since C++11). This is also a prerequisite for implementing a thread-safe Observer pattern.

When the destructor encounters multiple threads

Unlike most other object-oriented languages, C++ requires programmers to manage object lifetimes themselves, which is especially hard in a multithreaded environment. When an object is visible to several threads at once, the moment of its destruction becomes ambiguous, and race conditions may occur.

When an object is about to be destroyed, how do you know if another thread is executing the member function of this object?

How do you ensure that an object is not destroyed by another thread while one of its member functions is executing?

Before calling a member function of an object, how do you know whether the object is still alive, or whether its destructor happens to be halfway through executing?

Resolving these races is a basic problem in C++ multithreaded programming. This text tries to use shared_ptr to solve them once and for all, and to lift this mental burden from C++ multithreaded programming.

1.1.1 Definition of thread safety

A class is thread-safe if it meets three conditions:

It behaves correctly when accessed by multiple threads at the same time.

This holds no matter how the operating system schedules these threads and no matter how their execution is interleaved.

The calling code needs no additional synchronization or other coordination.

By this definition most C++ classes are not thread-safe, including std::string, std::vector, and std::map: they usually have to be locked externally before multiple threads can access them concurrently.

MutexLock and MutexLockGuard

To make the rest of the discussion easier, let us first agree on two utility classes. I believe everyone who has written multithreaded C++ programs has implemented or used classes with similar functionality.

MutexLock encapsulates a critical section. It is a simple resource class that wraps the creation and destruction of a mutex. On Windows the underlying type is struct CRITICAL_SECTION, which is reentrant; on Linux it is pthread_mutex_t, which is not reentrant by default. MutexLock is generally a data member of another class.

MutexLockGuard encapsulates entering and leaving the critical section, that is, locking and unlocking. It is generally an object on the stack, and its scope is exactly the critical section.
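As a rough sketch of the two classes, assuming C++11's std::mutex as the underlying primitive in place of a raw pthread_mutex_t or CRITICAL_SECTION. The member names lock and unlock are illustrative choices, not taken from the original:

```cpp
#include <mutex>

// Sketch of MutexLock: owns one mutex for its whole lifetime.
// Assumption: std::mutex stands in for the platform primitive.
class MutexLock {
public:
    MutexLock() = default;
    MutexLock(const MutexLock&) = delete;             // a mutex must not be copied
    MutexLock& operator=(const MutexLock&) = delete;

    void lock()   { mutex_.lock(); }    // intended to be called by MutexLockGuard only
    void unlock() { mutex_.unlock(); }

private:
    std::mutex mutex_;
};

// Sketch of MutexLockGuard: enters the critical section on construction
// and leaves it on destruction (RAII), so every return path unlocks.
class MutexLockGuard {
public:
    explicit MutexLockGuard(MutexLock& mutex) : mutex_(mutex) { mutex_.lock(); }
    ~MutexLockGuard() { mutex_.unlock(); }
    MutexLockGuard(const MutexLockGuard&) = delete;
    MutexLockGuard& operator=(const MutexLockGuard&) = delete;

private:
    MutexLock& mutex_;
};
```

A guard is normally a named stack object at the top of a member function, for example `MutexLockGuard lock(mutex_);`. An unnamed temporary would unlock again immediately.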

A thread-safe counter example

Writing a single thread-safe class is not too hard: synchronization primitives only need to protect its internal state. A simple counter class, Counter, is a typical example.
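A minimal sketch of such a counter, assuming std::mutex and std::lock_guard in place of MutexLock and MutexLockGuard. The member function name getAndIncrease is chosen here for illustration:

```cpp
#include <cstdint>
#include <mutex>

// A thread-safe counter: every access to value_ happens under mutex_.
class Counter {
public:
    Counter() : value_(0) {}

    int64_t value() const {
        std::lock_guard<std::mutex> lock(mutex_);
        return value_;
    }

    // Returns the old value, then increments it (like value_++).
    int64_t getAndIncrease() {
        std::lock_guard<std::mutex> lock(mutex_);
        int64_t ret = value_++;
        return ret;
    }

private:
    mutable std::mutex mutex_;  // mutable so that const member functions can lock it
    int64_t value_;
};
```

Each Counter carries its own mutex, so different Counter objects never contend with each other. The class is safe to use from several threads only while the object itself is alive, which is the problem the rest of this text deals with.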

1.2 The creation of objects is simple

Constructing an object in a thread-safe way has one requirement: do not leak the this pointer during construction. That is:

Do not register any callback in the constructor.

Do not pass this to a cross-thread object in the constructor.

Not even on the last line of the constructor.

The reason is that the object is not fully initialized while its constructor is running. If this is leaked to other objects, another thread may access the half-constructed object, with unpredictable consequences.

Don't do this

#include <iostream>
class Observable
{
public:
    void register_(Observable* x);
    virtual ~Observable();
    // A pure virtual function is declared by "initializing" a virtual function to 0; the general form is: virtual return-type name(parameters) = 0;
    virtual void update()=0;
};

class Foo : public Observable{
public:
    Foo(Observable* s)
    {
        s->register_(this); // wrong: not thread-safe, leaks this during construction
    }
    virtual void update();

};
int main() {
    std::cout << "Hello, World!" << std::endl;
    return 0;
}

This suggests that two-stage construction, that is, a constructor followed by a separate initialize() call, is sometimes a good approach. It goes against the usual C++ dogma, but in multithreaded code there is sometimes no alternative. In addition, once two-stage construction is accepted, the constructor no longer needs to throw exceptions: the caller checks the return value of initialize() to learn whether the object was set up successfully, which can simplify error handling.
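Two-stage construction can be sketched as follows. This is only an illustration under some assumptions: a separate Observer interface is introduced here (the listing above uses Observable for both roles), and the names observe, notifyObservers, and updates are hypothetical.

```cpp
#include <vector>

class Observer {
public:
    virtual ~Observer() = default;
    virtual void update() = 0;
};

class Observable {
public:
    void register_(Observer* x) { observers_.push_back(x); }
    void notifyObservers() {
        for (Observer* x : observers_) x->update();
    }
private:
    std::vector<Observer*> observers_;
};

class Foo : public Observer {
public:
    Foo() : updates_(0) {}                 // stage 1: this does not escape here
    // Stage 2: called by the owner after the constructor has returned,
    // so the object is fully constructed when it is registered.
    void observe(Observable* s) { s->register_(this); }
    void update() override { ++updates_; }
    int updates() const { return updates_; }
private:
    int updates_;
};
```

The caller writes `Foo foo; foo.observe(&subject);`. Note that this sketch addresses construction only: Observable is not synchronized here, and storing raw Observer pointers still leaves the destruction problem open.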

Do not leak this even on the last line, because Foo may be a base class, and a base class is constructed before its derived classes. After the last line of Foo::Foo() has run, execution continues in the derived class's constructor; at that point the most-derived object is still under construction and still unsafe to use.

Relatively speaking, thread-safe construction is easy to achieve: exposure is small, because no other thread has seen the object yet. Thread-safe destruction is not so simple, and it is the focus of this chapter.

1.3 Destruction is too difficult

Destroying an object poses no problem in a single thread; at most, dangling pointers and wild pointers have to be avoided.

What is a dangling pointer? A dangling pointer points to an object that has already been destroyed, or to an address that has already been reclaimed. Case 1:

{
    char *dp = NULL;
    {
        char c;
        dp = &c;
    }   // c goes out of scope here, so dp is now a dangling pointer
}

Case 2:

#include <stdlib.h>

void func()
{
    char *dp = (char *)malloc(A_CONST);
    free(dp);         // dp is now a dangling pointer
    dp = NULL;        // dp is no longer dangling
    /* ... */
}

Case 3:

int * func ( void )
{
    int num = 1234;
    /* ... */
    return &num;
}

num lives on the stack. When func returns, its storage is reclaimed, and the memory the returned pointer refers to may be overwritten at any time. I really recommend the function-stack chapter of the Drip reverse-engineering tutorial: it walks through the whole process at the level of assembly and registers.

Wild pointer:

A pointer that has not been initialized is called a wild pointer.

void func()
{
    char *dp;         // wild pointer: never initialized
    static char *sdp; // not a wild pointer: static variables are zero-initialized
}

Multithreaded programs have far too many race conditions. For ordinary member functions, thread safety is achieved by making them run one after another rather than concurrently, so that their critical sections never overlap. That much is obvious, but it rests on an implicit condition that not everyone thinks of: the mutex that protects the critical section must itself be valid. The destructor breaks this assumption, because it destroys the mutex member variable.

1.3.1 Mutex is not the way

A mutex can only guarantee that functions execute one after another. Consider the following code, which tries to protect the destructor with a mutex:

Foo::~Foo()
{
    MutexLockGuard lock(mutex_);
    // free internal resources (1)
}

void Foo::update()
{
    MutexLockGuard lock(mutex_); // (2)
    // make use of internal resources
}

Suppose two threads, A and B, can both see the Foo object x. Thread A is about to destroy x, and thread B is about to call x->update().

Thread A:

delete x;
x = NULL;

Thread B:

if(x)
{
    x->update();
}

Although thread A sets the pointer to NULL after destroying the object, and although thread B checks x before calling a member function on it, the race condition remains:

1. Thread A reaches (1) in the destructor; it already holds the mutex and is about to continue.
2. Thread B passes the if (x) check and blocks at (2).

What happens next, only God knows. Because the destructor destroys mutex_, thread B may block at (2) forever, or enter the critical section and then core dump, or something worse may happen.

This example shows that setting a pointer to NULL after deleting the object is useless. A program that relies on this to prevent a double free has a logic problem in its design.

1.3.2 A mutex as a data member cannot protect destruction

As the previous example showed, a MutexLock that is a data member of a class can only synchronize reads and writes of the other data members of that class; it cannot make destruction safe. The lifetime of a MutexLock member is at most as long as that of the object, while destruction can be said to happen at the moment the object dies, or after it. In addition, for a base-class object, by the time the base-class destructor runs the derived part of the object has already been destroyed, so a MutexLock owned by the base class cannot protect the whole destruction process. Besides, destruction should not need protection in the first place: destroying an object is only safe when no other thread can still access it, otherwise a race condition is inevitable.

In addition, a function that reads and writes two objects of the same class at the same time can deadlock. Take a swap function as an example:

void swap(Counter& a,Counter &b)
{
    MutexLockGuard aLock(a.mutex_); // potential dead lock
    MutexLockGuard bLock(b.mutex_);
    int64_t value = a.value_;
    a.value_ = b.value_;
    b.value_ = value;
}

If thread A executes swap(a, b) while thread B executes swap(b, a), a deadlock may occur. operator=() has the same problem.

Counter& Counter::operator=(const Counter& rhs) 
{
    if(this == &rhs)
    {
        return *this;
    }

    MutexLockGuard myLock(mutex_);
    // potential dead lock
    MutexLockGuard itsLock(rhs.mutex_);
    value_ = rhs.value_; // changing this to value_ = rhs.value() would deadlock
    return *this;
}

If a function needs to lock several objects of the same type, it must always lock them in the same order. One way to guarantee this is to compare the addresses of the mutex objects and always lock the mutex with the lower address first.
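The address-ordering rule might look like this for swap. It is a sketch that assumes std::mutex in place of MutexLock and declares swap as a friend so it can reach the private members; std::less is used because it gives a well-defined order for unrelated pointers.

```cpp
#include <cstdint>
#include <functional>
#include <mutex>
#include <utility>

class Counter {
public:
    explicit Counter(int64_t v = 0) : value_(v) {}
    int64_t value() const {
        std::lock_guard<std::mutex> lock(mutex_);
        return value_;
    }
    friend void swap(Counter& a, Counter& b);
private:
    mutable std::mutex mutex_;
    int64_t value_;
};

void swap(Counter& a, Counter& b) {
    if (&a == &b) return;  // locking the same non-recursive mutex twice would deadlock

    // Always lock the mutex with the lower address first, so that
    // swap(a, b) and swap(b, a) acquire the two locks in the same order.
    std::mutex* first = &a.mutex_;
    std::mutex* second = &b.mutex_;
    if (std::less<std::mutex*>()(second, first)) {
        std::swap(first, second);
    }
    std::lock_guard<std::mutex> lock1(*first);
    std::lock_guard<std::mutex> lock2(*second);

    int64_t tmp = a.value_;
    a.value_ = b.value_;
    b.value_ = tmp;
}
```

With std::mutex specifically, std::lock (C++11) or std::scoped_lock (C++17) can acquire both mutexes without deadlock as an alternative to manual ordering.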

1.4 How difficult is a thread-safe Observer

Whether a dynamically created object is still alive cannot be determined from a pointer to it (nor from a reference). A pointer just points to a piece of memory; if the object in that memory has been destroyed, the pointer shows no sign of it. Here is a very simple example:

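#include <cstdio>
#include <cstdlib>
int main() {
    void* ptr = malloc(100);
    free(ptr);
    if(ptr)   // free() does not reset ptr, so this check typically still passes
    {
        printf("111\n");
    }
    return 0;
}

On typical implementations this prints 111 even though the memory has been freed: the pointer value alone cannot tell us whether the object is still alive.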

A very simple workaround is to only create objects and never destroy them. The program keeps used objects in an object pool. The next time an object is requested, the pool reuses an existing one if it has any in stock, and creates a new one otherwise. When the caller is done with an object, it is not released but returned to the pool. This approach has many drawbacks, but it at least avoids access to invalidated objects.

This solution has the following problems:

Thread safety of the object pool itself: how can an object be returned to the pool safely and completely, without a race in which it is only partially returned?

Lock contention on globally shared data: will this centralized pool serialize operations that would otherwise run concurrently?

If there is more than one type of shared object, should the pool be duplicated for each type or written as a class template?

Will it cause memory leaks or fragmentation?

We can also use the proxy pattern: attach a counter to the object and have every user acquire and release the object through a proxy.

Finally, we can use the C++11 smart pointers. They are convenient and efficient, because they make sure the object is released once no one uses it any more, which makes management much easier.

Below I quote some material on smart pointers in C++11.

In C++, dynamic memory is managed through a pair of operators. new allocates space for an object in dynamic memory, optionally initializes it, and returns a pointer to it. delete takes a pointer to a dynamic object, destroys the object, and releases the associated memory.

Dynamic memory is error-prone because it is extremely hard to release memory at exactly the right time. Sometimes we forget to release it, which leaks memory; sometimes we release it while pointers to it are still in use, which leaves a pointer to invalid memory.

To make dynamic memory easier and safer to use, the new standard library provides two smart pointer types that manage dynamic objects.

The two types differ in how they manage the underlying pointer: shared_ptr allows multiple pointers to refer to the same object, while unique_ptr owns its object exclusively. The standard library also defines a companion class, weak_ptr, a weak reference to an object managed by a shared_ptr. All three are defined in the memory header.

12.1.1 shared_ptr type

Like vector, smart pointers are templates. When we create a smart pointer we must therefore supply additional information: the type the pointer can point to. As with vector, we give that type inside angle brackets, followed by the name of the smart pointer being defined.

A default-initialized smart pointer holds a null pointer:

#include <stdio.h>
#include <memory>
#include <iostream>
using namespace std;
class A{

};

int main()
{
    shared_ptr<A> d;
    if(d)
    {
        cout<<"not null"<<endl;
    }else{
        cout<<"null"<<endl;
    }
    return 0;
}

Operations supported by both shared_ptr and unique_ptr.

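shared_ptr<T> sp   Null smart pointer that can point to an object of type T
unique_ptr<T> up

p                  Use p as a condition: true if p points to an object
*p                 Dereference p to get the object it points to

p->mem             Equivalent to (*p).mem

swap(p, q)         Swap the pointers held by p and q
p.swap(q)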
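A short sketch that exercises these common operations. The function name commonOperationsDemo and the string values are made up for illustration:

```cpp
#include <memory>
#include <string>
#include <utility>

// Exercises the operations shared by shared_ptr and unique_ptr and
// returns the combined length of the two strings left in p and u.
std::size_t commonOperationsDemo() {
    std::shared_ptr<std::string> p;                 // empty: tests as false
    if (!p) {
        p = std::make_shared<std::string>("hi");
    }
    std::unique_ptr<std::string> u(new std::string("hello"));

    std::shared_ptr<std::string> q = std::make_shared<std::string>("swap me");
    using std::swap;
    swap(p, q);                                     // p -> "swap me", q -> "hi"
    p.swap(q);                                      // swapped back: p -> "hi"

    return (*p).size() + u->size();                 // (*p).size() is the same as p->size()
}
```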

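Operations specific to shared_ptr

make_shared<T>(args)  Returns a shared_ptr that points to a dynamically allocated object of type T, initialized from args

shared_ptr<T> p(q)    p is a copy of the shared_ptr q; this increments the counter in q

p = q                 p and q are both shared_ptrs whose stored pointers must be convertible to one another. Decrements p's reference count and increments q's; if p's count drops to 0, the memory it managed is freed

p.unique()            Returns true if p.use_count() is 1, false otherwise (deprecated in C++17, removed in C++20)
p.use_count()         Returns the number of smart pointers sharing the object with p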

make_shared function

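The safest way to allocate and use dynamic memory is to call the standard library function make_shared. It allocates an object in dynamic memory, initializes it, and returns a shared_ptr that points to it. Like the smart pointers themselves, make_shared is defined in the memory header.

When calling make_shared we must specify the type of object to create. The result can be used to initialize or assign to a shared_ptr: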
#include <stdio.h>
#include <memory>
#include <iostream>
#include <cstring>

using namespace std;
class A{

};

int main()
{
    shared_ptr<int> data = make_shared<int>(42);
    cout<<*data<<endl;
    return 0;
}

Copy and assignment of shared_ptr:

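Every shared_ptr has an associated counter, usually called the reference count, which is updated whenever the shared_ptr is copied or assigned.

Whenever we copy a shared_ptr, the count is incremented: for example, when we use it to initialize another shared_ptr, pass it to a function by value, or return it from a function by value. The count is decremented when we assign a new value to the shared_ptr or when the shared_ptr is destroyed, for example when a local shared_ptr goes out of scope.

Once the reference count drops to 0, the shared_ptr automatically frees the object it manages.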

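Consider this code:

#include <cstdio>
#include <memory>
using namespace std;
int main()
{
    shared_ptr<int> q;             // empty
    auto r = make_shared<int>(42);
    r = q;                         // the int 42 is freed; r is now empty as well
    printf("%d\n",*r);             // undefined behavior: dereferences a null pointer
}

When we ran it, the program core dumped. The assignment r = q decrements the reference count of the int that r pointed to; the count reaches 0 and the int is freed. r then holds the same null pointer as q, so *r dereferences a null pointer.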
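To watch the reference count change without crashing, use_count() can be sampled at each step. The struct Counts and the function referenceCountDemo are hypothetical helpers written for this sketch:

```cpp
#include <memory>

// Use counts observed at each step of the demo.
struct Counts { long afterMake, afterCopy, afterScope, afterReassign; };

Counts referenceCountDemo() {
    Counts c{};
    auto r = std::make_shared<int>(42);
    c.afterMake = r.use_count();          // 1: r is the only owner
    {
        std::shared_ptr<int> copy = r;    // copying increments the count
        c.afterCopy = r.use_count();      // 2
    }                                     // copy leaves scope: the count is decremented
    c.afterScope = r.use_count();         // 1
    std::shared_ptr<int> q;               // empty
    r = q;                                // the int 42 loses its last owner and is freed
    c.afterReassign = r.use_count();      // 0: r is empty, so *r must not be evaluated
    return c;
}
```

Testing `if (r)` before dereferencing is the simple guard against the crash described above.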

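shared_ptr automatically frees the associated memory. For example, a factory function can return a shared_ptr, and the Foo is freed when the last shared_ptr to it is destroyed: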

shared_ptr<Foo> factory(int arg)
{
    return make_shared<Foo>(arg);
}

Programs use dynamic memory for one of three reasons:

1. The program does not know how many objects it will need
2. The program does not know the exact type of the objects it needs
3. The program needs to share data among multiple objects

12.1.3 Combination of new and shared_ptr

int main()
{
    auto data = shared_ptr<int>(new int(32));
    printf("%d\n",*data);
}

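reset() gives up ownership: the reference count of the old object is decremented (and the object is freed if the count reaches 0), and the shared_ptr becomes null. It must not be dereferenced afterwards:

int main()
{
    auto data = shared_ptr<int>(new int(32));
    data.reset();               // the int is freed; data is now null
    if(data)                    // check before dereferencing
    {
        printf("%d\n",*data);
    }
}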

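Define your own deleter

#include <cstdio>
#include <memory>
using namespace std;

class Foo{

};
// Custom deleter: called instead of delete when the last owner goes away
void end_data(Foo* a)
{
    printf("1111\n");
    delete a;
}

int main()
{
    // create a shared_ptr with a custom deleter
    shared_ptr<Foo> p1(new Foo,end_data);

    return 0;
}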

Passing a deleter to unique_ptr

#include <cstdio>
#include <memory>
using namespace std;

class Foo{
public:
    Foo(int a)
    {

    }
};
void end_data(Foo* a)
{
    printf("1111\n");
    delete a;
}

int main()
{
    // the deleter type is part of the unique_ptr type
    unique_ptr<Foo,void(*)(Foo*)> p1(new Foo(3),end_data);

    return 0;
}

Note that a unique_ptr cannot be copied or copy-assigned; ownership can only be transferred with std::move.
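A small sketch of transferring ownership between unique_ptrs. The helper names makeValue and moveDemo are invented for this example:

```cpp
#include <memory>
#include <utility>

std::unique_ptr<int> makeValue(int v) {
    return std::unique_ptr<int>(new int(v));   // returning by value moves, no copy needed
}

// Returns true if ownership moved from a to b as expected.
bool moveDemo() {
    std::unique_ptr<int> a = makeValue(7);
    // std::unique_ptr<int> b = a;             // would not compile: the copy constructor is deleted
    std::unique_ptr<int> b = std::move(a);     // transfer ownership from a to b
    return !a && b && *b == 7;                 // a is null after the move
}
```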

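weak_ptr does not change the reference count of the shared_ptr it observes, but it lets you find out whether the object is still alive.

Two operations are critical: expired(), which reports whether the object has already been destroyed, and lock(), which returns a shared_ptr to the object, or an empty shared_ptr if it is gone. The example below uses expired():

#include <cstdio>
#include <memory>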
using namespace std;

class Foo{
public:
    int b;
    Foo(int a)
    {
        b = a;
    }

    ~Foo()
    {
        printf("333\n");
    }
};
void end_data(Foo* a)
{
    printf("1111\n");
}

int main()
{
    shared_ptr<Foo> p1 = make_shared<Foo>(3);
    weak_ptr<Foo> p2;
    p2 = p1;
    printf("%d\n",p2.expired());
    return 0;
}

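The program prints 0, because the object has not expired, and then 333 when p1 is destroyed at the end of main. The weak_ptr does not change the reference count, yet it can tell whether the object is still alive.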
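Returning to the Observer problem, lock() is what makes the liveness check safe: it either produces a shared_ptr that keeps the object alive for the duration of the call, or it produces an empty one. The following is a minimal sketch of that idea, not a complete implementation; the Observer interface, the notifyObservers name, and the use of std::mutex are assumptions made for this example.

```cpp
#include <memory>
#include <mutex>
#include <vector>

class Observer {
public:
    virtual ~Observer() = default;
    virtual void update() = 0;
};

class Observable {
public:
    void register_(const std::weak_ptr<Observer>& x) {
        std::lock_guard<std::mutex> lock(mutex_);
        observers_.push_back(x);
    }

    // Notifies every observer that is still alive and returns how many there were.
    int notifyObservers() {
        std::lock_guard<std::mutex> lock(mutex_);
        int notified = 0;
        auto it = observers_.begin();
        while (it != observers_.end()) {
            std::shared_ptr<Observer> obj = it->lock();  // try to promote to a shared_ptr
            if (obj) {
                obj->update();   // obj keeps the observer alive during the call
                ++notified;
                ++it;
            } else {
                it = observers_.erase(it);   // the observer is gone: drop the stale entry
            }
        }
        return notified;
    }

private:
    std::mutex mutex_;
    std::vector<std::weak_ptr<Observer>> observers_;
};

class Foo : public Observer {
public:
    void update() override { ++updates; }
    int updates = 0;
};
```

Observers have to be owned by shared_ptr for this to work. One design caveat: update() runs while mutex_ is held, so an observer that calls register_ from inside update() would deadlock in this sketch.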

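allocator allocates raw memory for n objects without constructing them

// Every constructed object must be destroyed before its memory is deallocated; with several objects this takes a loop that advances data.

int main()
{
    allocator<Foo> alloc;
    Foo* data = alloc.allocate(10);                              // raw memory for 10 Foo, none constructed
    allocator_traits<allocator<Foo>>::construct(alloc,data,1);   // construct one Foo from the argument 1
    allocator_traits<allocator<Foo>>::destroy(alloc,data);       // destroy it before releasing the memory
    alloc.deallocate(data,10);
    return 0;
}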

(3) Algorithms that accompany allocator

1) uninitialized_copy(begin, end, begin2); // Copies the elements of the input range [begin, end) (end is an off-the-end iterator) into the raw memory starting at begin2, which must be large enough to hold them

2) uninitialized_copy_n(begin, n, begin2); // Copies n elements, starting from the one begin points to, into the raw memory starting at begin2

3) uninitialized_fill(begin, end, t); // Constructs copies of t in the raw memory range [begin, end)

4) uninitialized_fill_n(begin, n, t); // Constructs n copies of t in the raw memory starting at begin
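A sketch that combines allocator with two of these algorithms. The function copyThenFill is a hypothetical helper; it is not exception-safe and is meant only to show the order of allocate, construct, destroy, and deallocate.

```cpp
#include <memory>
#include <string>
#include <vector>

// Copies src into raw allocator memory, appends `extra` copies of `fill`,
// reads the result back into a vector, then destroys and frees everything.
std::vector<std::string> copyThenFill(const std::vector<std::string>& src,
                                      std::size_t extra,
                                      const std::string& fill) {
    typedef std::allocator<std::string> Alloc;
    Alloc alloc;
    const std::size_t n = src.size() + extra;
    std::string* first = alloc.allocate(n);             // raw, unconstructed memory

    // Construct copies of src's elements in the first src.size() slots.
    std::string* mid = std::uninitialized_copy(src.begin(), src.end(), first);
    // Construct `extra` copies of `fill` in the remaining slots.
    std::string* last = std::uninitialized_fill_n(mid, extra, fill);

    std::vector<std::string> result(first, last);

    for (std::string* p = first; p != last; ++p) {
        std::allocator_traits<Alloc>::destroy(alloc, p); // destroy each constructed element
    }
    alloc.deallocate(first, n);
    return result;
}
```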

Origin blog.csdn.net/qq_32783703/article/details/104346242