Commonly used high-concurrency network thread model designs and mongodb thread model optimization practice

Foreword:
    A server usually needs to support highly concurrent business access, and how its network IO worker thread/process model is designed plays a crucial role in meeting those high-concurrency requirements.

     This article summarizes a variety of network IO thread/process models for different scenarios, gives the advantages and disadvantages of each model, and describes ways to optimize their performance. It should be useful to server, middleware, and database developers.


1. Thread model 1: single-threaded network IO multiplexing model

1.1 Description:

   1. All network IO events (accept events, read events, write events) are registered in the epoll event set (a registration sketch follows this list)

   2. In the main loop, epoll_wait fetches, in one call, the ready events collected by the kernel, then iterates over them and executes each event's corresponding callback

   3. Event registration, epoll_wait event acquisition, and event callback execution are all executed by one thread
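
As a concrete illustration of point 1, here is a minimal C sketch of registering fds on the single epoll event set; the register_fd helper is illustrative, not taken from any particular code base:

#include <sys/epoll.h>

/* Register an fd on the single epoll event set. For the listening socket,
   an EPOLLIN event means "a new connection can be accepted"; for a client
   fd it means "data is readable". Error handling omitted for brevity. */
static int register_fd(int epfd, int fd, int want_write)
{
    struct epoll_event ev;
    ev.events  = EPOLLIN | (want_write ? EPOLLOUT : 0);
    ev.data.fd = fd;
    return epoll_ctl(epfd, EPOLL_CTL_ADD, fd, &ev);
}

/* Typical setup:
     int epfd = epoll_create1(0);
     register_fd(epfd, listen_fd, 0);   // accept events
     register_fd(epfd, client_fd, 1);   // read + write events */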

1.2 Defects of the network thread model

    1. All work is performed by one thread. If the event callback of any single request blocks, all other requests are blocked too. For example, with redis's hash structure, if a hash key contains millions of elements, deleting that key when it expires blocks the entire redis instance.

   2. In the single-threaded work model, the CPU becomes the bottleneck: once QPS exceeds 100,000, the load on the single core reaches 100%.

1.3 Typical cases

    1. redis cache

1.4 Main loop workflow:

while (1) {
    // epoll_wait waits for network events; it returns when events are ready
    // or when the timeout expires
    size_t numevents = epoll_wait();

    // iterate over the events epoll returned and run each event's callback
    for (j = 0; j < numevents; j++) {
        if (read event) {
            // read the data
            readData()
            // parse the message
            parseData()
            // read-event handling: business logic on the parsed request
            requestDeal()
        } else if (write event) {
            // write-event handling: write the response data
            writeEventDeal()
        } else {
            // exception event handling
            errorDeal()
        }
    }
}

Note: in the multi-thread/process models that follow, the main flow of each thread/process is the same as this while() loop.

1.5 Redis source code analysis and an asynchronous network IO multiplexing lite demo

Due to the needs of earlier work, the redis kernel had to be re-optimized and extended, so comments were added throughout the redis code. The redis network module was also isolated into a simple demo, which is helpful for understanding epoll network event handling and IO multiplexing; the code is fairly short. See the following:

Detailed annotation analysis of redis source code

redis network module lite demo

Twitter cache middleware twemproxy source code analysis implementation

2. Thread model 2: single listener + fixed worker threads

  1. The listener thread is responsible for accepting all client connections

  2. Each time the listener thread accepts a new client connection, a new fd is created and handed by a distributor to one of the worker threads (e.g. by hashing the fd; see the sketch after this list)

  3. Once a worker thread receives the new connection's fd, all subsequent network IO reads and writes on that connection are handled by that thread

  4. Assuming 8 worker threads and 32 established connections, each thread handles the reading/writing, message processing, and business logic of 4 connections on average
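
A minimal C sketch of the dispatch step, assuming NUM_WORKERS fixed worker threads each running its own epoll loop; notify_worker() is a hypothetical helper that hands an fd to a worker, e.g. by writing it to that worker's pipe:

#include <sys/socket.h>

#define NUM_WORKERS 8

extern void notify_worker(int worker, int conn_fd);  /* hypothetical helper */

void listener_loop(int listen_fd)
{
    for (;;) {
        int conn_fd = accept(listen_fd, NULL, NULL);
        if (conn_fd < 0)
            continue;
        /* distributor: hash the new fd to a fixed worker thread */
        int worker = conn_fd % NUM_WORKERS;
        notify_worker(worker, conn_fd);  /* worker adds the fd to its epoll set */
    }
}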

2.1 Defects of the network thread model

   1. There is only one listener thread doing accept processing, and it can easily become a bottleneck under instantaneous high-concurrency load

   2. Each worker thread uses IO multiplexing to handle the data reading/writing, message parsing, and business logic of several connection fds, so serious queuing can occur: if the processing of one connection's message takes too long after it is received and parsed, the requests of the other connections served by that thread are blocked behind it.

2.2 Typical cases

   The memcache cache, which suits caching scenarios where internal processing is fast, as well as proxy middleware scenarios. A Chinese-annotated analysis of the memcache source code is available at: Memcache source code implementation analysis

 

3. Thread model 3: fixed worker thread model

   The prototype diagram of the model is as follows:

Description:

   1. Starting with version 3.9, the Linux kernel supports the reuseport feature: every new connection established by the kernel protocol stack is automatically distributed to the user-mode worker threads in a balanced manner.

   2. This model solves the single-point listener bottleneck of model 2 (a reuseport listener sketch follows this list)
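
A minimal sketch of a reuseport listener, assuming Linux 3.9+: each worker process/thread calls this function to create and bind its own listening socket on the same port, and the kernel then balances new connections across them. Error handling is omitted:

#include <netinet/in.h>
#include <sys/socket.h>

int make_reuseport_listener(unsigned short port)
{
    int fd  = socket(AF_INET, SOCK_STREAM, 0);
    int one = 1;
    /* every worker sets SO_REUSEPORT and binds the SAME port */
    setsockopt(fd, SOL_SOCKET, SO_REUSEPORT, &one, sizeof(one));

    struct sockaddr_in addr = {0};
    addr.sin_family      = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_ANY);
    addr.sin_port        = htons(port);
    bind(fd, (struct sockaddr *)&addr, sizeof(addr));
    listen(fd, 511);
    return fd;  /* each worker then runs its own epoll loop on this fd */
}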

3.1 Defects of the network thread model

      After reuseport is supported, the kernel distributes new connections to the user-mode worker processes/threads in a load-balanced way, and each process/thread uses IO multiplexing to handle the data reading/writing, message parsing, and post-parse business logic of several clients' connection fds. Because each worker serves several connections at the same time, if the processing of one connection's message takes too long after it is received and parsed, requests on the other connections handled by the same worker are blocked and queued.

     Although this model solves the listener's single-point bottleneck, it does not solve the queuing problem inside each worker thread.

     Nevertheless, nginx, as a layer-7 forwarding proxy, suits this model well: its processing is done purely in memory, so the internal processing time is short.

3.2 Typical cases

    1. nginx (nginx uses processes instead of threads, but the model principle is the same). This model suits scenarios with simple internal business logic, such as nginx proxying.

    2. For the performance improvement process enabled by reuseport support, refer to my other articles:      https://my.oschina.net/u/4087916/blog/3016162

    Application of Nginx multi-process high concurrency, low latency and high reliability mechanism in cache (redis, memcache) twemproxy proxy

    Analysis of Chinese annotations of nginx source code

4. Thread model 4: one connection one thread model

    The thread model diagram is as follows:

Description:

   1. The listener thread is responsible for accepting all client connections

   2. Every time the listener thread accepts a new client connection, it creates a new thread that is solely responsible for data reading/writing, message parsing, and business logic processing on that connection (see the sketch after this list).
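
A minimal C sketch of this model; handle_conn() is a hypothetical function that runs the blocking read/parse/process/write loop for one connection:

#include <pthread.h>
#include <stdint.h>
#include <sys/socket.h>

extern void handle_conn(int fd);  /* hypothetical: blocking per-connection loop */

static void *conn_thread(void *arg)
{
    handle_conn((int)(intptr_t)arg);
    return NULL;  /* thread ends (and is destroyed) when the connection closes */
}

void listener_loop(int listen_fd)
{
    for (;;) {
        int conn_fd = accept(listen_fd, NULL, NULL);
        if (conn_fd < 0)
            continue;
        pthread_t tid;  /* one new thread per accepted connection */
        pthread_create(&tid, NULL, conn_thread, (void *)(intptr_t)conn_fd);
        pthread_detach(tid);
    }
}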

4.1 Defects of the network thread model:

   1. One thread is created per connection: 100,000 connections require 100,000 threads. With so many threads, the system load rises and memory consumption becomes large.

   2. When a connection is closed, its thread must also be destroyed; frequent thread creation and destruction further increases the system load

4.2 Typical cases:

   1. The mysql default mode and the mongodb synchronous thread model configuration, suitable for scenarios where request processing takes a long time, such as database services

   2. The Apache web server; this model limits Apache's performance, making nginx's advantages all the more obvious

5. Thread model 5: single listener + dynamic worker threads (single queue)

    The thread model diagram is shown in the following figure:

Description:

  1. After the listener thread accepts a new connection fd, it hands the fd over to the thread pool, and all subsequent reading/writing, message parsing, and business processing on that connection are handled by threads in the pool.

  2. This model turns one request into multiple tasks (network data reading/writing, message parsing, and the business logic that follows parsing) which enter a single global queue; threads in the pool fetch tasks from this queue and execute them (see the queue sketch after this list).

  3. Because one request is split into several tasks, a single request may be processed by multiple threads.

 4. When there are too many tasks and the system pressure is high, the number of threads in the thread pool increases dynamically

 5. When tasks decrease and the system pressure falls, the number of threads in the thread pool decreases dynamically
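
A minimal C sketch of the single global task queue (all names illustrative); note that every worker contends on the same q_lock, which is exactly the defect discussed in 5.4:

#include <pthread.h>

struct task { struct task *next; void (*run)(void *); void *arg; };

static struct task    *q_head, *q_tail;
static pthread_mutex_t q_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  q_cond = PTHREAD_COND_INITIALIZER;

void push_task(struct task *t)      /* producers: listener and workers */
{
    t->next = NULL;
    pthread_mutex_lock(&q_lock);    /* ONE lock shared by all threads */
    if (q_tail) q_tail->next = t; else q_head = t;
    q_tail = t;
    pthread_cond_signal(&q_cond);
    pthread_mutex_unlock(&q_lock);
}

struct task *pop_task(void)         /* consumers: every worker thread */
{
    pthread_mutex_lock(&q_lock);
    while (q_head == NULL)
        pthread_cond_wait(&q_cond, &q_lock);  /* idle wait for work */
    struct task *t = q_head;
    q_head = t->next;
    if (q_head == NULL) q_tail = NULL;
    pthread_mutex_unlock(&q_lock);
    return t;
}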

5.1 Statistics related to worker thread running time:

     T1: the time spent calling the underlying asio library to receive a complete mongodb message

     T2: the time spent on all subsequent processing after the message is received (message parsing, authentication, engine-layer processing, sending the response data to the client, etc.)

     T3: the time the thread spends waiting for data (for example, when there has been no traffic for a long time, the thread is blocked waiting to read)

5.2 How a single worker thread determines that it is in an "idle" state:

       A thread's total running time = T1 + T2 + T3, where T3 is useless waiting time. If T3 accounts for a large proportion of the total, the thread is relatively idle. After each loop iteration, a worker thread checks its effective-time ratio; if it is below the configured threshold, the thread exits and is destroyed (see the sketch below).
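
A sketch of that check: the effective-time ratio is (T1 + T2) / (T1 + T2 + T3), and the thread retires when the ratio drops below a threshold. The threshold value here is illustrative, not mongodb's actual default:

/* Decide after each loop iteration whether this worker is "idle". */
#define IDLE_THRESHOLD_PCT 30   /* illustrative threshold */

int worker_should_exit(long t1, long t2, long t3)
{
    long total = t1 + t2 + t3;
    if (total == 0)
        return 0;
    long effective_pct = 100 * (t1 + t2) / total;
    return effective_pct < IDLE_THRESHOLD_PCT;  /* mostly T3 waiting: retire */
}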

5.3 How to judge that the worker threads in the thread pool are "too busy":

      A dedicated control thread judges the pressure on the worker threads in the thread pool and decides whether to create new worker threads to improve performance.

     The control thread checks the pressure in the thread pool periodically. The implementation principle is simple: record the current running state of the pool's threads in real time and maintain two counters: the total number of threads, _threadsRunning, and the number of threads currently running a task, _threadsInUse. If _threadsInUse == _threadsRunning, every worker thread is busy processing a task and the pool is under high pressure, so the control thread starts adding threads to the pool (see the sketch below). For more details on the detailed source code implementation of this model, see: https://my.oschina.net/u/4087916/blog/4295038
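
A sketch of the control thread's loop under the logic described above; start_worker_thread() is hypothetical, and a real implementation would read the counters under a lock and rate-limit growth:

#include <unistd.h>

extern int _threadsRunning;   /* total worker threads in the pool   */
extern int _threadsInUse;     /* workers currently executing a task */
extern void start_worker_thread(void);   /* hypothetical helper */

void control_thread_loop(void)
{
    for (;;) {
        sleep(1);   /* periodic pressure check */
        /* all workers busy: the pool is under pressure, so grow it */
        if (_threadsInUse == _threadsRunning)
            start_worker_thread();
    }
}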

5.4 Defects of the network thread model:

     1. Workers fetch tasks to execute from a single shared queue, so there is lock contention on that queue, and it can become the system bottleneck

5.5 Typical cases:

    The mongodb dynamic adaptive thread model, suitable for scenarios where request processing takes a long time, such as database services

    For the detailed source code optimization analysis and implementation of this model, refer to:

     https://my.oschina.net/u/4087916/blog/4295038

     Mongodb network transmission processing source code implementation and performance tuning - experience the ultimate design of kernel performance

 

6. Thread model 6: single listener + dynamic worker threads (multiple queues)

     The thread model diagram is as follows:

Description:

        The single global queue is split into multiple queues. When a task is enqueued, it is hashed into its corresponding queue, and when a worker thread fetches a task it uses the same hash to take work from its own queue. This reduces lock contention while improving overall performance (see the sketch below).
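
A minimal C sketch of the multi-queue variant (struct task is redeclared from the model-5 sketch for self-containment): tasks are hashed into one of NUM_QUEUES sub-queues, each with its own lock, so contention on any single lock drops roughly by a factor of NUM_QUEUES:

#include <pthread.h>

#define NUM_QUEUES 8

struct task { struct task *next; void (*run)(void *); void *arg; };

struct task_queue {
    struct task    *head, *tail;
    pthread_mutex_t lock;   /* one lock PER queue instead of one global lock */
    pthread_cond_t  cond;
};

/* call pthread_mutex_init/pthread_cond_init on each queue at startup */
static struct task_queue queues[NUM_QUEUES];

/* enqueue: the hash (e.g. on the connection fd) picks the sub-queue */
void push_task_mq(unsigned long key, struct task *t)
{
    struct task_queue *q = &queues[key % NUM_QUEUES];
    t->next = NULL;
    pthread_mutex_lock(&q->lock);
    if (q->tail) q->tail->next = t; else q->head = t;
    q->tail = t;
    pthread_cond_signal(&q->cond);
    pthread_mutex_unlock(&q->lock);
}
/* workers pop from their own sub-queue using the same key % NUM_QUEUES hash */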

6.1 Typical case:

        The multi-queue adaptive thread model optimized in OPPO's self-developed mongodb kernel improves performance greatly and suits scenarios where request processing takes a long time, such as database services. For the detailed source code optimization analysis and implementation of this model, see: https://my.oschina.net/u/4087916/blog/4295038

 

 

 
