Doesn't Redis always claim that its single-threaded model is highly efficient? Why is it using multithreading now?

Redis is a well-known in-memory database that is used in a wide range of scenarios. Some time ago, Redis released version 6.0, which adopted a multi-threaded model.

Because the in-memory database our company uses is developed in-house, I would normally have little reason to follow Redis closely. But Redis is so widely used that I need to understand it in order to interview candidates.

After all, when a candidate has only ever used Redis, I can't very well quiz them about Alibaba's Tair.

So, after Redis 6.0 was released, I wanted to understand why it adopted multithreading. How does this multithreading differ from what earlier versions did? And why did it arrive so late?

Doesn't Redis already use I/O multiplexing? Isn't its performance already high? Why adopt a multi-threaded model?

This article analyzes these questions and the thinking behind them.

Why was Redis designed to be single-threaded in the first place?

As a mature distributed caching system, Redis consists of many modules, such as the network request module, the index module, the storage module, the high-availability cluster module, and the data operation module.

Many people say that Redis is single-threaded and assume that every module in Redis runs on a single thread. That is not the case.

When we say Redis is single-threaded, we mean that "network I/O and the reading and writing of key-value pairs are handled by one thread". In other words, only the network request module and the data operation module are single-threaded. Other parts, such as the persistence module and the cluster support module, do their work outside that thread (some persistence work even runs in a forked child process).

So Redis has never been entirely without multithreading. As early as Redis 4.0, some operations were already handed to background threads.

So, why didn't the network request module and the data operation module use multithreading in the first place?

The answer is fairly simple: "There was no need!"

Why was there no need? Let's first look at when multithreading is actually called for.
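One example of the background work introduced in Redis 4.0 is lazy freeing, which is what the UNLINK command relies on: the main thread only detaches the key, and the memory is reclaimed by another thread. The following is a minimal Python sketch of that idea, not Redis code; the names `store`, `free_queue`, `unlink`, and `background_free` are made up for illustration.

```python
import queue
import threading

store = {"big:list": list(range(100_000)), "small": "x"}
free_queue = queue.Queue()   # detached values waiting to be released (hypothetical name)
freed = []                   # record of what the background thread released


def background_free():
    """Background thread: release detached values off the main thread."""
    while True:
        item = free_queue.get()
        if item is None:          # sentinel: stop the worker
            break
        key, value = item
        freed.append((key, len(value)))
        del value                 # the expensive reclaim would happen here


def unlink(key):
    """Main thread: detach the key immediately and defer the actual free."""
    value = store.pop(key, None)
    if value is None:
        return 0
    free_queue.put((key, value))
    return 1


worker = threading.Thread(target=background_free)
worker.start()
removed = unlink("big:list")
free_queue.put(None)
worker.join()
print(removed, "big:list" in store, freed)
```

The point of the sketch is that the main thread returns as soon as the key is gone from `store`, however large the value is.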

Multi-thread application scenarios

While a program runs, it mainly performs two kinds of operations: read/write operations and computation.

Read and write operations are mostly I/O, including network I/O and disk I/O. Computation mainly involves the CPU.

The purpose of multithreading is to raise I/O utilization and CPU utilization through concurrency.

So, does Redis need to use multi-threading to improve the utilization of I/O and the utilization of CPU?

First, we can say with confidence that Redis does not need to improve its CPU utilization: its operations work almost entirely on memory, and the CPU is generally not its performance bottleneck.

Therefore, there is no need to use multithreading to improve Redis's CPU utilization.

So, how about using multi-threading technology to improve Redis I/O utilization? Is it necessary?

Redis is indeed an I/O-intensive system. Its data operations involve a great deal of network I/O and disk I/O, so improving Redis's performance means improving its I/O utilization. There is no doubt about this.

However, multithreading is not the only way to improve I/O utilization!

Disadvantages of multithreading

We have covered several Java multithreading topics in earlier articles, such as the memory model, locks, and CAS. These are all mechanisms that Java provides to ensure thread safety when multiple threads are running.

Thread safety is a programming term: when a function or library is called in a concurrent environment, it handles the variables shared between threads correctly, so that the program still behaves as intended.

As with Java, every programming language or framework that supports multithreading has to face the same problem: how to control concurrent access to shared resources under the multi-threaded programming model.

Although multithreading can help improve CPU and I/O utilization, the concurrency problems it causes also make these languages and frameworks more complex. Moreover, in a multi-threaded model, switching between threads carries its own performance overhead.

Therefore, to improve I/O utilization, Redis chose I/O multiplexing rather than multithreading.
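To make the cost concrete, here is a minimal sketch, written in Python for brevity rather than in Java. Four threads update one shared counter, and every update has to take a lock to stay correct. The names `counter`, `lock`, and `add_many` are illustrative only.

```python
import threading

counter = 0
lock = threading.Lock()


def add_many(times):
    """Increment the shared counter `times` times, one locked step at a time."""
    global counter
    for _ in range(times):
        # Without the lock, the read, add, and write steps of different
        # threads can interleave and some increments can be lost.
        with lock:
            counter += 1


threads = [threading.Thread(target=add_many, args=(10_000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(counter)
```

A single-threaded design needs neither the lock nor the reasoning about which data is shared, which is the kind of complexity and overhead discussed above.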

Summary

Redis does not use a multi-threaded model in the network request module and the data operation module, mainly for the following four reasons:

  • 1. Redis operations work on memory, so for most operations the CPU is not the performance bottleneck.
  • 2. A single-threaded model is easier to maintain, and the cost of development, debugging, and maintenance is lower.
  • 3. A single-threaded model avoids the performance overhead of switching between threads.
  • 4. I/O multiplexing within a single thread can also improve Redis's I/O utilization.

One more reminder: Redis is not completely single-threaded. It is the critical network I/O and the key-value reads and writes that are handled by one thread.

Redis multiplexing

I believe many people are familiar with the term multiplexing. I have mentioned it often enough in my previous articles.

We mentioned it when introducing the Linux I/O models, and again when introducing how HTTP/2 works.

So, how does Redis's multiplexing differ from the kinds we introduced before?

Let me start with Linux I/O multiplexing: multiple I/O streams can be registered on the same channel, and this channel interacts with the kernel on their behalf. When the data needed by one of the registered requests is ready, the process copies that data into user space.

Read the sentence above again; you may need it later.

In other words, multiple IO streams are processed by one thread.

Linux offers three I/O multiplexing mechanisms: select, poll, and epoll. In the abstract they do much the same thing, but they differ in the details.

In fact, Redis implements its I/O multiplexing by wrapping the I/O multiplexing libraries provided by the operating system. Each of these libraries has its own separate file in the Redis source code (for example, ae_select.c and ae_epoll.c).

In Redis, a file event is generated whenever a socket becomes ready for an operation such as accepting a connection, reading, writing, or closing. Because a server usually has many sockets connected, several file events may occur concurrently.

As soon as a request arrives, it is handed to the Redis thread for processing, so a single Redis thread ends up handling multiple I/O streams.

Therefore, Redis chose to use multiplexed IO technology to improve I/O utilization.
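The idea of one thread serving several I/O streams can be sketched with Python's standard `selectors` module, which wraps select, poll, epoll, or kqueue in much the same spirit. This is only an illustration under simplified assumptions: local socket pairs stand in for client connections, and labels such as `conn-0` are invented.

```python
import selectors
import socket

sel = selectors.DefaultSelector()   # picks epoll, kqueue, poll, or select for this OS
clients = []
for i in range(3):
    server_side, client_side = socket.socketpair()
    server_side.setblocking(False)
    # Register each "connection" for read events, tagged with a label.
    sel.register(server_side, selectors.EVENT_READ, data=f"conn-{i}")
    clients.append((server_side, client_side))

# Each "client" sends one request.
for i, (_, client_side) in enumerate(clients):
    client_side.sendall(f"PING {i}".encode())

# A single thread waits on all sockets and handles whichever are ready.
handled = {}
while len(handled) < len(clients):
    for key, _events in sel.select(timeout=1):
        handled[key.data] = key.fileobj.recv(1024).decode()

for server_side, client_side in clients:
    sel.unregister(server_side)
    server_side.close()
    client_side.close()
sel.close()
print(handled)
```

No extra thread is created here: the loop simply asks the multiplexer which sockets are readable and processes them one after another.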

To sum up, the reasons Redis performs so well, multiplexing and the single-threaded model included, are the following:

  • 1. It is based entirely on memory; most requests are pure memory operations and are very fast.

  • 2. Its data structures are simple, and so are the operations on them; hash tables and skip lists, for example, perform well.

  • 3. It uses a single thread, which avoids unnecessary context switches and race conditions; no CPU time is lost to switching between processes or threads.

  • 4. It uses the I/O multiplexing model.

Why Redis 6.0 introduces multithreading

In May 2020, Redis officially launched version 6.0. This version has many important new features, among which the multithreading feature has attracted widespread attention.

However, I need to remind everyone that Redis 6.0 uses multiple threads only for processing network requests; commands that read and write data are still executed by a single thread.

Even so, you may well have these questions:

Doesn't Redis claim to deliver high performance with a single thread?

Wasn't I/O multiplexing supposed to have greatly improved I/O utilization? Why is multithreading needed?

Mainly because we now demand more of Redis.

Roughly speaking, Redis keeps all of its data in memory, where an access takes about 100 nanoseconds. For small packets, a Redis server can handle 80,000 to 100,000 QPS. For about 80% of companies, single-threaded Redis is enough.

However, as business scenarios grow more complex, some companies routinely handle hundreds of millions of transactions and therefore need higher QPS.

To raise QPS, many companies deploy Redis clusters and add as many Redis machines as they can. But this approach consumes a huge amount of resources.
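A rough back-of-the-envelope check of these figures, assuming (as a simplification) that one thread handles requests strictly one after another:

```python
memory_access_ns = 100   # memory response time quoted above
qps = 100_000            # upper end of the quoted QPS range

# Time available per request if one thread serves them one by one.
budget_per_request_ns = 1_000_000_000 // qps

# How many 100 ns memory accesses would fit into that budget.
accesses_that_fit = budget_per_request_ns // memory_access_ns
print(budget_per_request_ns, accesses_that_fit)
```

Under these assumptions each request has about 10 microseconds, enough for roughly a hundred memory accesses, which suggests that memory access itself is not what limits throughput.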

Analysis shows that the main bottleneck limiting Redis's performance is the processing of network I/O, even though multiplexing is already in use. As noted earlier, under multiplexing the process itself still has to copy the ready data into user space, so the multiplexed I/O model is still, in essence, a synchronous, blocking I/O model.

Take the select function as an example. In the multiplexed I/O model, the call to select (the other functions behave the same way) blocks while network requests are being processed. In other words, it blocks the thread, and under very high concurrency this can become a bottleneck.
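The blocking behavior can be observed with a small Python sketch using the standard `select` module. A local socket pair stands in for a client connection, and the 0.2-second timeout is an arbitrary value chosen for the demonstration.

```python
import select
import socket
import time

reader, writer = socket.socketpair()

# Nothing has been sent yet, so select blocks until the timeout expires.
start = time.monotonic()
ready, _, _ = select.select([reader], [], [], 0.2)
waited = time.monotonic() - start
idle_result = list(ready)

# Once data is waiting, select returns at once ...
writer.sendall(b"GET k")
ready, _, _ = select.select([reader], [], [], 0.2)
# ... but the same thread still has to copy the data out of the kernel itself.
request = ready[0].recv(1024)

reader.close()
writer.close()
print(idle_result, round(waited, 1), request)
```

Both the wait in `select` and the `recv` that follows run on the calling thread, which is why a single thread can become the limiting factor when there are very many connections.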

Although many servers now have multiple CPU cores, Redis uses a single thread, so during a data operation a large share of its CPU time is spent on synchronous network I/O, and the advantages of multiple cores go unused.

If multiple threads handle network requests concurrently, performance can improve considerably. Besides reducing the impact of waiting on network I/O, multithreading also makes fuller use of a multi-core CPU.

Therefore, Redis 6.0 uses multiple I/O threads to process network requests. Other threads can parse the requests, and the parsed requests are then handed to the main thread for the actual memory reads and writes. This increases the parallelism of network request processing and thereby improves overall performance.

However, Redis's I/O threads are used only to process network requests. Read and write commands are still executed by a single thread.

So, once multithreading is introduced, how are the thread safety problems caused by concurrency solved?

This is why we have stressed repeatedly that "Redis 6.0 uses multithreading only to process network requests, while data reads and writes remain single-threaded."

Redis 6.0 uses multiple threads only to receive and parse network requests and to send the results back over the network. Data reads and writes are still performed by a single thread, so no concurrency problems arise.
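The division of labor described above can be sketched in Python as follows. This is a simplified illustration, not the Redis implementation: the request format is a made-up plain-text stand-in for the real protocol, and `parse`, `execute`, and `encode` are hypothetical helper names.

```python
from concurrent.futures import ThreadPoolExecutor

raw_requests = [b"SET a 1\r\n", b"SET b 2\r\n", b"GET a\r\n", b"GET missing\r\n"]


def parse(raw):
    """I/O-thread work: decode bytes into a command. Touches no shared data."""
    return raw.decode().strip().split(" ")


def execute(store, command):
    """Main-thread work: the only code that reads or writes the store."""
    name, *args = command
    if name == "SET":
        store[args[0]] = args[1]
        return "OK"
    if name == "GET":
        return store.get(args[0])
    return "ERR unknown command"


def encode(reply):
    """I/O-thread work: turn a reply into bytes for the socket."""
    return (str(reply) if reply is not None else "(nil)").encode() + b"\r\n"


store = {}
with ThreadPoolExecutor(max_workers=4) as io_threads:
    commands = list(io_threads.map(parse, raw_requests))   # parsed in parallel
    replies = [execute(store, c) for c in commands]        # executed one by one, in order
    responses = list(io_threads.map(encode, replies))      # encoded in parallel
print(store, responses)
```

Because only `execute` touches `store`, and it runs on one thread, the data itself needs no locks even though parsing and encoding are spread across several threads.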

Origin blog.csdn.net/hollis_chuang/article/details/114819377