Why Redis is single-threaded and why Redis is so fast

1. Introduction

Almost every Java interview asks about caching. The basic questions cover the "80/20 rule" and "hot data versus cold data"; the harder ones cover cache avalanche, cache penetration, cache preheating, cache updates, cache degradation, and similar issues. These seemingly uncommon concepts all relate to the cache server. Commonly used cache servers include Redis and Memcached, and Redis is the only one the author uses regularly.

If no interviewer has ever asked you "Why is Redis single-threaded, and why is Redis so fast?", consider yourself lucky to be reading this article. And if you happen to be a discerning interviewer, you can put this question to the candidate waiting anxiously across the table to test their mastery.

Okay, let's get to the point. We will first discuss what Redis is, then why Redis is so fast, and finally why Redis is single-threaded.

2. Introduction to Redis

Redis is an open source, in-memory data structure store that can be used as a database, a cache, and a message broker.

It supports many types of data structures: Strings, Hashes, Lists, Sets, Sorted Sets (ZSets) with range queries, Bitmaps, HyperLogLogs, and Geospatial indexes with radius queries. The most common of these are String, List, Set, Hash, and ZSet.

Redis has built-in replication, Lua scripting, LRU eviction, transactions, and different levels of on-disk persistence, and it provides high availability through Redis Sentinel and automatic partitioning with Redis Cluster.

Redis also provides persistence options that let users save their data to disk. Depending on the situation, the data set can be dumped to disk at regular intervals (snapshot), or each write command can be appended to a command log on disk as it is executed (AOF, the append-only file). You can also turn persistence off entirely and use Redis as an efficient networked cache.
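To make the difference between the two persistence styles concrete, here is a toy sketch in Python. It is not how Redis implements its snapshot or AOF files; the class `TinyStore`, the helper functions, the file names, and the JSON formats are all assumptions made purely for illustration.

```python
import json
import os
import tempfile


class TinyStore:
    """Toy in-memory KV store (hypothetical, not Redis internals)."""

    def __init__(self, aof_path):
        self.data = {}
        self.aof_path = aof_path

    def set(self, key, value):
        # AOF style: append the write command to the log, then apply it.
        with open(self.aof_path, "a", encoding="utf-8") as log:
            log.write(json.dumps(["SET", key, value]) + "\n")
        self.data[key] = value

    def save_snapshot(self, path):
        # Snapshot style: dump the whole data set as of this moment.
        with open(path, "w", encoding="utf-8") as f:
            json.dump(self.data, f)


def load_snapshot(path):
    """Restore the data set exactly as it was when the snapshot was taken."""
    with open(path, encoding="utf-8") as f:
        return json.load(f)


def replay_aof(path):
    """Rebuild the data set by replaying every logged write in order."""
    data = {}
    with open(path, encoding="utf-8") as log:
        for line in log:
            op, key, value = json.loads(line)
            if op == "SET":
                data[key] = value
    return data


def demo():
    with tempfile.TemporaryDirectory() as tmp:
        snap = os.path.join(tmp, "dump.json")
        aof = os.path.join(tmp, "appendonly.log")
        store = TinyStore(aof)
        store.set("a", "1")
        store.save_snapshot(snap)
        store.set("b", "2")  # written after the snapshot was taken
        return load_snapshot(snap), replay_aof(aof)
```

In this sketch the snapshot misses the write made after it was taken, while replaying the log recovers every write. That is the basic trade-off between the two approaches: a snapshot is compact but can lose recent writes, and a log is more complete but grows with every command.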

Redis does not use tables, and it does not predefine or force any relationships between the different pieces of data stored in it.

Databases can be divided by storage mode into disk-based databases and in-memory databases. Redis stores its data in memory, so reads and writes are not limited by disk I/O speed, which makes it extremely fast.

(1) Working mode of a disk-based database: [figure]
(2) Working mode of an in-memory database: [figure]

After reading the above, you should be able to answer some common Redis interview questions, such as: what is Redis, what are the common data structure types in Redis, and how does Redis persist data?

3. How fast is Redis

Redis is a memory-based KV database that uses a single-process, single-threaded model and is written in C. The official figures put its QPS (queries per second) at 100,000+. That is no worse than Memcached, which is also a memory-based KV database but uses a single-process, multi-threaded model. If you are interested, see the official benchmark page "How fast is Redis?" ( https://redis.io/topics/benchmarks ).

[Figure: official benchmark chart of QPS against number of connections]
The horizontal axis is the number of connections, and the vertical axis is QPS. What matters in this chart is the order of magnitude. I hope you can state it correctly when an interviewer asks, rather than giving an answer that is off by orders of magnitude.

4. Why is Redis so fast

1. It is completely memory-based. Most requests are pure memory operations, which are very fast. The data lives in memory in a structure similar to a HashMap, whose advantage is that lookups and updates take O(1) time;

2. The data structures are simple, and so are the operations on them. The data structures in Redis are purpose-designed;

3. Using a single thread avoids unnecessary context switches and race conditions. There is no CPU cost from switching between processes or threads, no locking to reason about, no acquiring and releasing of locks, and no performance lost to possible deadlocks;

4. It uses an I/O multiplexing model with non-blocking I/O;

5. Its underlying model is different: both the low-level implementation and the application protocol used to talk to clients differ from those of other systems. Redis (in its early versions) built its own VM mechanism directly, because going through general-purpose system calls would spend extra time moving and requesting data.

The points above are easy to understand. Below we briefly discuss the I/O multiplexing model:

(1) I/O multiplexing model

The I/O multiplexing model uses select, poll, or epoll to monitor the I/O events of many streams at once. While idle, the current thread blocks. When one or more streams have an I/O event, the thread wakes up, and the program polls the streams (with epoll, only the streams that actually raised events) and processes just the ready ones in sequence. This avoids a great deal of useless work.

Here "multi-channel" refers to multiple network connections, and "multiplexing" refers to reusing the same thread. I/O multiplexing lets a single thread handle many connection requests efficiently (minimizing the time spent on network I/O), and Redis operates on data in memory very quickly, so in-memory operations do not become the bottleneck for Redis performance. Together, these points give Redis its high throughput.
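The idea of one thread watching many connections can be sketched with Python's standard `selectors` module, which picks epoll, kqueue, or select depending on the platform. This is only an illustration of the technique, not Redis's event loop; the function names `serve_once` and `demo`, the upper-casing "protocol", and the use of `socket.socketpair()` in place of real network clients are all assumptions.

```python
import selectors
import socket


def serve_once(server_socks):
    """Answer one message per connection, using a single thread."""
    sel = selectors.DefaultSelector()  # epoll/kqueue/select, whichever is available
    for sock in server_socks:
        sock.setblocking(False)
        sel.register(sock, selectors.EVENT_READ)

    pending = len(server_socks)
    while pending:
        # Blocks while idle and wakes up only when some socket is readable.
        for key, _events in sel.select():
            sock = key.fileobj
            data = sock.recv(1024)
            sock.sendall(data.upper())  # toy "command": reply in upper case
            sel.unregister(sock)
            pending -= 1
    sel.close()


def demo(n=3):
    # Each socketpair stands in for one client connection to the server.
    pairs = [socket.socketpair() for _ in range(n)]
    clients = [client for client, _ in pairs]
    servers = [server for _, server in pairs]
    for i, client in enumerate(clients):
        client.sendall(f"ping {i}".encode())
    serve_once(servers)
    replies = [client.recv(1024).decode() for client in clients]
    for client, server in pairs:
        client.close()
        server.close()
    return replies
```

Calling `demo()` sends one message on each of three connections and lets the single-threaded loop answer all of them. The loop never spends time on a connection that has nothing to read, which is the "useless work" that multiplexing avoids.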

5. So why is Redis single-threaded

First, understand that all of the analysis above was setting the stage for how fast Redis is. The official FAQ says that because Redis operates in memory, the CPU is not its bottleneck; the bottleneck is most likely the size of the machine's memory or the network bandwidth. Since a single thread is easy to implement and the CPU will not be the bottleneck, a single-threaded design is the logical choice (after all, multiple threads bring a lot of trouble!).

[Figure: screenshot of the official Redis FAQ]
You can refer to: https://redis.io/topics/faq

At this point you may want to cry! You expected some major technical insight behind Redis being so fast on a single thread, and instead got an official answer that seems to brush us off. Still, we can now explain clearly why Redis is so fast, and because it is already fast in single-threaded mode, there is no need for multi-threading.

However, a single thread cannot take advantage of a multi-core CPU. We can make up for this by running multiple Redis instances on a single machine.
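When several instances run on one machine, the client needs some rule for deciding which instance holds which key. One simple option is client-side sharding by key hash, sketched below. The port list and the function name `port_for_key` are assumptions for this example, and the CRC32-modulo rule is just a simple choice made here; it is not the slot scheme that Redis Cluster uses.

```python
import zlib

# Hypothetical local instances; these ports are assumptions for the example.
INSTANCE_PORTS = [6377, 6378, 6379, 6380]


def port_for_key(key, ports=INSTANCE_PORTS):
    """Pick an instance by hashing the key.

    The same key always maps to the same port, so reads find
    the data that earlier writes stored.
    """
    return ports[zlib.crc32(key.encode("utf-8")) % len(ports)]
```

With a rule like this, each instance serves only its share of the keys on its own core. The drawback of plain modulo hashing is that changing the number of instances remaps most keys.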

Warning 1: The "single thread" we keep emphasizing means only that one thread handles network requests. A real Redis server certainly runs more than one thread, so pay attention here! For example, when Redis persists data, the work is done in a child process or a child thread (which of the two is left for the reader to investigate). As an example, I looked up the Redis process on a test server and then listed the threads under it:

[Figure: ps -T output listing the threads of the Redis process]

The "-T" option of the ps command shows threads (from the man page: "Show threads, possibly with SPID column"). The "SPID" column is the thread ID, and the "CMD" column is the thread name.

Warning 2: The last paragraph of the FAQ shown in the figure above says that Redis will support multi-threading starting from version 4.0, but only for certain operations. Whether Redis remains single-threaded in later versions is something readers will need to verify for themselves.

6. Notes

1. We know that Redis uses a "single thread plus multiplexed I/O" model to provide a high-performance in-memory data service. This mechanism avoids locks, but it also means that time-consuming commands such as SUNION reduce Redis's concurrency. Because there is a single thread, only one operation runs at a time, so a slow command lowers concurrency for reads and writes alike. A single thread can also use only one CPU core, so you can start multiple instances on the same multi-core server in a master-master or master-slave arrangement and send the time-consuming read commands entirely to the slaves.

The redis.conf items that need to be changed:

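# add the port number to the pidfile
pidfile /var/run/redis/redis_6377.pid
# this one must be changed
port 6377
# add the port number to the logfile name as well
logfile /var/log/redis/redis_6377.log
# add the port number to the RDB file name as well
dbfilename dump_6377.rdb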
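Since every instance needs the same four settings with a different port, the per-instance lines can be generated rather than edited by hand. The helper below is a minimal sketch: the function name `instance_conf` is hypothetical, and the paths simply follow the pattern shown above.

```python
def instance_conf(port):
    """Return the four per-instance redis.conf lines for the given port."""
    return "\n".join([
        f"pidfile /var/run/redis/redis_{port}.pid",
        f"port {port}",
        f"logfile /var/log/redis/redis_{port}.log",
        f"dbfilename dump_{port}.rdb",
    ])
```

For example, `print(instance_conf(6378))` produces the settings for a second instance on port 6378; the rest of redis.conf can stay shared between instances.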

2. "We can't leave load balancing to the operating system, because we know our own programs better. So we can assign CPU cores to them by hand, without taking up too much CPU and without letting our key processes get crowded in with a heap of other processes."
The CPU is an important factor: because Redis uses a single-threaded model, it prefers a fast CPU with a large cache over many cores.

On multi-core servers, Redis performance also depends on the NUMA configuration and on which processor each process is bound to. The most visible effect is that redis-benchmark results vary from run to run, because the client and server processes land on CPU cores at random. To get accurate results, use a processor-pinning tool (taskset on Linux). The most effective arrangement is to put the client and the server on two different cores of the same CPU so that they share the L3 cache.
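As a rough sketch of what pinning looks like in practice, the helper below builds `taskset -c` command lines for the server and the benchmark client. The helper name `pinned`, the core numbers 0 and 1, the port, and the request count are all assumptions; which cores actually share an L3 cache depends on the machine and should be checked first (for example with `lscpu`).

```python
def pinned(cpu, command):
    """Prefix a command with `taskset -c <cpu>` to pin it to one core (Linux only)."""
    return ["taskset", "-c", str(cpu)] + list(command)


# Assumed layout: cores 0 and 1 sit on the same physical CPU.
server_cmd = pinned(0, ["redis-server", "--port", "6377"])
bench_cmd = pinned(1, ["redis-benchmark", "-p", "6377", "-q", "-n", "100000"])
```

Each list can be passed to `subprocess.run` or joined with spaces and typed into a shell. The sketch only builds the commands and does not start anything itself.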

7. Expansion

The following are also several models you should know, and good luck with your interviews!

1. Single-process multi-threading model: MySQL, Memcached, Oracle (Windows version);

2. Multi-process model: Oracle (Linux version);

3. Nginx has two types of processes, one is called the Master process (equivalent to the management process), and the other is called the Worker process (the actual work process). There are two ways to start:

(1) Single-process startup: the system has only one process, which plays the role of both the Master process and the Worker process.

(2) Multi-process startup: At this time, the system has one and only one Master process, and at least one Worker process is working.

(3) The Master process mainly performs some global initialization work and manages the Worker's work; event processing is carried out in the Worker.
