Why RocketMQ has high performance

This article examines RocketMQ's implementation from the perspective of performance.

Overall structure

Below is a RocketMQ cluster deployment diagram that circulates widely online.

A RocketMQ cluster is mainly composed of four roles: NameServer, Broker, Producer, and Consumer.

- **NameServer:** the registration and configuration center of the whole cluster. It manages the cluster's metadata, including Topic and routing information, the registration information of Producer and Consumer clients, and the registration information of Brokers.
- **Broker:** receives message production and consumption requests, and handles message persistence and message reads.
- **Producer:** produces messages.
- **Consumer:** consumes messages.

When messages are actually produced and consumed, the NameServer supplies the metadata that producers and consumers use to decide which Broker a message should be sent to, or which Broker it should be pulled from. With that metadata, producers and consumers interact with the Broker directly. This point-to-point interaction minimizes the intermediate hops in message transmission and shortens the transmission path.

Network model

RocketMQ uses the Netty framework to achieve high-performance network transmission.

Netty-based implementation of the network communication module

Netty's main features

- It offers a unified API, so users do not need to deal with NIO's programming model and concepts directly, and the communication framework can be flexibly customized and extended through Netty's ChannelHandler.
- Netty encapsulates the basic building blocks of network programming: packet framing (packing/unpacking), exception detection, and zero-copy transmission.
- Netty works around NIO's epoll bug, avoiding the 100% CPU utilization caused by empty polling.
- It supports multiple Reactor threading models.
- It is widely used and well supported by the open source community; projects such as Hadoop, Spark, and Dubbo all integrate Netty.

How Netty achieves high-performance transmission

- Non-blocking I/O and the Reactor threading model.
- Zero copy: FileChannel.transferTo avoids copies between user space and kernel space; CompositeByteBuf combines multiple ByteBufs without copying; slice() exposes a sub-range of a ByteBuf; wrappedBuffer() wraps an ordinary ByteBuffer as a Netty ByteBuf (see the sketch below).
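As a quick, generic illustration of these Netty zero-copy APIs (not RocketMQ code; the buffer contents are made up), the sketch below combines, slices, and wraps buffers without copying their payloads:

```java
import io.netty.buffer.ByteBuf;
import io.netty.buffer.CompositeByteBuf;
import io.netty.buffer.Unpooled;

import java.nio.ByteBuffer;

public class ZeroCopyDemo {
    public static void main(String[] args) {
        ByteBuf header = Unpooled.copiedBuffer("HEADER".getBytes());
        ByteBuf body = Unpooled.copiedBuffer("BODY".getBytes());

        // CompositeByteBuf: logically merges several buffers without copying their bytes.
        CompositeByteBuf frame = Unpooled.compositeBuffer();
        frame.addComponents(true, header, body);

        // slice(): a view over a sub-range of an existing buffer, again without copying.
        ByteBuf bodyView = frame.slice("HEADER".length(), "BODY".length());

        // wrappedBuffer(): exposes an ordinary NIO ByteBuffer as a Netty ByteBuf.
        ByteBuf wrapped = Unpooled.wrappedBuffer(ByteBuffer.wrap("NIO".getBytes()));

        System.out.println(frame.readableBytes() + " " + bodyView.readableBytes() + " " + wrapped.readableBytes());
    }
}
```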

RocketMQ's network model

The Broker side of RocketMQ implements the master-slave Reactor model on top of Netty. The structure is as follows:

The specific flow:

- eventLoopGroupBoss acts as the acceptor and accepts client connection requests.
- eventLoopGroupSelector handles the NIO read and write operations.
- NettyServerHandler reads the I/O data and parses the message header.
- The dispatch step routes each request to the appropriate thread pool according to the registered request code and its processor, which are maintained in processorTable (a HashMap). A minimal sketch of the boss/worker split follows this list.
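Below is a minimal Netty sketch of that boss/worker split. It is not RocketMQ's actual NettyRemotingServer; the port, group sizes, and frame-decoder settings are illustrative assumptions:

```java
import io.netty.bootstrap.ServerBootstrap;
import io.netty.channel.ChannelInitializer;
import io.netty.channel.EventLoopGroup;
import io.netty.channel.nio.NioEventLoopGroup;
import io.netty.channel.socket.SocketChannel;
import io.netty.channel.socket.nio.NioServerSocketChannel;
import io.netty.handler.codec.LengthFieldBasedFrameDecoder;

public class ReactorServerSketch {
    public static void main(String[] args) throws InterruptedException {
        EventLoopGroup boss = new NioEventLoopGroup(1);      // accepts connections (role of eventLoopGroupBoss)
        EventLoopGroup selector = new NioEventLoopGroup(3);  // handles NIO read/write (role of eventLoopGroupSelector)
        try {
            ServerBootstrap b = new ServerBootstrap();
            b.group(boss, selector)
             .channel(NioServerSocketChannel.class)
             .childHandler(new ChannelInitializer<SocketChannel>() {
                 @Override
                 protected void initChannel(SocketChannel ch) {
                     // Decode length-prefixed frames, then business handlers would dispatch
                     // decoded requests to dedicated thread pools.
                     ch.pipeline().addLast(new LengthFieldBasedFrameDecoder(16 * 1024 * 1024, 0, 4, 0, 4));
                 }
             });
            b.bind(10911).sync().channel().closeFuture().sync();
        } finally {
            boss.shutdownGracefully();
            selector.shutdownGracefully();
        }
    }
}
```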

Business thread pool isolation

RocketMQ isolates the Broker's thread pools at a fine granularity, so that message production, message consumption, client heartbeat, and client registration requests do not interfere with one another. The figure below shows how each business thread pool maps to the request types the Broker handles; it also reveals the Broker's core functions.

Message production

RocketMQ supports three ways of sending messages: synchronous, asynchronous, and one-way. With one-way sending the client cannot tell whether the server received the message successfully, so it is an unreliable sending mode.
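A minimal producer sketch showing the three modes with the RocketMQ Java client (the group name, topic, and NameServer address are placeholders):

```java
import org.apache.rocketmq.client.producer.DefaultMQProducer;
import org.apache.rocketmq.client.producer.SendCallback;
import org.apache.rocketmq.client.producer.SendResult;
import org.apache.rocketmq.common.message.Message;

public class SendModesSketch {
    public static void main(String[] args) throws Exception {
        DefaultMQProducer producer = new DefaultMQProducer("demo_producer_group");
        producer.setNamesrvAddr("127.0.0.1:9876"); // placeholder NameServer address
        producer.start();

        Message msg = new Message("DemoTopic", "TagA", "hello".getBytes());

        // 1. Synchronous send: blocks until the broker returns a result.
        SendResult result = producer.send(msg);
        System.out.println("sync send: " + result.getSendStatus());

        // 2. Asynchronous send: returns immediately, the result is delivered to the callback.
        producer.send(msg, new SendCallback() {
            @Override
            public void onSuccess(SendResult sendResult) {
                System.out.println("async send ok: " + sendResult.getSendStatus());
            }

            @Override
            public void onException(Throwable e) {
                e.printStackTrace();
            }
        });

        // 3. One-way send: fire-and-forget, no result and no callback.
        producer.sendOneway(msg);

        Thread.sleep(1000); // give the async callback a moment before shutting down (demo only)
        producer.shutdown();
    }
}
```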

Sequence Diagram of Client Sending

Flow Description

- The client API calls the send method of DefaultMQProducer to send the message.
- makeSureStateOK checks whether the client's sending service is healthy. The RocketMQ client maintains a singleton MQClientInstance, whose start and shutdown methods manage the related network services.
- tryToFindTopicPublishInfo obtains the Topic's metadata, mainly the list of candidate MessageQueues.
- selectOneMessageQueue routes the message to a specific MessageQueue according to the current fault-tolerance state.
- sendKernelImpl is the core method; it calls NettyRemotingClient's sendMessage, which behaves differently depending on the sending mode chosen by the user. The sequence diagram only shows the synchronous path.
- invokeSync writes the serialized message to the TCP socket buffer via Netty's channel.writeAndFlush; at that point the client side of the send is complete.

Differences in the implementation of the three sending methods

- Synchronous send: register a ResponseFuture in responseTable, send the Request, and wait synchronously for the Response.
- Asynchronous send: register a ResponseFuture in responseTable and send the Request without waiting; when the Response arrives, the registered callback is invoked, so the send result is obtained asynchronously.
- One-way: send the Request without waiting for a Response and without any callback (see the sketch below).
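The following is a simplified, illustrative model of that responseTable pattern, using CompletableFuture instead of RocketMQ's actual ResponseFuture:

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.TimeUnit;
import java.util.function.Consumer;

// Simplified model of how a request id ("opaque") maps to a pending response.
public class ResponseTableSketch {
    private final ConcurrentHashMap<Integer, CompletableFuture<byte[]>> responseTable = new ConcurrentHashMap<>();

    /** Synchronous: register the future, send, then block until the response arrives. */
    public byte[] invokeSync(int opaque, Runnable send, long timeoutMs) throws Exception {
        CompletableFuture<byte[]> future = new CompletableFuture<>();
        responseTable.put(opaque, future);
        send.run();
        return future.get(timeoutMs, TimeUnit.MILLISECONDS);
    }

    /** Asynchronous: register the future, send, and attach a callback; no blocking. */
    public void invokeAsync(int opaque, Runnable send, Consumer<byte[]> callback) {
        CompletableFuture<byte[]> future = new CompletableFuture<>();
        responseTable.put(opaque, future);
        send.run();
        future.thenAccept(callback);
    }

    /** One-way: just send; nothing is registered and no response is expected. */
    public void invokeOneway(Runnable send) {
        send.run();
    }

    /** Called by the network layer when a response with the given id arrives. */
    public void onResponse(int opaque, byte[] body) {
        CompletableFuture<byte[]> future = responseTable.remove(opaque);
        if (future != null) {
            future.complete(body);
        }
    }
}
```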

Client fault-tolerance mechanism

MQFaultStrategy implements a latency-based (RT) fault-tolerance strategy: when a Broker's response time becomes too large, the Broker is considered problematic and is disabled for a period of time. The correspondence between latencyMax and notAvailableDuration is as follows:
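For reference, the default thresholds in MQFaultStrategy look roughly like the sketch below; treat the exact values as indicative, since they can differ between RocketMQ versions:

```java
public class LatencyFaultDefaults {
    // Indicative defaults from MQFaultStrategy (values may vary across versions):
    // if the measured send latency reaches LATENCY_MAX[i], the broker is treated as
    // unavailable for NOT_AVAILABLE_DURATION[i] milliseconds.
    static final long[] LATENCY_MAX            = {50L, 100L, 550L, 1000L, 2000L, 3000L, 15000L};
    static final long[] NOT_AVAILABLE_DURATION = {0L, 0L, 30000L, 60000L, 120000L, 180000L, 600000L};

    static long computeNotAvailableDuration(long currentLatency) {
        for (int i = LATENCY_MAX.length - 1; i >= 0; i--) {
            if (currentLatency >= LATENCY_MAX[i]) {
                return NOT_AVAILABLE_DURATION[i];
            }
        }
        return 0;
    }

    public static void main(String[] args) {
        // A 1200 ms send latency falls into the 1000 ms bucket -> broker disabled for 60000 ms.
        System.out.println(computeNotAvailableDuration(1200));
    }
}
```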

Summary of efficient client-side sending

- One-way sending is the most efficient mode: it needs neither a synchronous wait nor the extra overhead of a callback, but the send is unreliable.
- The MQClientInstance singleton manages and maintains the network channels uniformly, so only a service-availability check is needed before each send.
- The Topic's metadata is cached locally, which avoids pulling metadata from the NameServer on every send.
- An efficient fault-tolerance mechanism ensures fast retransmission when a send fails.

Broker receive-message sequence diagram

Flow Description

- The Broker receives the request with RequestCode SEND_MESSAGE through Netty and hands it to SendMessageProcessor.
- SendMessageProcessor first parses the header fields of the SEND_MESSAGE request (Topic, queueId, producerGroup, and so on) and then calls the storage layer.
- putMessage checks whether the current write conditions are met: the Broker is running; the Broker is the master node; the disk is writable (i.e. not full); the Topic length is within the limit; the message property length is within the limit; and the PageCache is not busy (PageCache busyness is judged by how long putMessage takes to write to the mmapped buffer: if it exceeds 1 second, page faults are slowing the write, the PageCache is considered busy, and the write is rejected).
- A pre-warmed MappedFile is selected from the MappedFileQueue.
- AppendMessageCallback executes the doAppend operation, writing directly into the mmapped file's ByteBuffer.

Write-path optimizations on the Broker side

Spinlocks reduce context switching

RocketMQ's CommitLog uses a PutMessageLock to avoid concurrent writing. There are 2 implementations of PutMessageLock: PutMessageReentrantLock and PutMessageSpinLock.

PutMessageReentrantLock is based on Java's blocking wait/wakeup synchronization (ReentrantLock); PutMessageSpinLock uses Java's CAS primitives to lock and unlock by spinning. RocketMQ uses PutMessageSpinLock by default to make locking and unlocking cheaper under highly concurrent writes and to reduce thread context switches.
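A minimal sketch of the CAS spin-lock idea, modeled on (but not copied from) PutMessageSpinLock:

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Spin lock built on CAS: threads busy-wait instead of blocking, which avoids
// context switches when the critical section (appending one message) is very short.
public class SpinLockSketch {
    private final AtomicBoolean available = new AtomicBoolean(true);

    public void lock() {
        // Spin until we win the CAS from "available" to "taken".
        while (!available.compareAndSet(true, false)) {
            // busy wait; no park/unpark, hence no context switch
        }
    }

    public void unlock() {
        available.compareAndSet(false, true);
    }
}
```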

MappedFile warm-up and zero-copy mechanism

RocketMQ message writes are latency-sensitive. To avoid the load penalty that would occur if the CommitLog file were not yet open, or not yet loaded into memory, at the moment a message is written, RocketMQ implements a file pre-warming mechanism.

When Linux writes data, it does not write directly to disk; it writes to the PageCache backing the file and marks the page as dirty. Dirty pages are flushed to disk once enough of them accumulate or after a certain interval (and if the machine loses power in the meantime, the dirty-page data is lost). The key code of RocketMQ's file pre-warming is as follows:

```java
public void warmMappedFile(FlushDiskType type, int pages) {
    long beginTime = System.currentTimeMillis();
    ByteBuffer byteBuffer = this.mappedByteBuffer.slice();
    int flush = 0;
    for (int i = 0, j = 0; i < this.fileSize; i += MappedFile.OS_PAGE_SIZE, j++) {
        byteBuffer.put(i, (byte) 0);
        // force flush when flush disk type is sync
        if (type == FlushDiskType.SYNC_FLUSH) {
            if ((i / OS_PAGE_SIZE) - (flush / OS_PAGE_SIZE) >= pages) {
                flush = i;
                mappedByteBuffer.force();
            }
        }
        // … (the yield/sleep logic inside the loop is omitted here)
    }
    // force flush when prepare load finished
    if (type == FlushDiskType.SYNC_FLUSH) {
        log.info("mapped file warm-up done, force to disk, mappedFile={}, costTime={}",
            this.getFileName(), System.currentTimeMillis() - beginTime);
        mappedByteBuffer.force();
    }
    this.mlock();
}
```

Code analysis

- The file is mmapped.
- One byte is written every OS_PAGE_SIZE bytes across the entire file. With synchronous flushing configured, a forced flush is performed every time the configured number of pages has been written.
- libc's mlock is called to lock the memory region backing the file. (The mlock family of system calls lets a program lock part or all of its address space in physical memory, preventing Linux from paging that memory out to swap even if the program has not touched it for a while.)

Synchronous and asynchronous flushing

RocketMQ provides both synchronous and asynchronous flushing mechanisms; asynchronous flushing is used by default.

When CommitLog, inside putMessage(), receives the result that the MappedFile has successfully appended the message to memory, it calls handleDiskFlush() to persist the message to the file. handleDiskFlush() invokes a different flush service depending on which of the two flush strategies is configured.

The abstract class FlushCommitLogService is responsible for flushing operations, and there are three implementations of this abstract class:

- GroupCommitService: synchronous flushing.
- FlushRealTimeService: asynchronous flushing.
- CommitRealTimeService: asynchronous flushing with TransientStorePool enabled.

Each implementation is a subclass of ServiceThread, which can be regarded as a background thread service with basic lifecycle management built in: it supports start, shutdown, wakeup, and waitForRunning.

Synchronous flush process

- All flush operations are handled by the GroupCommitService thread.
- The thread that received the message wraps a GroupCommitRequest, submits it to GroupCommitService, and then waits on a CountDownLatch.
- As soon as a new request arrives, GroupCommitService is woken up and calls MappedFile.flush, which ultimately invokes mappedByteBuffer.force().
- After the flush, the waiting receive thread is woken up, which completes the synchronous flush (a simplified sketch of this pattern follows below).
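The sketch below models the group-commit pattern in simplified form: producer threads park on a CountDownLatch while a single flush thread batches and completes their requests. Class and method names are illustrative, not RocketMQ's:

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

// Simplified model of group commit: producer threads submit a request and block
// on a latch; a single flush thread drains requests, flushes once, and wakes them.
public class GroupCommitSketch {

    static class GroupCommitRequest {
        final long nextOffset;
        final CountDownLatch latch = new CountDownLatch(1);
        volatile boolean flushOK;

        GroupCommitRequest(long nextOffset) { this.nextOffset = nextOffset; }

        boolean waitForFlush(long timeoutMs) throws InterruptedException {
            return latch.await(timeoutMs, TimeUnit.MILLISECONDS) && flushOK;
        }

        void wakeup(boolean ok) {
            flushOK = ok;
            latch.countDown();
        }
    }

    private final LinkedBlockingQueue<GroupCommitRequest> requests = new LinkedBlockingQueue<>();

    /** Called by the thread that received the message (synchronous flush path). */
    public boolean putAndWait(long nextOffset, long timeoutMs) throws InterruptedException {
        GroupCommitRequest req = new GroupCommitRequest(nextOffset);
        requests.put(req);
        return req.waitForFlush(timeoutMs);
    }

    /** Flush-thread loop body: drain pending requests, flush once, wake everyone up. */
    public void doCommit() {
        GroupCommitRequest req;
        while ((req = requests.poll()) != null) {
            // In RocketMQ this is where MappedFile.flush() / mappedByteBuffer.force() runs.
            boolean ok = true; // pretend the flush reached req.nextOffset
            req.wakeup(ok);
        }
    }
}
```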

Asynchronous flush process

- RocketMQ performs a flush every 200 ms, persisting the in-memory data to disk.
- When a new message is written, the flush thread is proactively woken up to flush; the thread that received the message does not wait for the flush result.

Message consumption

A high-performance message queue should maximize message turnover: a message sent by a producer should be delivered to its consumers as soon as possible after passing through the Broker.

Message storage structure

The key characteristics of RocketMQ's storage structure are:

- All message writes are turned into sequential writes to the CommitLog (which, compared with Kafka, lets RocketMQ handle more than 10,000 topics).
- Reads and writes go to separate files: the ReputMessageService builds the ConsumeQueue from the CommitLog.

Structure description

- Unlike the CommitLog, the ConsumeQueue uses a fixed-length storage format, as shown in the figure below. To keep entries fixed-length, the ConsumeQueue stores the hash code of the message Tag; when filtering messages on the Broker side, the hash code of the Tag the Consumer subscribed to is compared with the Tag hash code in the entry to decide whether the message should be delivered (see the sketch of the entry layout below).
- ReputMessageService continuously reads the CommitLog and generates ConsumeQueue entries.
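A sketch of the fixed-length entry layout commonly described for the ConsumeQueue (8-byte CommitLog offset, 4-byte message size, 8-byte tag hash code); the encoding helper itself is illustrative:

```java
import java.nio.ByteBuffer;

// Sketch of the fixed-length ConsumeQueue entry (20 bytes in the classic format):
//   8 bytes  physical offset of the message in the CommitLog
//   4 bytes  message size
//   8 bytes  hash code of the message Tag (used for broker-side filtering)
public class ConsumeQueueEntrySketch {
    static final int CQ_STORE_UNIT_SIZE = 20;

    static ByteBuffer encode(long commitLogOffset, int size, long tagsCode) {
        ByteBuffer buf = ByteBuffer.allocate(CQ_STORE_UNIT_SIZE);
        buf.putLong(commitLogOffset);
        buf.putInt(size);
        buf.putLong(tagsCode);
        buf.flip();
        return buf;
    }

    public static void main(String[] args) {
        ByteBuffer entry = encode(123456L, 256, "TagA".hashCode());
        // Fixed-length entries mean the i-th entry is always at i * 20 bytes,
        // so a consume offset maps to a file position with simple arithmetic.
        System.out.println("entry bytes: " + entry.remaining());
    }
}
```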

Sequential consumption and parallel consumption

The biggest difference between serial (orderly) consumption and parallel consumption is whether messages in the same queue are consumed in order. Sequential consumption guarantees ordering within a single Queue, and RocketMQ implements this with per-queue (partition) locks. Messages can be consumed in push or pull mode; only the push mode is considered here.

Parallel consumption

- The implementation class for parallel consumption is ConsumeMessageConcurrentlyService.
- PullMessageService has a built-in scheduledExecutorService thread pool that processes PullRequests, pulling the latest messages from the Broker and returning them to the client. The pulled messages are put into the ProcessQueue that corresponds to the MessageQueue.
- ConsumeMessageConcurrentlyService wraps the received messages into ConsumeRequests and hands them to its own consumeExecutor thread pool for consumption.
- Each ConsumeRequest calls MessageListener.consumeMessage to run the user-defined consumption logic and returns a consumption status.
- If the status is SUCCESS, the messages are removed from the ProcessQueue and the offset is committed.
- If the status is RECONSUME, the messages are sent to the delay queue for retry, and the failed messages are consumed again after a delay (a usage example follows this list).
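A minimal parallel-consumption example with the RocketMQ push consumer (group, topic, and NameServer address are placeholders):

```java
import org.apache.rocketmq.client.consumer.DefaultMQPushConsumer;
import org.apache.rocketmq.client.consumer.listener.ConsumeConcurrentlyStatus;
import org.apache.rocketmq.client.consumer.listener.MessageListenerConcurrently;

public class ConcurrentConsumerSketch {
    public static void main(String[] args) throws Exception {
        DefaultMQPushConsumer consumer = new DefaultMQPushConsumer("demo_consumer_group");
        consumer.setNamesrvAddr("127.0.0.1:9876"); // placeholder NameServer address
        consumer.subscribe("DemoTopic", "*");

        // Parallel consumption: messages from the same queue may be handled by
        // different threads of the internal consumeExecutor at the same time.
        consumer.registerMessageListener((MessageListenerConcurrently) (msgs, context) -> {
            msgs.forEach(m -> System.out.println("got " + new String(m.getBody())));
            // CONSUME_SUCCESS -> remove from ProcessQueue and commit offset;
            // RECONSUME_LATER -> send to the retry (delay) queue.
            return ConsumeConcurrentlyStatus.CONSUME_SUCCESS;
        });

        consumer.start();
    }
}
```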

Serial consumption

- The implementation class for serial consumption is ConsumeMessageOrderlyService.
- PullMessageService has a built-in scheduledExecutorService thread pool that processes PullRequests, pulling the latest messages from the Broker and returning them to the client. The pulled messages are put into the ProcessQueue that corresponds to the MessageQueue.
- ConsumeMessageOrderlyService wraps the received messages into ConsumeRequests and hands them to its own consumeExecutor thread pool for consumption.
- Before consuming, the objectLock for the MessageQueue is acquired, so that within the current process only one thread handles that MessageQueue; messages are taken from the ProcessQueue's msgTreeMap in ascending offset order, which preserves ordering.
- Each ConsumeRequest calls MessageListener.consumeMessage to run the user-defined consumption logic and returns a consumption status.
- If the status is SUCCESS, the messages are removed from the ProcessQueue and the offset is committed.
- If the status is SUSPEND, the service checks whether the maximum retry count has been reached: if so, the message is delivered to the dead-letter queue and consumption continues with the next message; otherwise the retry count is incremented and the message is retried after a delay.

It follows that in serial consumption a message that repeatedly fails to be consumed blocks the queue behind it, and in severe cases causes message backlog and business anomalies (an orderly-consumer example follows below).
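And the orderly counterpart, again with placeholder names:

```java
import org.apache.rocketmq.client.consumer.DefaultMQPushConsumer;
import org.apache.rocketmq.client.consumer.listener.ConsumeOrderlyStatus;
import org.apache.rocketmq.client.consumer.listener.MessageListenerOrderly;

public class OrderlyConsumerSketch {
    public static void main(String[] args) throws Exception {
        DefaultMQPushConsumer consumer = new DefaultMQPushConsumer("demo_orderly_group");
        consumer.setNamesrvAddr("127.0.0.1:9876"); // placeholder NameServer address
        consumer.subscribe("DemoOrderedTopic", "*");

        // Orderly consumption: the client locks each MessageQueue so that only one
        // thread at a time consumes it, preserving per-queue ordering.
        consumer.registerMessageListener((MessageListenerOrderly) (msgs, context) -> {
            try {
                msgs.forEach(m -> System.out.println("ordered: " + new String(m.getBody())));
                return ConsumeOrderlyStatus.SUCCESS;
            } catch (Exception e) {
                // Suspend the queue briefly and retry, instead of skipping the message.
                return ConsumeOrderlyStatus.SUSPEND_CURRENT_QUEUE_A_MOMENT;
            }
        });

        consumer.start();
    }
}
```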

Long-polling implementation of PullMessage on the Broker side

Messages in a message queue are produced by business events. With plain periodic polling, a poll is not guaranteed to return messages, and a polling frequency that is too high or too low has a serious impact on message latency. RocketMQ therefore handles PullMessage requests on the Broker side with a long-polling (request-hold) mechanism. The flow is as follows:

- PullRequest carries a parameter brokerSuspendMaxTimeMillis (default 15 s) that controls how long the request may be held.
- After receiving the request, PullMessageProcessor parses the parameters and validates the Topic metadata and the consumer's subscription; for valid requests it pulls messages from storage.
- If the pull result is PULL_NOT_FOUND, the MessageQueue currently has no new messages.
- In that case a PullRequest object is created and added to the pullRequestTable inside PullRequestHoldService.
- The PullRequestHoldService thread periodically scans pullRequestTable; if new messages have arrived or the hold time has exceeded the polling time, it builds a Response and sends it to the client.
- In addition, DefaultMessageStore defines a messageArrivingListener; whenever a new ConsumeQueue record is generated, the listener is invoked and the latest messages are returned to the client immediately (a simplified sketch of this hold-and-notify idea follows below).

This request-hold mechanism makes RocketMQ's use of the network very efficient, minimizes the waiting overhead of message pulls, and achieves millisecond-level message delivery.
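The sketch below models the hold-and-notify idea in a greatly simplified form; it is not the actual PullRequestHoldService, and all names in it are illustrative:

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CopyOnWriteArrayList;

// Simplified hold-and-notify model: pull requests that find no new messages are
// parked per (topic, queueId); when a message arrives, parked requests are completed.
public class PullHoldSketch {

    interface HeldRequest {
        long suspendDeadlineMillis();        // when the hold expires (brokerSuspendMaxTimeMillis)
        void complete(boolean foundNewMsg);  // write the Response back to the client
    }

    private final Map<String, List<HeldRequest>> holdTable = new ConcurrentHashMap<>();

    private static String key(String topic, int queueId) {
        return topic + "@" + queueId;
    }

    /** Called when a pull returns PULL_NOT_FOUND: park the request instead of replying. */
    public void suspend(String topic, int queueId, HeldRequest request) {
        holdTable.computeIfAbsent(key(topic, queueId), k -> new CopyOnWriteArrayList<>()).add(request);
    }

    /** Called from the message-arriving listener when a new ConsumeQueue record appears. */
    public void notifyMessageArriving(String topic, int queueId) {
        List<HeldRequest> held = holdTable.remove(key(topic, queueId));
        if (held != null) {
            held.forEach(r -> r.complete(true));
        }
    }

    /** Called by a periodic sweeper: time out requests whose hold duration has expired. */
    public void checkHoldTimeout(long now) {
        holdTable.values().forEach(list ->
            list.removeIf(r -> {
                if (now >= r.suspendDeadlineMillis()) {
                    r.complete(false);
                    return true;
                }
                return false;
            }));
    }
}
```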

Other performance optimization methods of RocketMQ

Disabling biased locking

Performance testing of RocketMQ revealed a large number of RevokeBias pauses. Biased locking is meant to eliminate synchronization primitives in uncontended cases to improve performance, but since such scenarios are relatively rare in RocketMQ, the feature is disabled with -XX:-UseBiasedLocking.

Even when there is no actual contention, some scenarios can be optimized further. If not only is there no contention, but a single thread uses the lock from start to finish, then maintaining a lightweight lock is wasteful. Biased locking targets exactly this case: it reduces the cost of lightweight locks when there is no contention and only one thread ever uses the lock. A lightweight lock needs at least one CAS on every acquire and release, whereas a biased lock needs only one CAS when the bias is first established.

Biased locking is only useful when a single thread uses the lock; as soon as another thread contends, the biased lock is inflated into a lightweight lock. When frequent RevokeBias events cause many small pauses, biased locking brings little benefit, so it is turned off with -XX:-UseBiasedLocking; RocketMQ's default JVM options therefore include -XX:-UseBiasedLocking.

Closing remarks

Finally, here is a latency comparison of Alibaba middleware; RocketMQ still leads on low latency. As shown in the figure below, RocketMQ exhibits only a small number of latency spikes in the 10~50 ms range, while Kafka shows many spikes in the 500~1000 ms range.


Origin blog.csdn.net/Wis57/article/details/131716460