How do you ensure a message is consumed only once when using messaging middleware?

Message middleware is widely used for peak shaving, decoupling, and asynchronous processing. Asynchronous processing is probably the most common use. For example, many technical blog sites today run a points system: when a user publishes an article, they earn points. To improve performance, the operation that adds the points can be handled asynchronously instead of inside the synchronous publishing flow.

We can package the user ID and the number of points to add into a message and deliver it to the message system, which then adds the points asynchronously. Because this happens across different servers, message delivery or message handling can fail, in which case the user never receives the points. A message can also be delivered more than once, in which case the user is credited repeatedly. Neither outcome is acceptable.

To avoid both cases, we need to ensure as far as possible that messages are not lost and that each message is consumed only once. This article sets aside any specific messaging middleware and discusses, at a general level, how to avoid these two problems.

1. Ensuring that messages are not lost

On the path from production to consumption, a message can be lost in three places:

  • Delivery fails while the producer is writing the message to the message queue.
  • Persistence fails while the message is stored in the message queue.
  • An exception occurs while the consumer is consuming the message.

1.1 Message delivery failure during production

Message producers and the message system are generally deployed independently on different servers, and the two communicate over the network. The network is not always stable: jitter can occur, and data can be lost. Network jitter leads to the following two scenarios.

Message loss during production

Scenario 1: Network jitter occurs while the message is being transmitted to the message system, and the data is lost outright.
Scenario 2: The message has reached the message system, but network jitter occurs while the message system is returning its response to the producer. The data is not necessarily lost in this case; the producer may simply believe it was lost.

For message loss during production, a redelivery mechanism can be used: when the program detects a network exception, it delivers the message to the message system again. In Scenario 2, however, redelivery can produce duplicate data. How to solve that problem is covered later.
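The redelivery mechanism can be sketched roughly as follows. This is a minimal illustration rather than the API of any real middleware: `send` is a hypothetical client call assumed to raise `ConnectionError` on network failure, `FlakySender` is a test double that simulates jitter, and a UUID stands in for the message ID. The same ID is reused on every attempt so that the receiving side has a chance to recognize a redelivery.

```python
import time
import uuid


def send_with_retry(send, body, max_attempts=3, backoff_seconds=0.0):
    """Deliver `body` through `send`, retrying on network errors.

    `send(message_id, body)` is a hypothetical client call that raises
    ConnectionError when the network fails. The same message_id is reused
    on every attempt so a redelivery can be recognized downstream.
    """
    message_id = str(uuid.uuid4())  # stand-in for a globally unique ID
    for attempt in range(1, max_attempts + 1):
        try:
            send(message_id, body)
            return message_id, attempt
        except ConnectionError:
            if attempt == max_attempts:
                raise  # give up and let the caller decide what to do
            time.sleep(backoff_seconds * attempt)


class FlakySender:
    """Test double: fails the first `failures` calls, then succeeds."""

    def __init__(self, failures):
        self.failures = failures
        self.received = []

    def __call__(self, message_id, body):
        if self.failures > 0:
            self.failures -= 1
            raise ConnectionError("simulated network jitter")
        self.received.append((message_id, body))
```

Note that this sketch only models Scenario 1. In Scenario 2 the first attempt has already been stored, so a retry creates a second copy unless the receiver deduplicates by ID.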

1.2 Persistence failure in the message queue

The message system persists messages, generally to local disk. A few message middleware products also support persisting to a database, although the performance of the message system may then decline.

If you know something about Redis persistence, you will know that when Redis persists data, it does not flush every new record to local disk immediately. Data is first written to the operating system's Page Cache, and the Page Cache is flushed to disk once certain conditions are met. This reduces random disk I/O, which is very time-consuming, and so improves system performance. Messaging middleware persists data in the same way.

In some extreme cases, such as a sudden power failure or an abnormal reboot, data in the Page Cache can be lost. To address Page Cache data loss, the message system can be deployed as a cluster to try to ensure that data is not lost.
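To make the Page Cache trade-off concrete, here is a small sketch of the two write paths at the operating-system level. It is not how any particular middleware implements persistence; the function names are made up for illustration. `os.fsync` asks the OS to flush a file's cached data to the storage device, which is safer but slower than leaving the flush to the OS.

```python
import os


def append_durably(path, record):
    """Append one record and ask the OS to flush it to disk before returning."""
    with open(path, "ab") as f:
        f.write(record + b"\n")  # lands in Python's user-space buffer
        f.flush()                # hands the bytes to the OS (Page Cache)
        os.fsync(f.fileno())     # asks the OS to write them to the device


def append_buffered(path, record):
    """Append one record and leave the disk flush to the operating system."""
    with open(path, "ab") as f:
        f.write(record + b"\n")  # reaches the Page Cache when the file closes
```

After `append_buffered` returns, the record is visible to readers but may still exist only in memory, which is the window in which a power failure loses data.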

1.3 Message loss during consumption

Messages can also be lost during consumption, and the probability is much higher than in the previous two cases. Consuming a message takes roughly three steps: the consumer pulls the message, the consumer processes the message, and the message system updates the consumption progress.


In the first step, network jitter may cause an exception while the message is being pulled. In the second step, a business exception may occur during processing, so that processing never finishes. If an exception occurs in the first or second step and the message system is nonetheless told to update the consumption progress, the failed message will never be processed again. It is effectively lost, and our business operation never completed.

To avoid losing messages during consumption, the consumer should update the consumption progress only after the message has been received and processed. In extreme cases this leads to repeated consumption. For example, if the consumer crashes after processing a message but before updating the progress, the same message will be consumed again after the consumer restarts.
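The "process first, then update the progress" rule can be sketched with a toy in-memory queue. `InMemoryQueue` and `consume_one` are hypothetical names used only for illustration; a real message system exposes its own pull and acknowledgement calls.

```python
class InMemoryQueue:
    """Toy message system that tracks one consumer's progress (offset)."""

    def __init__(self, messages):
        self.messages = list(messages)
        self.committed = 0  # index of the next message to deliver

    def pull(self):
        if self.committed < len(self.messages):
            return self.committed, self.messages[self.committed]
        return None

    def ack(self, offset):
        self.committed = offset + 1


def consume_one(queue, handler):
    """Pull one message, process it, and only then update the progress."""
    item = queue.pull()
    if item is None:
        return False
    offset, message = item
    handler(message)   # if this raises, ack() is never reached
    queue.ack(offset)  # progress moves forward only after success
    return True
```

If `handler` raises, the offset stays where it was and the same message is delivered on the next pull. That prevents loss, and it is also exactly how a crash between `handler` and `ack` produces a repeated delivery.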

2. How to ensure that a message is consumed only once

The message system itself cannot guarantee that a message is consumed only once. The message may be duplicated at the source, a downstream system may pull it again after starting up, failure retries introduce duplicates, and so does compensation logic. Ensuring that a message takes effect only once can be achieved through idempotence.

Idempotence is a mathematical concept: performing the same operation many times yields the same final result as performing it once.
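In function terms, an operation f is idempotent when f(f(x)) = f(x). A small sketch of the difference, using a made-up account dictionary: setting a score to a value is idempotent, while adding to it is not, which is why "add points" messages need extra protection.

```python
def set_score(account, value):
    """Idempotent: applying it again leaves the same result."""
    account["score"] = value


def add_score(account, delta):
    """Not idempotent: every extra application changes the result."""
    account["score"] += delta


once = {"score": 0}
set_score(once, 20)

twice = {"score": 0}
set_score(twice, 20)
set_score(twice, 20)  # repeated delivery, same final state as `once`

added_twice = {"score": 0}
add_score(added_twice, 20)
add_score(added_twice, 20)  # repeated delivery, the user is credited twice
```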

From this definition, if message handling is idempotent, executing a message multiple times has no additional effect on the system. So how do we guarantee idempotence when using a message system? Since both producers and consumers can produce duplicate messages, idempotence has to be ensured at both the producer end and the consumer end.

Idempotence on the producer side: when producing a message, use the snowflake algorithm to generate a globally unique ID for it. The message system maintains a mapping from IDs to messages. If the same ID already exists in the mapping table, the message is discarded. Although the message was delivered twice, only one copy is actually stored, which avoids duplicate messages.
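The ID-to-message mapping can be sketched as a toy broker that refuses a second message with an ID it has already stored. `DedupBroker` is a hypothetical class for illustration only, and the IDs are passed in by the caller; generating them (for example with a snowflake-style generator) is assumed to happen on the producer side.

```python
class DedupBroker:
    """Toy message system that stores at most one message per ID."""

    def __init__(self):
        self.by_id = {}  # message ID -> body (the "mapping table")
        self.queue = []  # IDs in arrival order

    def publish(self, message_id, body):
        """Store the message; return False if the ID was already seen."""
        if message_id in self.by_id:
            return False  # duplicate delivery: discard it
        self.by_id[message_id] = body
        self.queue.append(message_id)
        return True
```

A producer that retries after a lost response simply calls `publish` again with the same ID; the second call returns `False` and the queue still holds one copy.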

Producer-side idempotence depends on the message middleware that is chosen. In the vast majority of cases we do not implement the message system ourselves, so producer-side idempotence is hard to control. Consumer-side idempotence is what we as developers should focus on.

Consumer-side idempotence can be implemented at two levels, the general level and the business level, depending on our business requirements.

At the general level, the globally unique ID generated for each message is used again: after a message is processed successfully, its global ID is saved to the database. Before processing a message, first query the database for this global ID. If it already exists, discard the message.

Implementing message idempotence with this globally unique ID looks like the following pseudo-code:

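boolean isIDExisted = selectByID(ID); // check whether the ID has already been saved
if(isIDExisted) {
  return; // already processed: return immediately
} else {
  process(message); // not seen before: process the message
  saveID(ID);   // save the ID
}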

However, in extreme cases this approach still goes wrong. If the consumer crashes and restarts after processing the message but before saving the ID to the database, it will receive the message again after the restart, the query will show that the message has not been consumed, and the message will be processed a second time. A database transaction can be introduced to solve this problem, at the cost of system performance. If repeated consumption is not strictly forbidden, the general scheme without transactions is good enough. After all, this is a very low-probability event.

At the business level there are more options, such as optimistic locking, pessimistic locking, and in-memory deduplication (for example https://github.com/RoaringBitmap/RoaringBitmap).

Take optimistic locking as an example, again with adding points to a user. Because adding points does not need to happen in the main business flow, it can be done asynchronously through the message system. To use optimistic locking, add a version number field to the points table. When producing the message, first query the current version number of the account and send it to the message system along with the message.


After the consumer receives the message and the version number, it includes the version number in the SQL statement that updates the points, like this:

update score set score = score + 20, version=version+1 where userId=1 and version=1;

After this message is consumed successfully, version becomes 2. If a duplicate message with version = 1 is pulled by the consumer again, the SQL statement matches no rows and updates nothing, which guarantees the idempotence of the message.

To ensure that a message is consumed only once, we need to focus on the consumer side and use idempotence to guarantee that each message takes effect only once.

Today we looked, at a general messaging-middleware level, at how to ensure that data is not lost and is consumed only once. I hope this article helps with your study or work. If you found it valuable, a like is welcome. Thank you.
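The optimistic-lock update can be tried end to end with Python's built-in `sqlite3` module. This is a sketch under assumptions: the `score` table layout is inferred from the SQL statement above, and `add_points` is a hypothetical helper. The number of affected rows tells the consumer whether the message was applied or was a duplicate.

```python
import sqlite3


def create_score_table(conn):
    """Create a points table with a version column (assumed layout)."""
    conn.execute(
        "CREATE TABLE score (userId INTEGER PRIMARY KEY, score INTEGER, version INTEGER)"
    )
    conn.execute("INSERT INTO score (userId, score, version) VALUES (1, 0, 1)")
    conn.commit()


def add_points(conn, user_id, points, expected_version):
    """Apply the update only if the row still has the expected version."""
    cur = conn.execute(
        "UPDATE score SET score = score + ?, version = version + 1 "
        "WHERE userId = ? AND version = ?",
        (points, user_id, expected_version),
    )
    conn.commit()
    return cur.rowcount == 1  # False means a duplicate or stale message
```

One consequence of this design is that a message carrying an outdated version is rejected even when it is not a duplicate, for example when two different point awards were produced against the same version. Whether that is acceptable depends on the business.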

Finally

There are already many articles on messaging middleware by experts on the Internet, so please forgive any similarities. Writing original articles is not easy, and I hope you will support this one. If anything in the text is incorrect, please point it out. Thank you.

Internet flathead brother

Origin www.cnblogs.com/jamaler/p/12467206.html