Two-Phase Commit Protocol (2PC)

One, a typical distributed transaction example

An inter-bank transfer is a typical distributed transaction. Transferring 1000 from user A to user B requires decreasing A's balance by 1000 and increasing B's balance by 1000. The two operations must be atomic: either both take effect or neither does.
Similarly, in an e-commerce system, when a user places an order, a record must be inserted into the order table and the inventory must be updated in the product table, among other steps. With the popularity of microservice architectures in particular, distributed transactions have become increasingly common.
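
To make the transfer example concrete, here is a minimal sketch of what can go wrong when the two balance updates are not coordinated. The `Bank` class, its `online` flag, and `naive_transfer` are hypothetical names used only for illustration; each bank stands in for a separate database.

```python
class Bank:
    """A hypothetical bank holding balances in its own, separate store."""

    def __init__(self, balances, online=True):
        self.balances = dict(balances)
        self.online = online

    def adjust(self, user, delta):
        if not self.online:
            raise ConnectionError("bank unreachable")
        self.balances[user] += delta


def naive_transfer(src_bank, src_user, dst_bank, dst_user, amount):
    """Two independent updates with no coordination between them."""
    src_bank.adjust(src_user, -amount)  # step 1: debit A
    dst_bank.adjust(dst_user, +amount)  # step 2: credit B (may fail)


bank_a = Bank({"A": 5000})
bank_b = Bank({"B": 0}, online=False)  # B's bank is down

try:
    naive_transfer(bank_a, "A", bank_b, "B", 1000)
except ConnectionError:
    pass

# The debit was applied but the credit was not, so 1000 has vanished.
total = bank_a.balances["A"] + bank_b.balances["B"]
print(total)  # 4000
```

The starting balance of 5000 is an assumed value; the point is only that the sum across both banks changes when one of the two operations fails.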

Two, what is the two-phase commit protocol?

The two-phase commit protocol (2PC) is usually used to ensure strong data consistency. There are two types of nodes in 2PC: coordinator nodes and data nodes (also called coordinators and participants). The data nodes can be thought of as replicas of the data on multiple nodes, and the coordinator node coordinates and manages the consistency of the data nodes during a transaction.
2PC is usually divided into two phases: the commit-request phase (also called the voting phase) and the commit phase.
1. Commit-request phase. The coordinator sends a request to the participants, asking each of them to prepare the transaction and vote on whether it can be committed. Each participant replies to the coordinator with its own vote: agree (the transaction executed successfully locally) or cancel (the transaction failed to execute locally).
2. Commit phase. The coordinator decides based on the votes from the previous phase: it commits the transaction only when all votes are "agree"; otherwise it aborts the transaction. It then notifies the participants of the decision, and each participant performs the corresponding operation after receiving the notification.
2PC assumes that nodes do not fail permanently, that any two nodes can communicate over the network, and that data written to the log is not lost.
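
The two phases described above can be sketched in a few lines. This is a simplified in-memory illustration under the protocol's own assumptions (no crashes, no lost messages), not a production implementation. The names `Vote`, `Participant`, `can_commit`, and `two_phase_commit` are hypothetical.

```python
from enum import Enum


class Vote(Enum):
    AGREE = "agree"
    CANCEL = "cancel"


class Participant:
    """Hypothetical in-memory participant; real ones talk over a network."""

    def __init__(self, name, can_commit=True):
        self.name = name
        self.can_commit = can_commit  # stands in for "local execution succeeded"
        self.state = "init"

    def prepare(self):
        # Phase 1: execute the transaction locally and vote on the result.
        self.state = "prepared" if self.can_commit else "aborted"
        return Vote.AGREE if self.can_commit else Vote.CANCEL

    def commit(self):
        self.state = "committed"

    def abort(self):
        self.state = "aborted"


def two_phase_commit(participants):
    # Phase 1 (commit-request / voting): collect one vote per participant.
    votes = [p.prepare() for p in participants]
    # Phase 2 (commit): commit only if every vote is "agree".
    decision = "commit" if all(v is Vote.AGREE for v in votes) else "abort"
    for p in participants:
        if decision == "commit":
            p.commit()
        else:
            p.abort()
    return decision


ok = two_phase_commit([Participant("a"), Participant("b")])
failed = two_phase_commit([Participant("a"), Participant("b", can_commit=False)])
print(ok, failed)  # commit abort
```

A single "cancel" vote is enough to abort the whole transaction, which is what gives the protocol its all-or-nothing behavior.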

Three, components and interactions of the two-phase commit protocol

The two-phase commit protocol is a distributed algorithm that coordinates all participants in a distributed atomic transaction and decides whether to commit or cancel (roll back) the transaction.

(1) Participants in the protocol

In the two-phase commit protocol, the system generally includes two types of machines (or nodes): a coordinator, of which there is usually only one in a system, and transaction participants (also called cohorts or workers), of which there are generally several. In a data storage system, the number of participants can be understood as the number of data replicas. The protocol assumes that each node records a write-ahead log in persistent storage, so that the log is not lost even if the node fails. The protocol also assumes that nodes do not fail permanently and that any two nodes can communicate with each other.
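
As an illustration of the write-ahead log assumption, the sketch below appends each record to a file and forces it to disk before returning, so a restarted node can recover its last known state. The `WriteAheadLog` class, the record layout, and the file name are assumptions made for this example; real systems use more elaborate formats.

```python
import json
import os
import tempfile


class WriteAheadLog:
    """Append-only log; each record is flushed to disk before returning."""

    def __init__(self, path):
        self.path = path

    def append(self, record):
        with open(self.path, "a", encoding="utf-8") as f:
            f.write(json.dumps(record) + "\n")
            f.flush()
            os.fsync(f.fileno())  # force the record onto stable storage

    def replay(self):
        if not os.path.exists(self.path):
            return []
        with open(self.path, encoding="utf-8") as f:
            return [json.loads(line) for line in f if line.strip()]


log_path = os.path.join(tempfile.mkdtemp(), "participant.wal")
wal = WriteAheadLog(log_path)
wal.append({"tx": "t1", "state": "prepared"})  # logged before voting "agree"

# Simulated restart: a fresh object recovers the state from the file.
recovered = WriteAheadLog(log_path).replay()
last_state = recovered[-1]["state"]
print(last_state)  # prepared
```

Writing the record before sending the vote is the design choice that matters here: after a restart, the node still knows it promised to commit.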

(2) Two-phase execution

1. Request phase (commit-request phase, or voting phase)
In the request phase, the coordinator asks the transaction participants to prepare to commit or cancel the transaction, and the voting process begins.
During voting, each participant informs the coordinator of its decision: agree (the participant's local work executed successfully) or cancel (the local work failed).

2. Commit phase
In this phase, the coordinator makes a decision based on the voting results of the first phase: commit or cancel.
The coordinator will notify all participants to commit the transaction if and only if all participants agree to commit the transaction, otherwise the coordinator will notify all participants to cancel the transaction.
After receiving the message from the coordinator, the participant will perform the corresponding operation.
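
From a single participant's point of view, the two phases amount to a small state machine. The sketch below is one possible encoding; the state and event names are illustrative assumptions rather than part of any standard.

```python
# Allowed transitions for a participant; the state names are illustrative.
TRANSITIONS = {
    ("init", "vote_agree"): "prepared",    # local work succeeded
    ("init", "vote_cancel"): "aborted",    # local work failed
    ("prepared", "commit"): "committed",   # coordinator decided to commit
    ("prepared", "cancel"): "aborted",     # coordinator decided to cancel
    ("aborted", "cancel"): "aborted",      # cancel notice after a "cancel" vote
}


def step(state, event):
    """Return the next state, or raise if the event is illegal in this state."""
    try:
        return TRANSITIONS[(state, event)]
    except KeyError:
        raise ValueError(f"event {event!r} not allowed in state {state!r}") from None


state = step("init", "vote_agree")  # phase 1
state = step(state, "commit")       # phase 2
print(state)  # committed
```

Note that there is no transition out of "prepared" other than a message from the coordinator, which is the root of the blocking behavior that 2PC is known for.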

(3) Disadvantages of two-phase commit

1. Synchronous blocking. During execution, all participating nodes block on the transaction.
While a participant holds a shared resource, any other node that wants to access that resource has to wait.
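
The blocking can be pictured with an ordinary lock that is taken in the first phase and released only when the second phase completes. `LockingParticipant` and its methods are hypothetical, and a single `threading.Lock` stands in for whatever row or table locks a real database would hold.

```python
import threading


class LockingParticipant:
    """Hypothetical participant guarding one shared resource with a lock."""

    def __init__(self):
        self._lock = threading.Lock()

    def prepare(self):
        # The lock is taken in phase 1 and kept until the decision arrives.
        return self._lock.acquire(blocking=False)

    def finish(self):
        # Called when the commit or cancel decision is applied in phase 2.
        self._lock.release()

    def try_other_access(self):
        # Another transaction that wants the same resource.
        got = self._lock.acquire(blocking=False)
        if got:
            self._lock.release()
        return got


p = LockingParticipant()
p.prepare()
during = p.try_other_access()  # resource is held between the two phases
p.finish()
after = p.try_other_access()   # released once the decision is applied
print(during, after)  # False True
```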

2. Single point of failure. The coordinator is critical: once it fails, participants can remain blocked indefinitely. This is especially serious in the second phase: if the coordinator fails, all participants are still holding their locks on transaction resources and cannot complete the transaction. (If the coordinator goes down, a new coordinator can be elected, but that does not unblock participants that were already blocked by the original coordinator's failure.)
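
A short sketch of why a timeout alone does not help a participant that has already voted "agree". The `inbox` queue stands in for the network channel from the coordinator, and `wait_for_decision` and the "still-prepared" marker are hypothetical.

```python
import queue


def wait_for_decision(inbox, timeout_s):
    """Prepared participant waiting for the coordinator's phase-2 message."""
    try:
        return inbox.get(timeout=timeout_s)
    except queue.Empty:
        # Having voted "agree", the participant may not commit or cancel by
        # itself: other participants might have been told the opposite.
        return "still-prepared"


inbox = queue.Queue()  # the coordinator crashed, so nothing ever arrives
outcome = wait_for_decision(inbox, timeout_s=0.05)
print(outcome)  # still-prepared
```

All the participant can do after the timeout is keep its locks and wait again, or ask someone else for the outcome.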

3. Data inconsistency. In the second phase, while the coordinator is sending commit requests to the participants, a local network failure may occur or the coordinator may fail partway through, so that only some of the participants receive the commit request.
Those participants execute the commit after receiving the request, while the others, which never received it, cannot commit the transaction. As a result, the data in the distributed system becomes inconsistent.
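
The partial-delivery scenario can be simulated directly. In this sketch the participant names `p1` to `p3` and the `reachable` set are assumptions; `reachable` models which participants the commit message got to before the failure.

```python
def send_commits(states, reachable):
    """Phase 2 in which the commit message reaches only some participants."""
    for name in states:
        if name in reachable:
            states[name] = "committed"
    return states


# All three voted "agree" in phase 1, so all start out prepared.
states = {"p1": "prepared", "p2": "prepared", "p3": "prepared"}
send_commits(states, reachable={"p1"})  # failure after the first message

# Consistent would mean every participant ended in the same state.
consistent = len(set(states.values())) == 1
print(consistent)  # False
```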

(4) A problem that two-phase commit cannot solve

When the coordinator and a participant fail together, two-phase commit cannot guarantee the integrity of the transaction.
Suppose the coordinator goes down right after sending a commit message, and the only participant that received this message goes down at the same time.
Then, even if a new coordinator is chosen through an election protocol, the status of the transaction is uncertain: no one knows whether the transaction has been committed.
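
The uncertainty can be expressed as a small decision function for a newly elected coordinator that can only see the states of the surviving participants. This is an illustrative sketch; `recover_decision` and the state strings are hypothetical.

```python
def recover_decision(surviving_states):
    """What a newly elected coordinator can infer from reachable participants."""
    if "committed" in surviving_states:
        return "commit"   # someone saw the commit decision
    if "aborted" in surviving_states:
        return "abort"    # someone voted cancel or saw the cancel decision
    return "unknown"      # everyone reachable is merely prepared


# The crashed participant was the only one to receive "commit"; it is not
# among the survivors, so its state is invisible to the new coordinator.
print(recover_decision(["prepared", "prepared"]))   # unknown
print(recover_decision(["prepared", "committed"]))  # commit
```

In the "unknown" case, neither committing nor cancelling is safe, because the crashed participant may or may not have applied the commit before it went down.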


Origin blog.51cto.com/15082402/2644336