MySQL UPDATE execution flow: understanding the redo log in depth

Earlier we analyzed how a query is executed and explained the modules involved. A query generally passes through the connector, analyzer, optimizer, executor and other functional modules, and finally reaches the storage engine. Readers can review this in the earlier article on how a query is executed in MySQL.

This time we take a thorough look at how an update statement is executed in MySQL. By the end of this article you should fully understand what the redo log is.

Create a table structure

First, let's create a table with just a primary key ID and an int field c.

create table T(ID int primary key, c int);

Now we want to update a row with the following statement:

update T set c=c+1 where ID=2;

In fact, the update statement goes through much the same flow as a query, except that it additionally involves the redo log, the undo log and the binlog.

As we said when discussing query execution, whenever a table is updated, every query cache entry related to that table is invalidated, so this statement clears all cached results for table T. This is also why using the query cache is not recommended.

The ledger and the chalkboard

Suppose you are the owner of a small supermarket. Naturally you keep a ledger of transactions, and it may also hold credit records. In the village there is a girl named Xiaofang, who is beautiful and kind-hearted. Sometimes she comes by and, when she does not have enough money, buys on credit. You first write the entry on a small chalkboard, and in the dead of night you synchronize the chalkboard entries into the ledger for safekeeping. Of course, the chalkboard can also fill up; when it does, you reconcile the accounts and write them into the ledger.

So when someone wants to buy on credit or pay back a debt, there are usually two approaches:

  1. Take out the ledger right away and add or deduct the amount on the spot.
  2. Jot the entry down on the chalkboard first, and do the bookkeeping in the ledger after closing time.

When business is busy, we definitely choose the latter, because the former is too cumbersome. First you have to find that person's total credit record. Think about it: with dozens of densely written pages, the shopkeeper has to put on reading glasses and search slowly for the name, then take out the abacus to do the calculation, and finally write the result back into the ledger. Meanwhile Xiaofang, who only came to buy on credit, is kept waiting for a long time. How could you ever ask her out for a walk in the grove after that?

MySQL has the same problem. If every update had to be written to disk, and the disk had to locate the corresponding record before updating it, the I/O cost and the lookup cost of the whole process would be high. To solve this, MySQL's designers used an idea similar to the supermarket owner's chalkboard to improve update efficiency.

This cooperation between the chalkboard and the ledger is what MySQL often refers to as WAL technology. WAL stands for Write-Ahead Logging, and its key point is to write the log first and write to disk later, that is, write on the chalkboard first and update the ledger when things are less busy.
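To make the idea concrete, here is a minimal Python sketch of the chalkboard-and-ledger pattern. It is only a toy model under our own assumptions: the class `ChalkboardStore` and its methods are hypothetical names invented for illustration, and nothing here reflects how MySQL is actually implemented.

```python
class ChalkboardStore:
    """Toy write-ahead logging model (hypothetical, not MySQL's real design)."""

    def __init__(self):
        self.log = []      # the "chalkboard": cheap sequential appends
        self.memory = {}   # in-memory copy of the accounts
        self.disk = {}     # the "ledger": expensive to update in place

    def update(self, key, value):
        # Write the log entry first, then update memory.
        # At this point the update already counts as done.
        self.log.append((key, value))
        self.memory[key] = value

    def flush_when_idle(self):
        # Later, when things are quiet, copy the logged changes into the ledger
        # and wipe the chalkboard.
        for key, value in self.log:
            self.disk[key] = value
        self.log.clear()


store = ChalkboardStore()
store.update("Xiaofang", 30)
store.update("Xiaofang", 45)
pending = len(store.log)   # two entries are waiting on the chalkboard
store.flush_when_idle()    # the ledger now holds the latest amount
```

The point of the sketch is only the ordering: the ledger is never touched on the hot path, and the log is what makes it safe to postpone that work.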

redo log

First, we must be clear that the binlog belongs to the Server layer, while the redo log is unique to InnoDB.

When a record needs to be updated, the InnoDB engine first writes the record to the redo log (the chalkboard) and updates memory; at that point the update counts as complete. The engine then writes the change to disk at an appropriate time, usually when the system is relatively idle, just as the shopkeeper does the bookkeeping after closing.

Similarly, InnoDB's redo log has a fixed size. For example, it can be configured as a group of four files of 1 GB each, so this "chalkboard" can record 4 GB of operations in total. It is written from the beginning, and when the end is reached, writing wraps around to the beginning again, as shown in the following figure.

[Figure: the redo log as a ring of four files, with write pos and checkpoint]

write pos is the position of the current record. It moves forward as records are written, and after reaching the end of file No. 3 it returns to the beginning of file No. 0. checkpoint is the position currently being erased; it also moves forward and wraps around. Before a record is erased, it must be applied to the data files.

The space between write pos and checkpoint is the empty part of the "chalkboard" and can be used to record new operations. If write pos catches up with checkpoint, the "chalkboard" is full. No new updates can be executed at that point; MySQL has to stop, erase some records, and push checkpoint forward.
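The interplay of write pos and checkpoint can be sketched as a small ring buffer in Python. This is a simplified illustration, assuming one slot per record instead of bytes in files; the class `RedoRing` and the `apply_to_data_file` callback are hypothetical names of our own.

```python
class RedoRing:
    """Toy fixed-size circular log with a write position and a checkpoint."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.slots = [None] * capacity
        self.write_pos = 0    # next slot to write
        self.checkpoint = 0   # oldest record not yet applied to the data files
        self.used = 0         # number of records between checkpoint and write_pos

    def is_full(self):
        return self.used == self.capacity

    def advance_checkpoint(self, count, apply_to_data_file):
        # A record must be applied to the data files before its slot is erased.
        for _ in range(min(count, self.used)):
            apply_to_data_file(self.slots[self.checkpoint])
            self.slots[self.checkpoint] = None
            self.checkpoint = (self.checkpoint + 1) % self.capacity
            self.used -= 1

    def append(self, record, apply_to_data_file):
        # If write pos has caught up with checkpoint, stop and free a slot first.
        if self.is_full():
            self.advance_checkpoint(1, apply_to_data_file)
        self.slots[self.write_pos] = record
        self.write_pos = (self.write_pos + 1) % self.capacity
        self.used += 1


data_file = []
ring = RedoRing(capacity=4)
for n in range(6):
    ring.append(f"change-{n}", data_file.append)
# The ring wrapped around twice: data_file is now ["change-0", "change-1"]
# and both write_pos and checkpoint sit at slot 2.
```

Tracking `used` separately is just one way to tell a full ring from an empty one when the two positions are equal.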

With the redo log, InnoDB can guarantee that even if the database restarts abnormally, previously committed records are not lost. This capability is called crash-safe.

To understand crash-safe, consider the credit records again. As long as a credit entry has been written on the chalkboard or in the ledger, then even if the shopkeeper forgets it, for example after the shop suddenly closes for a few days, the credit accounts can still be worked out from the ledger and the chalkboard once business resumes.
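As a rough illustration of why the log makes recovery possible, the following Python sketch rebuilds the latest state from "ledger plus chalkboard". It assumes, purely for simplicity, that each log record is a (row ID, new value) pair; real redo records describe changes to data pages, and the function name `recover` is our own.

```python
def recover(disk_rows, redo_records):
    """Rebuild the latest committed state from the data files plus the redo log."""
    rows = dict(disk_rows)                # start from what reached the ledger
    for row_id, new_value in redo_records:
        rows[row_id] = new_value          # replay what was only on the chalkboard
    return rows


disk_rows = {1: 10, 2: 0}          # values that had already been written to disk
redo_records = [(2, 1), (1, 11)]   # committed changes that existed only in the log
recovered = recover(disk_rows, redo_records)
# recovered is {1: 11, 2: 1}: nothing that was logged has been lost
```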

binlog

Viewed as a whole, MySQL has two parts: the Server layer and the engine layer, which is responsible for storage. The redo log mentioned earlier belongs to the InnoDB engine, and the Server layer has its own log, called the binlog (archive log).

So why are there two logs?

Because MySQL did not have the InnoDB engine at the beginning. MySQL's own engine was MyISAM, but MyISAM has no crash-safe capability, and the binlog can only be used for archiving (the Server layer and the engine layer are two separate modules). InnoDB was introduced into MySQL by another company in the form of a plug-in. Since the binlog alone cannot provide crash-safe capability, InnoDB uses another logging system, the redo log, to achieve it.

If there were only the binlog, the power could fail after the Server layer has finished writing the binlog but before the engine layer has written the change to disk. After the restart, the binlog contains the update but the engine layer never persisted it, so a replica that syncs data from the binlog ends up inconsistent with the primary.

Differences between the redo log and the binlog

  1. The redo log is unique to the InnoDB engine; the binlog is implemented in the MySQL Server layer and can be used by all engines.
  2. The redo log is a physical log that records "what change was made on which data page"; the binlog is a logical log that records the original logic of the statement, such as "add 1 to field c of the row with ID=2".
  3. The redo log is written circularly, and its fixed space will be used up; the binlog is written by appending. "Appending" means that when a binlog file reaches a certain size, writing switches to the next file and earlier logs are not overwritten.

Update statement execution process

With a conceptual understanding of these two logs, we can look at the internal flow between the executor and the InnoDB engine when this update statement is executed.

  1. The executor in the Server layer first asks the engine for the row with ID=2. ID is the primary key, so the engine finds the row directly with a tree search. If the data page containing the row with ID=2 is already in memory, it is returned to the executor directly; otherwise it must first be read from disk into memory and then returned.
  2. The executor takes the row returned by the engine and adds 1 to the value, producing a new row, and then calls the engine interface to write this new row.
  3. The InnoDB engine updates the row in memory and records the update in the redo log, which is now in the prepare state. It then tells the executor that it is done and the transaction can be committed at any time.
  4. The executor generates the binlog for this operation and writes the binlog to disk.
  5. The executor calls the engine's commit-transaction interface, and the engine changes the redo log entry it just wrote to the commit state. The update is complete.

The last three steps look a bit roundabout: writing the redo log is split into two steps, prepare and commit. This is the "two-phase commit".

In the figure below, green marks the steps performed by the executor and white marks the steps performed by the InnoDB engine:

[Figure: execution flow of the update statement]
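The five steps and the prepare/commit split can also be traced in a short Python sketch. Everything in it is a hypothetical model for illustration: `ToyEngine`, `ToyExecutor`, the transaction ID and the dictionary-based log entries are our own inventions and do not correspond to MySQL's internal interfaces.

```python
class ToyEngine:
    """Stands in for InnoDB: holds rows in memory and a redo log."""

    def __init__(self, rows):
        self.rows = dict(rows)
        self.redo_log = []   # entries: {"txn", "row_id", "value", "state"}

    def read_row(self, row_id):
        return self.rows[row_id]

    def write_row(self, txn_id, row_id, value):
        # Update memory and record the change in the redo log in prepare state.
        self.rows[row_id] = value
        self.redo_log.append(
            {"txn": txn_id, "row_id": row_id, "value": value, "state": "prepare"}
        )

    def commit(self, txn_id):
        # Flip the entries of this transaction from prepare to commit.
        for entry in self.redo_log:
            if entry["txn"] == txn_id:
                entry["state"] = "commit"


class ToyExecutor:
    """Stands in for the Server-layer executor: owns the binlog."""

    def __init__(self, engine):
        self.engine = engine
        self.binlog = []

    def update_c_plus_one(self, txn_id, row_id):
        c = self.engine.read_row(row_id)                # step 1: fetch the row
        new_c = c + 1                                   # step 2: compute new value
        self.engine.write_row(txn_id, row_id, new_c)    # step 3: redo log, prepare
        statement = f"update T set c=c+1 where ID={row_id}"
        self.binlog.append((txn_id, statement))         # step 4: write the binlog
        self.engine.commit(txn_id)                      # step 5: redo log, commit
        return new_c


engine = ToyEngine({2: 0})
executor = ToyExecutor(engine)
result = executor.update_c_plus_one(txn_id=1, row_id=2)
```

After the call, the row holds 1, the binlog holds one statement, and the single redo log entry is in the commit state.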

Two-phase commit

Why is "two-phase commit" necessary? It keeps the two logs logically consistent. To illustrate, we start with a question: how can a database be restored to its state at any given second within the past two weeks?

As mentioned earlier, the binlog records all logical operations and is written by appending. If your DBA promises that any point within the past two weeks can be restored, then the backup system must keep all binlogs from the past two weeks, and the system must also take full backups on a regular basis. How "regular" depends on the importance of the system; it could be once a day or once a week.

When you need to restore to a specific second, for example when you discover at two in the afternoon that a table was mistakenly deleted at noon and need to get the data back, you can do this:

  1. First, find the most recent full backup. If you are lucky, it may be last night's backup. Restore this backup into a temporary database.
  2. Then, starting from the time of that backup, take the backed-up binlogs out in order and replay them up to the moment just before the table was mistakenly deleted at noon.
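The two recovery steps above can be sketched in Python as "copy the full backup, then replay binlog events in time order up to the target moment". This is a toy model under our own assumptions: timestamps are plain hour numbers, each binlog event is a (timestamp, function) pair, and the names `restore_to_point_in_time`, `set_c` and `drop_table` are hypothetical.

```python
def restore_to_point_in_time(full_backup, binlog_events, backup_time, target_time):
    """Restore a backup into a temporary database, then replay the binlog."""
    # Step 1: restore the full backup into a temporary database (a deep-enough copy).
    temp_db = {table: dict(rows) for table, rows in full_backup.items()}
    # Step 2: replay events after the backup and strictly before the target time.
    for timestamp, apply_event in sorted(binlog_events, key=lambda event: event[0]):
        if backup_time < timestamp < target_time:
            apply_event(temp_db)
    return temp_db


def set_c(value):
    # Build an event that sets c for the row with ID=2 in table T.
    def apply(db):
        db["T"][2] = value
    return apply


def drop_table(db):
    # The mistaken operation we want to stop just before.
    del db["T"]


full_backup = {"T": {2: 0}}                         # taken at hour 0 (last night)
binlog_events = [(9, set_c(1)), (12, drop_table)]   # an update at 9, the mistake at 12
restored = restore_to_point_in_time(
    full_backup, binlog_events, backup_time=0, target_time=12
)
# restored is {"T": {2: 1}}: the morning update is kept, the noon deletion is not
```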

Because the redo log and the binlog are two independent pieces of logic, without two-phase commit you would have to either write the redo log first and then the binlog, or the other way around. Let's see what goes wrong with each approach (both lead to inconsistent data).

Let's keep using the earlier update statement as the example. Suppose that in the row with ID=2 the current value of field c is 0, and suppose further that while the update statement is executing, a crash occurs after the first log has been written but before the second log has been written. What happens then?

Write the redo log first, then the binlog

Suppose MySQL restarts abnormally after the redo log has been written but before the binlog has been written. The data can still be recovered from the redo log, so after recovery the value of c in this row is 1, but the binlog has no record of this statement. A replica that syncs data through the binlog will therefore miss this update, and the loss of this transaction leaves its data inconsistent with the primary.

Write the binlog first, then the redo log

If a crash occurs after the binlog has been written, the redo log has not been written yet, so the transaction is invalid after crash recovery and c remains 0 in this row, yet the binlog contains the record. A replica built from the binlog will therefore have one extra transaction and be inconsistent with the primary.
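Both failure cases can be summarized in a few lines of Python. This is only a thought experiment in code, assuming the row with ID=2 starts at c=0, the statement is c=c+1, and the crash always lands between the two log writes; the function name `crash_after_first_log` is our own.

```python
def crash_after_first_log(order):
    """Return (c on the primary after recovery, c on a replica built from the binlog).

    `order` says which log was written before the crash; the other log never was.
    """
    redo_written = order == "redo_first"
    binlog_written = order == "binlog_first"
    primary_c = 1 if redo_written else 0     # the primary recovers from the redo log
    replica_c = 1 if binlog_written else 0   # the replica replays the binlog
    return primary_c, replica_c


redo_first = crash_after_first_log("redo_first")       # (1, 0): replica misses the update
binlog_first = crash_after_first_log("binlog_first")   # (0, 1): replica has an extra update
```

In both orderings the two values differ, which is exactly the inconsistency that writing the logs one after the other cannot avoid.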

Simply put, both the redo log and the binlog can represent the commit state of a transaction, and two-phase commit keeps these two states logically consistent. (Students, take note: this is the key point.)



Origin blog.csdn.net/qq_14855971/article/details/103636808