Detailed explanation of the relationship between binlog timestamp and exec_time.
Author: Li Xichao, DBA of Jiangsu Commercial Bank, responsible for database and middleware operation, maintenance and construction. Good at MySQL, Python, Oracle, and loves cycling, technology research and sharing.
Produced by the ActionSky open source community. Original content may not be used without authorization; for reprinting, please contact the editor and credit the source.
This article is about 2,000 words and is expected to take 8 minutes to read.
Overview
Recently, while testing a system, we found a delay in master-slave synchronization and used the binlog to confirm the cause. After parsing the binlog with the mysqlbinlog command, I found that some of the information in it looked familiar but was not really clear to me.
Take the following binlog snippet as an example:
# at 449880
#240430 18:38:49 server id 345 end_log_pos 449967 CRC32 0xb3e8a02a GTID last_committed=13 sequence_number=14 rbr_only=yes original_committed_timestamp=1714473533138376 immediate_commit_timestamp=1714473539246294 transaction_length=446792
/*!50718 SET TRANSACTION ISOLATION LEVEL READ COMMITTED*//*!*/;
# original_commit_timestamp=1714473533138376 (2024-04-30 18:38:53.138376 CST)
# immediate_commit_timestamp=1714473539246294 (2024-04-30 18:38:59.246294 CST)
/*!80001 SET @@session.original_commit_timestamp=1714473533138376*//*!*/;
/*!80014 SET @@session.original_server_version=80027*//*!*/;
/*!80014 SET @@session.immediate_server_version=80027*//*!*/;
SET @@SESSION.GTID_NEXT= 'c0ac4587-6046-11ee-9fa7-001c42c92a7b:44'/*!*/;
# at 449967
#240430 18:38:16 server id 345 end_log_pos 450039 CRC32 0x0c7cb74e Query thread_id=16 exec_time=37 error_code=0
SET TIMESTAMP=1714473496/*!*/;
BEGIN
/*!*/;
/*!*/;
# at 450039
#240430 18:38:16 server id 345 end_log_pos 450098 CRC32 0xf9a84808 Table_map: `testdb`.`tb3` mapped to number 110
# at 450098
#240430 18:38:16 server id 345 end_log_pos 458309 CRC32 0xad84e9b0 Write_rows: table id 110
...
# at 896439
#240430 18:38:46 server id 345 end_log_pos 896498 CRC32 0x5cd7cd3b Table_map: `testdb`.`tb3` mapped to number 110
# at 896498
#240430 18:38:46 server id 345 end_log_pos 896540 CRC32 0x21b77031 Write_rows: table id 110 flags: STMT_END_F
...
### INSERT INTO `testdb`.`tb3`
### SET
### @1=131060 /* INT meta=0 nullable=0 is_null=0 */
### @2='c' /* VARSTRING(80) meta=80 nullable=1 is_null=0 */
# at 896540
#240430 18:38:49 server id 345 end_log_pos 896599 CRC32 0x6d6bf911 Table_map: `testdb`.`tb3` mapped to number 110
# at 896599
#240430 18:38:49 server id 345 end_log_pos 896641 CRC32 0xccd2fbb1 Write_rows: table id 110 flags: STMT_END_F
...
### INSERT INTO `testdb`.`tb3`
### SET
### @1=131061 /* INT meta=0 nullable=0 is_null=0 */
### @2='c' /* VARSTRING(80) meta=80 nullable=1 is_null=0 */
# at 896641
#240430 18:38:49 server id 345 end_log_pos 896672 CRC32 0xadb14b9d Xid = 85
From the binlog above, we can see (P1):
#240430 18:38:16 BEGIN was executed and the transaction started (for convenience, this time field is called timestamp below)
#240430 18:38:16 an INSERT on tb3 was executed
#240430 18:38:46 an INSERT on tb3 was executed
#240430 18:38:49 an INSERT on tb3 was executed
#240430 18:38:49 the COMMIT was executed
Additionally (P2):
original_commit_timestamp=2024-04-30 18:38:53
immediate_commit_timestamp=2024-04-30 18:38:59
exec_time=37
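The P1 and P2 fields can also be pulled out of mysqlbinlog text output mechanically. The following is a minimal sketch, assuming header lines shaped like the ones in the snippet above; the regular expressions and the `parse_header` helper are illustrative names and not part of any MySQL tool.

```python
import re
from datetime import datetime

# Matches event header lines such as:
# #240430 18:38:16 server id 345 end_log_pos 450039 CRC32 0x0c7cb74e Query thread_id=16 exec_time=37 error_code=0
HEADER_RE = re.compile(
    r"^#(?P<date>\d{6})\s+(?P<time>\d{1,2}:\d{2}:\d{2})\s+server id\s+(?P<server_id>\d+)"
    r"\s+end_log_pos\s+(?P<end_log_pos>\d+)(?:\s+CRC32\s+0x[0-9a-fA-F]+)?\s+(?P<event>\w+)"
)
EXEC_TIME_RE = re.compile(r"\bexec_time=(\d+)")


def parse_header(line):
    """Return the header fields of one event line, or None for other lines."""
    m = HEADER_RE.match(line)
    if m is None:
        return None
    # The header time is printed in the local time zone of the machine
    # that ran mysqlbinlog, so the result is a naive datetime.
    when = datetime.strptime(m["date"] + " " + m["time"], "%y%m%d %H:%M:%S")
    exec_m = EXEC_TIME_RE.search(line)
    return {
        "timestamp": when,
        "event": m["event"],
        "end_log_pos": int(m["end_log_pos"]),
        "exec_time": int(exec_m.group(1)) if exec_m else None,
    }


line = ("#240430 18:38:16 server id 345 end_log_pos 450039 CRC32 0x0c7cb74e "
        "Query thread_id=16 exec_time=37 error_code=0")
print(parse_header(line))
```

Only the Query header in the snippet carries exec_time, so the helper returns `None` for that field on the other events.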
The P2 information raises the following questions:
- Q1: What do the fields in P2 mean, and how are they calculated?
- Q2: What is the relationship between the fields in P2 and the timestamps seen in P1?
- Q3: How are the timestamps in P1 obtained, especially in a master-slave environment?
To answer them, this article uses test verification and source code analysis to trace where the timestamps of common events and exec_time in the binlog come from, and summarizes the relationships between the fields.
The analysis is based on MySQL 8.0; the fields may differ in other versions.
Master node binlog
1. GTID Event
timestamp
On the master node, unless otherwise noted, an event's timestamp is the latest timestamp (thd->start_time) obtained at the start of each statement's execution in dispatch_command(). When the Event object is created, thd->start_time is assigned to Log_event::common_header->when.
The main stack information is as follows:
|-handle_connection (./sql/conn_handler/connection_handler_per_thread.cc:302)
|-do_command (./sql/sql_parse.cc:1343)
|-dispatch_command (./sql/sql_parse.cc:1922)
// set thd->start_time
|-thd->set_time()
|-my_micro_time_to_timeval(start_utime, &start_time)
|-dispatch_sql_command (./sql/sql_parse.cc:5135)
|-mysql_execute_command (./sql/sql_parse.cc:3518)
|-Sql_cmd_dml::execute (./sql/sql_select.cc:579)
……
|-Table_map_log_event the_event(this, table, table->s->table_map_id,is_transactional)
……
|-Rows_log_event *const ev = new RowsEventT(this, table, table->s->table_map_id, )
……
|-Xid_log_event end_evt(thd, xid)
immediate_commit_timestamp/original_commit_timestamp
immediate_commit_timestamp is the timestamp obtained at commit time, and on the master node original_commit_timestamp is equal to immediate_commit_timestamp.
|-error = trx_cache.flush(thd, &trx_bytes, wrote_xid)
|-Transaction_ctx *trn_ctx = thd->get_transaction()
|-trn_ctx->sequence_number = mysql_bin_log.m_dependency_tracker.step()
|-if (trn_ctx->last_committed == SEQ_UNINIT): trn_ctx->last_committed = trn_ctx->sequence_number - 1
|-if (!error): if ((error = mysql_bin_log.write_transaction(thd, this, &writer)))
|-int64 sequence_number, last_committed
|-m_dependency_tracker.get_dependency(thd, sequence_number, last_committed)
|-thd->get_transaction()->last_committed = SEQ_UNINIT
|-ulonglong immediate_commit_timestamp = my_micro_time()
//|-ulonglong original_commit_timestamp = thd->variables.original_commit_timestamp
|-ulonglong original_commit_timestamp = immediate_commit_timestamp
|-uint32_t trx_immediate_server_version = do_server_version_int(::server_version)
|-Gtid_log_event gtid_event(thd, cache_data->is_trx_cache(), last_committed, sequence_number,
cache_data->may_have_sbr_stmts(), original_commit_timestamp,
immediate_commit_timestamp, trx_original_server_version,
trx_immediate_server_version)
2. BEGIN Event
timestamp
Note: on the master node, the timestamp of the BEGIN event is not the time at which BEGIN was executed, but that of the first modification operation. After the first row has been modified in the InnoDB layer, the Table_map event is generated and written. Before the Table_map event is generated, if the binlog cache of the whole transaction is still empty, the server immediately takes that operation's thd->start_time and generates the actual BEGIN event.
exec_time
On the master node, exec_time is likewise calculated while the BEGIN event is being generated: it is the latest timestamp obtained at that moment minus the timestamp of the BEGIN event.
exec_time = A - B
- A: The time at which the BEGIN event is generated, that is, after the first modifying SQL statement has completed its first row modification (write/update/delete).
- B: The start time of the first modifying SQL statement (thd->start_time).
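As a rough illustration of the formula above, the sketch below models how the two fields of the BEGIN event would be derived on the master. It is a toy model, not MySQL code: the function name and the sample times are hypothetical.

```python
def master_begin_event(stmt_start, begin_generated):
    """Toy model of the BEGIN event fields on the master.

    stmt_start      -- B: start time of the first modifying statement
                       (thd->start_time), in seconds
    begin_generated -- A: time at which the BEGIN event is generated,
                       in seconds
    """
    if begin_generated < stmt_start:
        raise ValueError("the BEGIN event cannot be generated before the statement starts")
    return {"timestamp": stmt_start, "exec_time": begin_generated - stmt_start}


# Hypothetical numbers: the first INSERT starts at t=100 s and its first row
# is modified (so the BEGIN event is generated) at t=103 s.
event = master_begin_event(stmt_start=100, begin_generated=103)
print(event)  # {'timestamp': 100, 'exec_time': 3}
```

Under this model, exec_time on the master does not cover the whole transaction, only the interval up to the generation of the BEGIN event.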
[Figure: internal stack and execution sequence]
3. Table_map Event
4. Write Event
5. Xid Event
6. Summary of master node
- For the BEGIN event, timestamp is the start time of the first operation that needs to be written to the binlog (such as write/update/delete);
- For other events, timestamp is the start time of the SQL statement being executed;
- immediate_commit_timestamp/original_commit_timestamp is the timestamp at commit;
- exec_time = A - B
  - A: The time at which the BEGIN event is generated, that is, after the first modifying SQL statement has completed its first row modification (write/update/delete).
  - B: The start time of the first modifying SQL statement (thd->start_time).
Slave node binlog
1. GTID Event
timestamp
On the slave node, MySQL does not take over the timestamp of the master's GTID/XID event when it parses the event. Instead, the event "inherits" the timestamp of the transaction's preceding operation. The timestamps of all modification operations on the slave node come from the time at which the master node performed them, so the GTID/XID events on the slave node carry the timestamp of the master node's last modification operation.
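The inheritance rule can be checked against the snippet at the beginning of the article. The sketch below models only the rule as described in this section; the helper name is hypothetical.

```python
from datetime import datetime


def slave_gtid_xid_timestamp(master_modification_times):
    """Return the timestamp a slave would give the GTID/XID events:
    the timestamp of the transaction's last modification on the master."""
    if not master_modification_times:
        raise ValueError("transaction has no modification events")
    return master_modification_times[-1]


# Timestamps of the Write_rows events shown in the example snippet.
writes = [
    datetime(2024, 4, 30, 18, 38, 16),
    datetime(2024, 4, 30, 18, 38, 46),
    datetime(2024, 4, 30, 18, 38, 49),
]
print(slave_gtid_xid_timestamp(writes))  # 2024-04-30 18:38:49
```

The result matches the 18:38:49 shown on both the GTID and Xid headers in that snippet.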
immediate_commit_timestamp/original_commit_timestamp
immediate_commit_timestamp is the timestamp of the commit on the slave node. original_commit_timestamp is taken from the original_commit_timestamp of the master's GTID event, that is, the timestamp at which the master node committed the operation.
The main stack information is as follows:
|-handle_slave_worker (./sql/rpl_replica.cc:5891)
|-slave_worker_exec_job_group (./sql/rpl_rli_pdb.cc:2549)
|-Slave_worker::slave_worker_exec_event (./sql/rpl_rli_pdb.cc:1760)
|-Xid_apply_log_event::do_apply_event_worker (./sql/log_event.cc:6179)
|-Xid_log_event::do_commit (./sql/log_event.cc:6084)
|-trans_commit (./sql/transaction.cc:246)
|-ha_commit_trans (./sql/handler.cc:1765)
|-MYSQL_BIN_LOG::commit (./sql/binlog.cc:8170)
|-MYSQL_BIN_LOG::ordered_commit (./sql/binlog.cc:8789)
|-MYSQL_BIN_LOG::process_flush_stage_queue (./sql/binlog.cc:8326)
|-MYSQL_BIN_LOG::flush_thread_caches (./sql/binlog.cc:8218)
|-binlog_cache_mngr::flush (./sql/binlog.cc:1099)
|-binlog_cache_data::flush (./sql/binlog.cc:2098)
|-MYSQL_BIN_LOG::write_transaction (./sql/binlog.cc:1586)
// generate and write the GTID event
|-ulonglong immediate_commit_timestamp = my_micro_time()
|-if (original_commit_timestamp == UNDEFINED_COMMIT_TIMESTAMP){...}
|-Gtid_log_event gtid_event(thd, cache_data->is_trx_cache(), last_committed, sequence_number,
cache_data->may_have_sbr_stmts(), original_commit_timestamp, immediate_commit_timestamp, trx_original_server_version,
trx_immediate_server_version)
Formula
immediate_commit_timestamp - original_commit_timestamp = A + B + C
- A = The time it takes the master node to transfer the binlog to the slave node
- B = The time it takes the slave node to replay the binlog
- C = Synchronization delay/interruption time
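For the GTID event in the snippet at the beginning of the article, the left-hand side of this formula can be computed directly. A minimal sketch using only the two values printed in that snippet (both are microseconds since the epoch):

```python
from datetime import timedelta

original_commit_timestamp = 1714473533138376   # commit on the master
immediate_commit_timestamp = 1714473539246294  # commit on this server

lag = timedelta(microseconds=immediate_commit_timestamp - original_commit_timestamp)
print(lag.total_seconds())  # 6.107918
```

The difference alone does not say how the roughly 6.1 seconds are split between A, B and C.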
2. BEGIN Event
timestamp
Here the timestamp comes from the timestamp of the master node's BEGIN event. When the event is actually applied, the timestamp of the BEGIN event is read and assigned to thd->start_time/thd->user_time. When the slave node generates its own Event object, it simply continues to take the timestamp from thd->start_time.
exec_time
On the slave node, exec_time is still calculated as the latest timestamp obtained while generating the BEGIN event minus timestamp (note that this timestamp is the start time of the modifying SQL statement on the master node).
The main stack information is as follows:
|-handle_slave_worker (./sql/rpl_replica.cc:5891)
|-slave_worker_exec_job_group (./sql/rpl_rli_pdb.cc:2549)
|-Slave_worker::slave_worker_exec_event (./sql/rpl_rli_pdb.cc:1760)
|-Log_event::do_apply_event_worker (./sql/log_event.cc:1083)
|-Query_log_event::do_apply_event (./sql/log_event.cc:4443)
|-Query_log_event::do_apply_event (./sql/log_event.cc:4606)
// set user_time=start_time=ev.common_header->when
|-thd->set_time(&(common_header->when))
// query_arg="BEGIN"
|-thd->set_query(query_arg, q_len_arg)
...
Formula
exec_time = A + B + C + D
- A = The duration of the entire transaction on the master node
- B = The binlog transmission time
- C = Synchronization delay/interruption time (most likely the main contributor)
- D = The time it takes the slave node to complete the first row modification
In addition, original_commit_timestamp minus the timestamp of the BEGIN event gives the actual duration of the whole transaction on the master node (from [master: first modification] to [master: commit start]).
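The last relationship can be tried on the snippet at the beginning of the article. A minimal sketch, assuming the BEGIN timestamp is the value from `SET TIMESTAMP` (seconds) and the commit timestamp is the one from the GTID event (microseconds); the `CST` constant is defined here as UTC+8 to match the times printed by mysqlbinlog in that snippet.

```python
from datetime import datetime, timedelta, timezone

CST = timezone(timedelta(hours=8))
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)

begin_timestamp = 1714473496                  # seconds, from SET TIMESTAMP
original_commit_timestamp = 1714473533138376  # microseconds, from the GTID event

begin_at = EPOCH + timedelta(seconds=begin_timestamp)
commit_at = EPOCH + timedelta(microseconds=original_commit_timestamp)

print(begin_at.astimezone(CST))   # 2024-04-30 18:38:16+08:00
print(commit_at.astimezone(CST))  # 2024-04-30 18:38:53.138376+08:00
print((commit_at - begin_at).total_seconds())  # 37.138376
```

In this snippet the result, about 37 seconds, is the same as the reported exec_time=37.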
3. Table_map Event
4. Write Event
5. Xid Event
6. Summary of slave node
- Except for the GTID/XID events, the timestamps of all other events come from the events of the master node;
- The timestamp of the GTID/XID events is the start time of the master node's last modification operation;
- The original_commit_timestamp of the GTID event comes from the master node, and immediate_commit_timestamp is the latest timestamp on the slave node;
- exec_time = A - B
  - A = The latest timestamp at the moment the slave node generates the BEGIN event
  - B = The time at which the master node started executing the first DML operation
Conclusion
At this point, the timestamps and the exec_time information in the binlog have been largely sorted out. Interested readers can go back to the beginning of the article and check whether Q1-Q3 now have answers.
Finally, readers are encouraged to simulate a few cases themselves to gain a deeper understanding of these fields, which makes it easier to use the binlog when analyzing master-slave synchronization problems.
The above is shared for discussion only. The author's knowledge is limited; if you find any shortcomings, please feel free to point them out in the comment area.
For more technical articles, please visit: https://opensource.actionsky.com/