This article is shared from the Huawei Cloud Community article "MySQL full-text index source code analysis: Insert statement execution process", author: GaussDB database.
1. Background introduction
Full-text indexing is a common technique in information retrieval. It solves the full-text search problem: given a word, find the documents that contain it. For example, when you enter a keyword in a browser, the search engine has to find all related documents and sort them by relevance.
The underlying implementation of a full-text index is an inverted index. An inverted index describes the mapping between words and documents, expressed in the form (word, (the document where the word is located, the offset of the word in the document)). The following example shows how a full-text index is organized:
mysql> CREATE TABLE opening_lines (
         id INT UNSIGNED AUTO_INCREMENT NOT NULL PRIMARY KEY,
         opening_line TEXT(500),
         author VARCHAR(200),
         title VARCHAR(200),
         FULLTEXT idx (opening_line)
       ) ENGINE=InnoDB;

mysql> INSERT INTO opening_lines(opening_line,author,title) VALUES
         ('Call me Ishmael.','Herman Melville','Moby-Dick'),
         ('A screaming comes across the sky.','Thomas Pynchon','Gravity\'s Rainbow'),
         ('I am an invisible man.','Ralph Ellison','Invisible Man'),
         ('Where now? Who now? When now?','Samuel Beckett','The Unnamable');

mysql> SET GLOBAL innodb_ft_aux_table='test/opening_lines';

mysql> select * from information_schema.INNODB_FT_INDEX_TABLE;
+-----------+--------------+-------------+-----------+--------+----------+
| WORD      | FIRST_DOC_ID | LAST_DOC_ID | DOC_COUNT | DOC_ID | POSITION |
+-----------+--------------+-------------+-----------+--------+----------+
| across    |            4 |           4 |         1 |      4 |       18 |
| call      |            3 |           3 |         1 |      3 |        0 |
| comes     |            4 |           4 |         1 |      4 |       12 |
| invisible |            5 |           5 |         1 |      5 |        8 |
| ishmael   |            3 |           3 |         1 |      3 |        8 |
| man       |            5 |           5 |         1 |      5 |       18 |
| now       |            6 |           6 |         1 |      6 |        6 |
| now       |            6 |           6 |         1 |      6 |        9 |
| now       |            6 |           6 |         1 |      6 |       10 |
| screaming |            4 |           4 |         1 |      4 |        2 |
| sky       |            4 |           4 |         1 |      4 |       29 |
+-----------+--------------+-------------+-----------+--------+----------+
As shown above, a table is created with a full-text index on the opening_line column. Take the insertion of 'Call me Ishmael.' as an example. 'Call me Ishmael.' is a document whose ID is 3. When the full-text index is built, the document is split into three words: 'call', 'me', and 'ishmael'. 'me' is discarded because it is shorter than the minimum token length (for InnoDB this is innodb_ft_min_token_size, default 3; ft_min_word_len, default 4, applies to MyISAM). In the end, only 'call' and 'ishmael' are recorded in the full-text index. 'call' starts at the first character of the document, so its offset is 0. 'ishmael' starts at the ninth character, so its offset is 8, which matches the POSITION column above.
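The word segmentation described above can be sketched in a few lines of C++. This is only an illustration under simplifying assumptions: the names Posting, InvertedIndex, and add_document are hypothetical, tokens are split on non-alphanumeric ASCII characters, and stopword filtering and character-set handling (which the real InnoDB tokenizer performs) are left out.

```cpp
#include <cctype>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// One posting: the document a word occurs in and its byte offset there.
struct Posting {
  unsigned long doc_id;
  std::size_t position;
};

// word -> postings, ordered by word (hypothetical stand-in for the index).
using InvertedIndex = std::map<std::string, std::vector<Posting>>;

// Split `text` on non-alphanumeric characters, lowercase each token, and
// record every token of at least `min_token_size` characters together with
// its starting offset. Stopword filtering is omitted in this sketch.
void add_document(InvertedIndex &index, unsigned long doc_id,
                  const std::string &text, std::size_t min_token_size) {
  std::size_t i = 0;
  while (i < text.size()) {
    // Skip separators between words.
    while (i < text.size() &&
           !std::isalnum(static_cast<unsigned char>(text[i]))) {
      ++i;
    }
    const std::size_t start = i;
    std::string word;
    while (i < text.size() &&
           std::isalnum(static_cast<unsigned char>(text[i]))) {
      word.push_back(static_cast<char>(
          std::tolower(static_cast<unsigned char>(text[i]))));
      ++i;
    }
    // Empty tokens (end of text) and short tokens such as "me" are dropped.
    if (word.size() >= min_token_size) {
      index[word].push_back(Posting{doc_id, start});
    }
  }
}
```

Calling add_document(index, 3, "Call me Ishmael.", 3) leaves two entries in the index, 'call' at offset 0 and 'ishmael' at offset 8, and drops 'me'.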
For a more detailed introduction to the functions of the full-text index, please refer to the MySQL 8.0 Reference Manual. This article will briefly analyze the execution process of the Insert statement from the source code level.
2. Full-text index Cache
What the full-text index table records is {word, {document ID, occurrence position}}. To insert a document, it must be segmented into multiple {word, {document ID, occurrence position}} structures. If every segmentation result were flushed to disk immediately, performance would be very poor.
To alleviate this problem, InnoDB introduces a full-text index cache, which works similarly to the Change Buffer. Each time a document is inserted, the word segmentation results are first stored in the cache, and they are flushed to disk in batches when the cache is full, which avoids frequent disk writes. InnoDB defines the fts_cache_t structure to manage the cache, as shown in the following figure:
Each table maintains one cache: for every table that has a full-text index, an fts_cache_t object is created in memory. Note that fts_cache_t is a table-level cache. If multiple full-text indexes are created on a table, there is still only one corresponding fts_cache_t object in memory. Some important members of fts_cache_t are as follows:
- optimize_lock, deleted_lock, doc_id_lock: mutexes used for concurrent operations.
- deleted_doc_ids: a vector that stores the doc_ids of deleted documents.
- indexes: a vector in which each element represents one full-text index. Each time a full-text index is created, an element is added to the vector. The word segmentation results of each index are stored in a red-black tree whose key is the word and whose value is the doc_id and the offset of the word.
- total_size: the total memory allocated by the cache, including the memory used by its substructures.
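To make the layout of these members easier to picture, here is a minimal C++ sketch of a table-level cache. It is not the real fts_cache_t definition: TableCache, IndexCache, add_index, and add_word are hypothetical names, locking is omitted, and the size accounting is a rough assumption. std::map is used because it is an ordered tree (typically a red-black tree).

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <vector>

struct Posting {
  unsigned long doc_id;
  std::size_t position;
};

// Per-index part of the cache: word -> postings, ordered by word.
struct IndexCache {
  std::string index_name;
  std::map<std::string, std::vector<Posting>> words;
};

// Hypothetical stand-in for the table-level cache: one object per table,
// one IndexCache element per full-text index on that table.
struct TableCache {
  std::vector<IndexCache> indexes;
  std::vector<unsigned long> deleted_doc_ids;
  std::size_t total_size = 0;  // rough number of bytes held by the cache

  // Register a new full-text index and return its slot in `indexes`.
  std::size_t add_index(const std::string &name) {
    indexes.push_back(IndexCache{name, {}});
    return indexes.size() - 1;
  }

  // Add one posting for `word` to the given index and update total_size.
  void add_word(std::size_t index_slot, const std::string &word, Posting p) {
    IndexCache &ic = indexes.at(index_slot);
    auto it = ic.words.find(word);
    if (it == ic.words.end()) {
      it = ic.words.emplace(word, std::vector<Posting>()).first;
      total_size += word.size();
    }
    it->second.push_back(p);
    total_size += sizeof(Posting);
  }
};
```

Two full-text indexes on the same table become two elements of indexes inside a single TableCache, which mirrors the table-level design described above.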
3. Insert statement execution process
Taking the MySQL 8.0.22 source code as an example, the execution of an Insert statement is divided into three stages: the row record writing stage, the transaction commit stage, and the flush stage, in which the dirty cache contents are written to disk.
3.1 Writing row record phase
The main workflow for writing row records is shown in the figure below:
As shown in the figure above, the most important work in this stage is to generate the doc_id, write it into the InnoDB row record, and cache the doc_id so that the text content can be fetched by doc_id during the transaction commit phase. The function call stack is as follows:
ha_innobase::write_row
  ->row_insert_for_mysql
  ->row_insert_for_mysql_using_ins_graph
  ->row_mysql_convert_row_to_innobase
  ->fts_create_doc_id
  ->fts_get_next_doc_id
  ->fts_trx_add_op
  ->fts_trx_table_add_op
fts_get_next_doc_id and fts_trx_table_add_op are two important functions. fts_get_next_doc_id obtains the doc_id. InnoDB row records contain some hidden columns, such as row_id and trx_id. If a full-text index is created, a hidden column FTS_DOC_ID is also added to the row records, and its value is obtained in fts_get_next_doc_id.
fts_trx_add_op adds the full-text index operation to the transaction (trx), where it is processed further when the transaction commits.
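The division of labor in this phase can be sketched as follows. This is a simplified model, not InnoDB code: DocIdAllocator, Transaction, PendingOp, and write_row are hypothetical names, the counter simply starts at 1, and persistence of the counter and locking are ignored.

```cpp
#include <cstdint>
#include <vector>

using doc_id_t = std::uint64_t;

// Hypothetical per-table counter standing in for the next doc ID.
struct DocIdAllocator {
  doc_id_t next_doc_id = 1;
  doc_id_t allocate() { return next_doc_id++; }
};

enum class FtsOp { Insert, Delete };

struct PendingOp {
  doc_id_t doc_id;
  FtsOp op;
};

// Hypothetical transaction object: full-text work is only recorded here.
// Nothing is tokenized until the transaction commits.
struct Transaction {
  std::vector<PendingOp> fts_ops;
};

// Model of the row-write phase: allocate the value of the hidden
// FTS_DOC_ID column and remember the operation for commit time.
doc_id_t write_row(DocIdAllocator &alloc, Transaction &trx) {
  const doc_id_t id = alloc.allocate();
  trx.fts_ops.push_back(PendingOp{id, FtsOp::Insert});
  return id;
}
```

After two calls to write_row on the same transaction, trx.fts_ops holds two pending insert operations and no word has been indexed yet.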
3.2 Transaction commit phase
The main workflow of the transaction commit phase is shown in the figure below:
This stage is the most important step of the entire FTS insertion. Segmenting the document into words, obtaining {word, {document ID, occurrence position}}, and inserting the results into the cache are all done at this stage. Its function call stack is as follows:
fts_commit_table
  ->fts_add
  ->fts_add_doc_by_id
  // Get the document based on doc_id and segment the document into words
  ->fts_fetch_doc_from_rec
  // Add the word segmentation results to the cache
  ->fts_cache_add_doc
  ->fts_optimize_request_sync_table
  // Create an FTS_MSG_SYNC_TABLE message to notify the flush thread to flush
  ->fts_optimize_create_msg(FTS_MSG_SYNC_TABLE)
Among them, fts_add_doc_by_id is a key function. This function mainly accomplishes the following things:
1) Find the row record based on doc_id and obtain the corresponding document;
2) Segment the document into words, obtain the {word, (the document where the word is located, the offset of the word in the document)} pairs, and add them to the cache;
3) Check whether cache->total_size has reached the threshold. If it has, add an FTS_MSG_SYNC_TABLE message to the message queue of the flush thread to notify the thread to flush (fts_optimize_create_msg). The specific code is as follows:
To facilitate understanding, I have omitted the exception handling part of the code and some common parts of finding records, and given brief comments:
static ulint fts_add_doc_by_id(fts_trx_table_t *ftt, doc_id_t doc_id) {
  /* 1. Search the fts_doc_id_index index for the record with this doc_id. */
  /* btr_pcur_open_with_no_init calls btr_cur_search_to_nth_level, which
     performs the B+ tree search: it first descends from the root node to the
     leaf node that holds the doc_id record, then finds the record by binary
     search. */
  btr_pcur_open_with_no_init(fts_id_index, tuple, PAGE_CUR_LE,
                             BTR_SEARCH_LEAF, &pcur, 0, &mtr);

  if (btr_pcur_get_low_match(&pcur) == 1) { /* The doc_id record was found */
    if (is_id_cluster) {
      /** 1.1 If fts_doc_id_index is the clustered index, the row record has
          already been found, so it is saved directly **/
      doc_pcur = &pcur;
    } else {
      /** 1.2 If fts_doc_id_index is a secondary index, search the clustered
          index for the row record using the primary key found in step 1,
          and save the row record once it is found **/
      btr_pcur_open_with_no_init(clust_index, clust_ref, PAGE_CUR_LE,
                                 BTR_SEARCH_LEAF, &clust_pcur, 0, &mtr);
      doc_pcur = &clust_pcur;
    }

    // Traverse cache->get_docs
    for (ulint i = 0; i < num_idx; ++i) {
      /***** 2. Segment the document into words, obtain the {word, (the
             document where the word is located, the offset of the word in
             the document)} pairs, and add them to the cache *****/
      fts_doc_t doc;
      fts_doc_init(&doc);

      /** 2.1 Fetch the content of the full-text indexed column from the row
          record for this doc_id and parse it, mainly to build the tokens of
          the fts_doc_t structure. tokens is a red-black tree in which each
          element is a {word, [positions where the word appears in the
          document]} structure. The parsing result is stored in doc **/
      fts_fetch_doc_from_rec(ftt->fts_trx->trx, get_doc, clust_index,
                             doc_pcur, offsets, &doc);

      /** 2.2 Add the {word, [positions where the word appears in the
          document]} entries obtained in step 2.1 to index_cache **/
      fts_cache_add_doc(table->fts->cache, get_doc->index_cache, doc_id,
                        doc.tokens);

      /***** 3. Check whether cache->total_size has reached the threshold.
             If it has, add an FTS_MSG_SYNC_TABLE message to the message
             queue of the flush thread to notify the thread to flush *****/
      bool need_sync = false;
      if ((cache->total_size - cache->total_size_before_sync >
               fts_max_cache_size / 10 ||
           fts_need_sync) &&
          !cache->sync->in_progress) {
        /** 3.1 The threshold has been reached **/
        need_sync = true;
        cache->total_size_before_sync = cache->total_size;
      }

      if (need_sync) {
        /** 3.2 Package an FTS_MSG_SYNC_TABLE message, append it to the
            fts_optimize_wq queue, and notify the fts_optimize_thread thread
            to flush. The content of the message is the table id **/
        fts_optimize_request_sync_table(table);
      }
    }
  }
}
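The threshold check in step 3 is easy to isolate and experiment with. The following sketch reproduces only that condition. CacheSizeState and should_request_sync are hypothetical names, and max_cache_size and force_sync stand in for fts_max_cache_size and fts_need_sync from the listing.

```cpp
#include <cstddef>

// Hypothetical, trimmed-down view of the cache fields used by the check.
struct CacheSizeState {
  std::size_t total_size = 0;
  std::size_t total_size_before_sync = 0;
  bool sync_in_progress = false;
};

// Returns true when a sync request should be queued. The condition follows
// the listing: the cache has grown by more than one tenth of the maximum
// cache size since the last request (or a sync is forced), and no sync is
// currently running. On success the current size is remembered so that the
// next request is only made after further growth.
bool should_request_sync(CacheSizeState &cache, std::size_t max_cache_size,
                         bool force_sync) {
  const std::size_t grown = cache.total_size - cache.total_size_before_sync;
  if ((grown > max_cache_size / 10 || force_sync) &&
      !cache.sync_in_progress) {
    cache.total_size_before_sync = cache.total_size;
    return true;
  }
  return false;
}
```

Note that the function only decides whether to send a request. Nothing in it stops further inserts from adding to total_size while the request waits in the queue.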
With this process in mind, we can explain the special behavior of full-text index transaction handling described in the official documentation (see the InnoDB Full-Text Index Transaction Handling section of the MySQL 8.0 Reference Manual). If you insert rows into a table that has a full-text index and the current transaction has not yet committed, a full-text search in that same transaction cannot find the inserted rows. The reason is that the full-text index is updated when the transaction commits. Before the commit, fts_add_doc_by_id has not been executed, so the rows cannot be found through the full-text index. However, as Section 3.1 shows, the InnoDB row records have already been inserted at this point. A query that does not go through the full-text index, such as SELECT COUNT(*) FROM opening_lines, does find them.
3.3 Flush phase
The main workflow of the flush phase, in which the cached data is written to disk, is shown in the figure below:
When InnoDB starts, it creates a background thread whose thread function is fts_optimize_thread and whose work queue is fts_optimize_wq. In the transaction commit phase of Section 3.2, when the cache is full, the fts_optimize_request_sync_table function adds an FTS_MSG_SYNC_TABLE message to the fts_optimize_wq queue. The background thread takes the message off the queue and flushes the cache to disk. Its function call stack is as follows:
fts_optimize_thread
  ->ib_wqueue_timedwait
  ->fts_optimize_sync_table
  ->fts_sync_table
  ->fts_sync
  ->fts_sync_commit
  ->fts_cache_clear
The main operations performed by this thread are as follows:
- Get a message from the fts_optimize_wq queue;
- Check the type of the message. If it is FTS_MSG_SYNC_TABLE, perform the flush;
- Flush the contents of the cache to the auxiliary tables on disk;
- Clear the cache and reset it to its initial state;
- Return to the first step and get the next message.
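The loop above can be modeled without any threading to show the order of operations. This is a sketch under stated assumptions: Message, Table, sync_table, and run_optimize_loop are hypothetical names, the "disk" is just a second in-memory map, and a Stop message replaces the shutdown handling of the real thread, which blocks on its queue (ib_wqueue_timedwait) instead of returning when the queue is empty.

```cpp
#include <cstddef>
#include <cstdint>
#include <deque>
#include <map>
#include <string>
#include <vector>

enum class MsgType { SyncTable, Stop };

struct Message {
  MsgType type;
  std::uint64_t table_id;
};

// Hypothetical table state: the in-memory cache and the auxiliary table.
struct Table {
  std::map<std::string, std::vector<std::uint64_t>> cache;      // word -> doc ids
  std::map<std::string, std::vector<std::uint64_t>> aux_table;  // "on disk"
};

// Flush the cache into the auxiliary table, then reset the cache.
void sync_table(Table &t) {
  for (const auto &entry : t.cache) {
    std::vector<std::uint64_t> &dst = t.aux_table[entry.first];
    dst.insert(dst.end(), entry.second.begin(), entry.second.end());
  }
  t.cache.clear();
}

// Take messages off the queue one by one until it is empty or a Stop
// message is seen. Returns the number of tables that were synced.
std::size_t run_optimize_loop(std::deque<Message> &queue,
                              std::map<std::uint64_t, Table> &tables) {
  std::size_t synced = 0;
  while (!queue.empty()) {
    const Message msg = queue.front();
    queue.pop_front();
    if (msg.type == MsgType::Stop) break;
    auto it = tables.find(msg.table_id);
    if (it != tables.end()) {
      sync_table(it->second);
      ++synced;
    }
  }
  return synced;
}
```

Because the message carries only the table id, the consumer looks up the table and flushes whatever its cache holds at that moment, not what it held when the message was queued.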
As described in Section 3.2, when a transaction commits and the total_size of the FTS cache exceeds the configured memory threshold, an FTS_MSG_SYNC_TABLE message is created and added to the fts_optimize_wq queue. The flush thread processes the message, flushes the data in the FTS cache to disk, and then clears the cache.
It is worth noting that when the total_size of the FTS cache exceeds the threshold, only a message is written to the fts_optimize_wq queue. Until the background flush thread processes that message, the FTS cache can still accept new data and its memory keeps growing. This is the root cause of the OOM problem seen with concurrent inserts into full-text indexes. The fix is the patch for Bug #32831765 SERVER HITS OOM CONDITION WHEN LOADING TWO INNODB . Interested readers can look it up themselves.
OOM check link: https://bugs.mysql.com/bug.php?id=103523
If the MySQL process crashes before the flush thread has flushed the FTS cache of a table, the data in the cache is lost. After the restart, the first insert or select executed on the table restores the pre-crash cache contents in the fts_init_index function. The synced_doc_id that has already been persisted to disk is read from the config table, and the records in the table whose doc_id is larger than synced_doc_id are read, segmented into words, and restored to the cache. For the implementation, see the fts_doc_fetch_by_doc_id and fts_init_recover_doc functions.
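The recovery idea can be illustrated with a small model. This is a sketch only: recover_cache is a hypothetical function, the table is a map from doc_id to text, and words are split on whitespace, which is much simpler than the real tokenizer.

```cpp
#include <cstdint>
#include <map>
#include <set>
#include <sstream>
#include <string>

// rows: doc_id -> document text, as stored in the base table.
// synced_doc_id: the highest doc_id whose words are known to be on disk.
// Rebuilds the lost cache (word -> doc ids) by re-tokenizing every row
// whose doc_id is larger than synced_doc_id.
std::map<std::string, std::set<std::uint64_t>> recover_cache(
    const std::map<std::uint64_t, std::string> &rows,
    std::uint64_t synced_doc_id) {
  std::map<std::string, std::set<std::uint64_t>> cache;
  // upper_bound returns the first row with doc_id > synced_doc_id.
  for (auto it = rows.upper_bound(synced_doc_id); it != rows.end(); ++it) {
    std::istringstream words(it->second);
    std::string word;
    while (words >> word) {
      cache[word].insert(it->first);
    }
  }
  return cache;
}
```

Rows at or below synced_doc_id are skipped because their words are already in the auxiliary tables, so only the unsynced tail of the table has to be segmented again.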