Big mistake! Is adding an index better than Explain for SQL optimization? Interviewer: Get out

1. Bragging

A few days ago, my boss asked me how to optimize SQL, and I answered: create a new index. Ha, the boss then went out to look for a stick. When he came back, he told me: you know where the door is. Will you go out by yourself, or shall I show you out?

So I was forced to go out and take a beating. When I came back, the boss told me to go and find out what Explain is, and then to hand in a 5,000-word self-criticism...

The rest of this article uses MySQL 8.0.

2. Basic content

If you want to optimize SQL, creating an index is indeed the right direction, but creating an index on every field you see really is asking for a beating. So let's first look at what creating an index actually does in MySQL.

InnoDB's index model is the B+ tree. In InnoDB, a table is stored as an index, ordered by its primary key; tables stored this way are called index-organized tables. Every index you create corresponds to one more B+ tree in InnoDB. Suppose we have a table whose primary key column is id, with the fields k and name, and an index on k:

create table T(
id int primary key, 
k int not null, 
name varchar(16),
index (k))engine=InnoDB;

Insert five rows into the table:

INSERT INTO T (id, k) VALUES (100, 1);
INSERT INTO T (id, k) VALUES (200, 2);
INSERT INTO T (id, k) VALUES (300, 3);
INSERT INTO T (id, k) VALUES (500, 5);
INSERT INTO T (id, k) VALUES (600, 6);

The two resulting trees are shown schematically below:

[Figure: the B+ tree of the primary key id and the B+ tree of index k for table T]

As the figure shows, every new index adds a new B+ tree. Indexes fall into two kinds, primary key indexes and non-primary key indexes:

  • Primary key index: the leaf nodes store the entire row. In InnoDB, the primary key index is also called the clustered index.

  • Non-primary key index: the leaf nodes store the value of the primary key. In InnoDB, non-primary key indexes are also called secondary indexes.

Then execute a query statement:

select * from T where id=500;

This statement only needs to search the B+ tree of id; the leaf node it reaches holds the whole row, which is returned directly.

The next query statement:

select * from T where k=5;

This statement first searches the B+ tree of index k and finds that the primary key id corresponding to k=5 is 500. It then searches the id index tree with this primary key to fetch the row and return the data. This trip back to the primary key tree is called going back to the table.

In other words, a query through a non-primary key index has to search one more index tree. Therefore, we should use primary key queries in our applications where we can.
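
Going back to the table can be mimicked by hand as two separate queries. This is only an illustration of what InnoDB does internally within one statement, assuming table T and the five rows inserted above:

```sql
-- Step 1: search the secondary index k; its leaf entries hold (k, id).
SELECT id FROM T WHERE k = 5;      -- finds id = 500

-- Step 2: search the primary key tree with that id to fetch the full row.
SELECT * FROM T WHERE id = 500;
```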

Of course, in real development queries by primary key are relatively rare, because the primary key usually has no business meaning. So look at the following statement:

select id from T where k=5;

The biggest difference from the previous statement is that the previous one queries all fields, while this one queries only id. The leaf nodes of the index tree on k already store the id value, so the result can be returned directly, without going back to the id index tree. This is called a covering index.

  • Since a covering index can reduce the number of tree searches and significantly improve query performance, using a covering index is a common performance optimization method.
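
A quick way to see the difference is to compare the plans of the two statements. This is a sketch assuming table T from above; the exact plan can vary with data volume and MySQL version:

```sql
-- Needs the name column, which index k does not store:
-- MySQL has to go back to the primary key tree for every match.
EXPLAIN SELECT * FROM T WHERE k = 5;

-- Needs only id, which is stored in the leaves of index k.
-- The Extra column is expected to show "Using index" (covering index).
EXPLAIN SELECT id FROM T WHERE k = 5;
```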

Building on the covering index described above, let's look at another example:

CREATE TABLE `tuser` (
  `id` int(11) NOT NULL,
  `id_card` varchar(32) DEFAULT NULL,
  `name` varchar(32) DEFAULT NULL,
  `age` int(11) DEFAULT NULL,
  `ismale` tinyint(1) DEFAULT NULL,
  PRIMARY KEY (`id`),
  KEY `id_card` (`id_card`),
  KEY `name_age` (`name`,`age`)
) ENGINE=InnoDB;

The CREATE TABLE statement shows a primary key index on id and two secondary indexes, id_card and name_age. name_age is built on two fields (it could be more) and is called a composite index. Suppose you now want to execute the following statement:

select id from tuser where name like '张%';

Although there is no separate index on the name field, the composite index starts with name, so the statement can still use an index:

[Figure: entries of the name_age index, sorted by name and then by age]

As the figure shows, index entries are sorted by the fields in the order they appear in the index definition. The statement above can use this composite index to locate the first matching record, ID3, and then scan forward until the condition is no longer met.

But if the composite index were defined as name_age (age, name), the query above could not use it to locate rows by name, because name would no longer be the leftmost column. This is the leftmost prefix principle: an index can be used when the condition matches the leftmost N columns of a composite index, or the leftmost characters of a string index.
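
The leftmost prefix principle can be checked against the tuser table defined above. The statements below are a sketch; the value '张三' and the age 10 are made-up sample data, and on an empty or very small table the optimizer may pick a different plan than the comments suggest:

```sql
-- name is the leftmost column of name_age (name, age):
-- the index can be used to locate the matching range.
EXPLAIN SELECT * FROM tuser WHERE name LIKE '张%';

-- Both columns, starting from the left: the whole index can be used.
EXPLAIN SELECT * FROM tuser WHERE name = '张三' AND age = 10;

-- age alone skips the leftmost column: name_age cannot be used
-- to locate rows, so a full scan is the likely result.
EXPLAIN SELECT * FROM tuser WHERE age = 10;
```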

  • A single table should not have too many indexes. An unwritten rule in the industry is no more than 20 fields and no more than 5 indexes per table, because as the data grows, too many indexes take up a lot of physical space, and every index also has to be maintained on each write.
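
To find tables that exceed this rule of thumb, the index metadata in information_schema can be queried. A minimal sketch for the current schema; the threshold 5 is the number quoted above, and the primary key counts as one index (named PRIMARY):

```sql
-- List tables in the current schema that have more than 5 indexes.
SELECT table_name,
       COUNT(DISTINCT index_name) AS index_count
FROM information_schema.statistics
WHERE table_schema = DATABASE()
GROUP BY table_name
HAVING COUNT(DISTINCT index_name) > 5
ORDER BY index_count DESC;
```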

Of course, these are basic SQL statements. How do you optimize more complex ones? That is where the Explain execution plan comes in. Let's look at an example first (this SQL statement is quite complicated and is currently in use at my company, so I have blurred it out, sorry):

[Screenshot: the redacted SQL statement]

The key part is here:

[Screenshot: Explain output for the statement above]

With the Explain keyword you can see, for each step of the query, the estimated number of rows scanned, which index is used, and so on. The next section goes through the meaning of each field and then how to use them for optimization.

3. Detailed explanation of Explain execution plan

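Taking the output above as an example, Explain describes the execution plan chosen by the optimizer with the following fields: id, select_type, table, partitions, type, possible_keys, key, key_len, ref, rows, filtered and Extra. (Comparing the optimizer's estimates with the actual execution time of each iterator is what EXPLAIN ANALYZE, available since MySQL 8.0.18, adds on top of this.)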
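
A minimal sketch of the ways to ask MySQL 8.0 for a plan, assuming table T from section 2. Note that EXPLAIN ANALYZE really executes the statement, so use it with care on expensive queries:

```sql
-- Traditional tabular plan with the fields described in this section.
EXPLAIN SELECT * FROM T WHERE k = 5;

-- The same plan as JSON, including cost estimates.
EXPLAIN FORMAT=JSON SELECT * FROM T WHERE k = 5;

-- MySQL 8.0.18 and later: runs the statement and reports actual time
-- and row counts per iterator next to the optimizer's estimates.
EXPLAIN ANALYZE SELECT * FROM T WHERE k = 5;
```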

id

id is the sequence number of the SELECT that the row belongs to, and it indicates the order of execution. Rows with the same id belong to the same SELECT and are executed from top to bottom, in the join order chosen by the optimizer. When ids differ, the larger the id, the higher the priority and the earlier it is executed.
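
Both cases can be reproduced with table T from section 2. This is a sketch; the aliases a and b and the column alias total are made up for the example:

```sql
-- A join is a single SELECT: both rows of the plan carry id = 1
-- and are listed in the join order the optimizer picked.
EXPLAIN SELECT * FROM T a JOIN T b ON a.id = b.k;

-- A scalar subquery is a second SELECT: the outer query gets id = 1,
-- the subquery gets id = 2.
EXPLAIN SELECT id, (SELECT COUNT(*) FROM T) AS total FROM T;
```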

select_type

Indicates the type of the select query. It is mainly used to tell apart ordinary queries, union queries, subqueries and so on. The main values are as follows:

SIMPLE : Represents the simplest select query statement, that is, the query does not include operations such as sub-queries or unions.

PRIMARY : When the query contains any complex sub-parts, the outermost query is marked as PRIMARY.

SUBQUERY : When a subquery is included in the select or where list, the subquery is marked as SUBQUERY.

DERIVED : A subquery in the from clause (a derived table) is marked as DERIVED.

UNION : The second and later select statements of a union are marked as UNION; if the union sits inside a subquery of the from clause, the first select of that union is marked as DERIVED.

UNION RESULT : Represents reading data from the temporary table of the union. A value such as <union1,4> in the table column means that the union is performed on the results of the selects with id 1 and id 4.
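
The statements below are a sketch of queries that typically produce each of these values, assuming table T from section 2; the alias d and the column alias cnt are made up for the example:

```sql
-- SIMPLE: no subquery, no union.
EXPLAIN SELECT * FROM T WHERE id = 100;

-- PRIMARY + SUBQUERY: an uncorrelated scalar subquery in the where clause.
EXPLAIN SELECT * FROM T WHERE k > (SELECT AVG(k) FROM T);

-- PRIMARY + DERIVED: a derived table with GROUP BY cannot be merged
-- into the outer query, so it is materialized.
EXPLAIN SELECT d.k, d.cnt
FROM (SELECT k, COUNT(*) AS cnt FROM T GROUP BY k) AS d;

-- PRIMARY + UNION + UNION RESULT (table column shows <union1,2>).
EXPLAIN SELECT id FROM T WHERE k = 1
UNION
SELECT id FROM T WHERE k = 2;
```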

table

The name of the table the row refers to. It is not necessarily a real table: if the table has an alias, the alias is displayed, and it may also be a temporary table such as <derived2> or <union1,4>.

partitions

The partitions matched by the query. The value is NULL for non-partitioned tables; for a partitioned table, the names of the partitions that the query has to read are listed.
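
A small sketch of how the partitions column behaves. The table t_part and its partition names are hypothetical and not part of the schema used elsewhere in this article:

```sql
-- Hypothetical range-partitioned table.
CREATE TABLE t_part (
  id INT NOT NULL,
  k  INT NOT NULL,
  PRIMARY KEY (id)
) ENGINE=InnoDB
PARTITION BY RANGE (id) (
  PARTITION p0 VALUES LESS THAN (1000),
  PARTITION p1 VALUES LESS THAN (2000),
  PARTITION p2 VALUES LESS THAN MAXVALUE
);

INSERT INTO t_part (id, k) VALUES (500, 1), (1500, 5), (2500, 9);

-- Condition on the partitioning column: only p1 needs to be read.
EXPLAIN SELECT * FROM t_part WHERE id = 1500;

-- No condition on the partitioning column: all partitions are listed.
EXPLAIN SELECT * FROM t_part WHERE k = 5;
```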

type

The access type (also called the join type) is a very important indicator in SQL optimization. From best to worst: system > const > eq_ref > ref > ref_or_null > index_merge > unique_subquery > index_subquery > range > index > ALL

system : The table has only one row (a system table). This is a special case of const; the amount of data is tiny, so it is very fast.

const : The table has at most one matching row, because the primary key or a unique index is compared with constant values. The row is read only once, so this access type is very fast.

eq_ref : In a join, exactly one row is read from this table for each row combination from the previous tables, using all parts of a primary key or a unique NOT NULL index.

ref : Unlike eq_ref, ref uses a non-unique index (or only a leftmost prefix of an index), so several matching rows may be found.

ref_or_null : Like ref, except that MySQL additionally searches for rows containing NULL values.

index_merge : The index merge optimization is used: the query combines two or more indexes on the same table.

unique_subquery : Replaces eq_ref for some IN subqueries of the form value IN (SELECT primary_key FROM ...). It is an index lookup that replaces the subquery completely and is more efficient.

index_subquery : Like unique_subquery, but for non-unique indexes, so the subquery may return duplicate values.

range : Uses an index to retrieve only the rows within a given range. Conditions in the where clause such as between...and, <, >, <=, >= and in on an indexed field give the type range.

index : index and ALL both read everything. The difference is that index scans an entire index tree, which is usually smaller than the table data, while ALL scans all rows of the table.

ALL : A full table scan: every row is traversed to find the matching ones, with the worst performance.
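
The statements below are a sketch of queries that typically produce the most common access types, using table T from section 2. The comments describe the usual outcome; on a table with only a handful of rows the optimizer may well choose a cheaper-looking full scan instead:

```sql
-- const: primary key compared with a constant.
EXPLAIN SELECT * FROM T WHERE id = 500;

-- eq_ref (for b): one primary key lookup in b per row of a.
-- STRAIGHT_JOIN fixes the join order so that a is read first.
EXPLAIN SELECT * FROM T a STRAIGHT_JOIN T b ON b.id = a.k;

-- ref: non-unique index compared with a constant.
EXPLAIN SELECT * FROM T WHERE k = 5;

-- range: range condition on an indexed column.
EXPLAIN SELECT * FROM T WHERE k BETWEEN 2 AND 5;

-- index: every entry of index k is read, but not the table rows.
EXPLAIN SELECT k FROM T;

-- ALL: name has no index, so the whole table is scanned.
EXPLAIN SELECT * FROM T WHERE name = 'x';
```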

possible_keys

Lists the indexes MySQL could choose from to find the rows in this table. An index on a field involved in the query is listed here, but it is not necessarily the index that is finally used. (Sometimes the optimizer does not use the index you created for the field; in that case you can specify it with force index.)
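
A minimal sketch of force index on table T from section 2, whose secondary index is named k. Compare the possible_keys and key columns of the two plans:

```sql
-- Let the optimizer decide: possible_keys lists k,
-- key shows what was actually chosen (possibly NULL on a tiny table).
EXPLAIN SELECT * FROM T WHERE k > 1;

-- Tell the optimizer to use index k unless it is unusable for the query.
EXPLAIN SELECT * FROM T FORCE INDEX (k) WHERE k > 1;
```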

key

Different from possible_keys, key is the index actually used in the query. If the index is not used, it is displayed as NULL.

key_len

Indicates the length in bytes of the index part used by the query. Provided no accuracy is lost, the shorter the better.

For a single-column index the whole index length is counted. For a multi-column index not all columns are necessarily used, so key_len tells you how many of its columns the query actually uses.
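
As a worked example, the key_len of the name_age index on the tuser table from section 2 can be estimated by hand. The numbers below assume the table uses utf8mb4, the default character set in MySQL 8.0 (up to 4 bytes per character); with another character set they differ. The values '张三' and 10 are made-up sample data:

```sql
-- name varchar(32) NULL : 32 * 4 + 2 (length prefix) + 1 (NULL flag) = 131
-- age  int(11)     NULL : 4 + 1 (NULL flag)                          = 5

-- Only the name part of name_age is used: key_len is expected to be 131.
EXPLAIN SELECT * FROM tuser WHERE name = '张三';

-- Both parts are used: key_len is expected to be 131 + 5 = 136.
EXPLAIN SELECT * FROM tuser WHERE name = '张三' AND age = 10;
```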

ref

Shows what is compared with the index named in the key column. Common values are const, func, NULL, and a field name.

For an equality comparison with a constant, const is displayed. For a join, the joined field of the other table is displayed. If the condition uses an expression or function, or the column is implicitly converted internally, func may be displayed. In all other cases NULL is displayed.
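
A short sketch of the two most common values, using table T from section 2 with the made-up aliases a and b:

```sql
-- Index k is compared with a constant: ref shows const.
EXPLAIN SELECT * FROM T WHERE k = 5;

-- Index k of b is compared with a column of a:
-- ref shows that column, qualified as <schema>.a.id.
EXPLAIN SELECT * FROM T a STRAIGHT_JOIN T b ON b.k = a.id;
```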

rows

An estimate, based on table statistics and index usage, of the number of rows that have to be read to find the records we need.

This is an important figure for evaluating SQL performance: the number of rows MySQL expects to scan gives a direct impression of how expensive the statement is. In general, the smaller the rows value, the better.

filtered

filtered is a percentage. Simply put, it is the estimated share of the rows returned by the storage engine that still satisfy the conditions after filtering. (Since MySQL 5.7, Explain displays the partitions and filtered information by default.)
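
rows and filtered are best read together. The sketch below uses table T from section 2; the figures in the last comment are an invented illustration of the arithmetic, not real output:

```sql
-- Refresh the table statistics that the rows estimate is based on.
ANALYZE TABLE T;

-- name has no index, so the condition on name is applied to the rows
-- that were read, and filtered estimates how many of them survive.
EXPLAIN SELECT * FROM T WHERE k > 1 AND name = 'x';

-- Rows passed on to the next step are roughly rows * filtered / 100.
-- For example, rows = 1000 with filtered = 10.00 means about 100 rows.
```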

Extra

This column holds information that does not fit into the other columns. Much of the additional information in Explain shows up here:

  • Using index: The select uses a covering index. In plain terms, all queried columns are contained in the index. With a covering index the query is very fast, which is an ideal state in SQL optimization.
EXPLAIN SELECT id FROM s_goods;

  • Using where: The where condition is used to filter the rows returned by the storage engine. It often appears when no index can satisfy the whole condition, but note that not every query with a where clause displays Using where.
EXPLAIN SELECT * FROM s_goods WHERE cat_id = 1;

  • Using temporary: Indicates that the result of the query needs to be stored in a temporary table, which is generally used in sorting or grouping queries.
EXPLAIN SELECT shop_id FROM s_goods WHERE cat_id IN (1,2,3,4) GROUP BY shop_id;

  • Using filesort: The sort cannot be done through index order, for example because the ORDER BY field is not indexed, so MySQL needs an extra sorting pass. Such SQL usually needs to be optimized.
EXPLAIN SELECT * FROM s_goods ORDER BY shop_id;

  • Using join buffer: Displayed when the join field has no usable index, so rows are joined through the join buffer. Since MySQL 8.0.18 this often appears as Using join buffer (hash join).
EXPLAIN SELECT * FROM s_goods a LEFT JOIN s_goods_cats b ON a.shop_id = b.parent_id;

  • Impossible WHERE: The where condition is always false, so no rows can meet it.
EXPLAIN SELECT * FROM s_goods WHERE 1 <> 1;

  • No tables used: There is no FROM clause in our query statement, or there is a FROM DUAL clause.
EXPLAIN SELECT NOW();
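
The statements in this list use the tables s_goods and s_goods_cats, whose definitions are not shown in the article. To try them out, a hypothetical minimal schema such as the following would do. Only the column names id, shop_id, cat_id and parent_id come from the statements above; the types, the name columns and the absence of secondary indexes are assumptions:

```sql
-- Hypothetical schema, just enough to run the Explain statements above.
CREATE TABLE s_goods (
  id      INT NOT NULL PRIMARY KEY,
  shop_id INT NOT NULL,
  cat_id  INT NOT NULL,
  name    VARCHAR(64)
) ENGINE=InnoDB;

CREATE TABLE s_goods_cats (
  id        INT NOT NULL PRIMARY KEY,
  parent_id INT NOT NULL,
  name      VARCHAR(64)
) ENGINE=InnoDB;
```

With real indexes on shop_id or cat_id the Extra values can differ from the ones described in the list.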

Author: Teenager gets up the

Link: https://juejin.im/post/5ee894a05188253683551567

Origin blog.csdn.net/qq_45401061/article/details/108631150