MySQL index optimization (part two)
Small table drives big table
ORDER BY optimization
Simulation data
create table tblA(
#id int primary key not null auto_increment,
age int,
birth timestamp not null
);
insert into tblA(age, birth) values(22, now());
insert into tblA(age, birth) values(23, now());
insert into tblA(age, birth) values(24, now());
create index idx_A_ageBirth on tblA(age, birth);
View execution plan
Case A
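EXPLAIN SELECT * FROM tblA WHERE age > 20 ORDER BY age; -- follows the index prefix (age)
EXPLAIN SELECT * FROM tblA WHERE age > 20 ORDER BY age, birth; -- follows the full index order (age, birth)
EXPLAIN SELECT * FROM tblA WHERE age > 20 ORDER BY birth; -- skips the leading column age: expect Using filesort
EXPLAIN SELECT * FROM tblA WHERE age > 20 ORDER BY birth, age; -- reversed column order: expect Using filesort
EXPLAIN SELECT * FROM tblA WHERE birth > '2021-02-19 22:45:00' ORDER BY birth; -- birth is not a leftmost prefix: expect Using filesort
EXPLAIN SELECT * FROM tblA WHERE birth > '2021-02-19 22:45:00' ORDER BY age; -- the sort column age is the index prefix, whatever the WHERE column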
Case B
EXPLAIN SELECT * FROM tblA ORDER BY age ASC, birth DESC; -- mixed sort directions: expect Using filesort
Conclusion
- MySQL supports two ways of sorting: index and filesort. Index sorting is efficient because MySQL scans the index in its already-sorted order and needs no extra sort step; filesort is less efficient.
- In ORDER BY clauses, try to sort by index and avoid filesort.
- ORDER BY sorts by index when either of two conditions is met: the ORDER BY clause itself follows the leftmost prefix rule of the index, or the WHERE condition fields and the ORDER BY fields combined follow the leftmost prefix rule.
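As a rough illustration of the second condition, the sketch below reuses tblA and idx_A_ageBirth from the simulation data. The comments state what the leftmost prefix rule predicts; the actual Extra column can differ with MySQL version and table size, so treat them as expectations to verify rather than guaranteed output.

```sql
-- age is fixed to a constant, so inside that index range the rows are
-- already ordered by birth: WHERE (age) + ORDER BY (birth) together
-- form the index prefix (age, birth). No filesort is expected.
EXPLAIN SELECT * FROM tblA WHERE age = 22 ORDER BY birth;

-- age is a range here, so rows belonging to different ages are not
-- globally ordered by birth. Using filesort is expected.
EXPLAIN SELECT * FROM tblA WHERE age > 20 ORDER BY birth;
```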
Sorting and grouping optimization
MySQL sorting algorithm
When Using filesort appears in the execution plan, MySQL sorts the query results with one of its own sorting algorithms.
Two-way sort
- Before MySQL 4.1, two-way sorting was used, which literally means scanning the disk twice to get the final data: read the row pointers and the ORDER BY columns, sort them, then scan the sorted list and re-read the corresponding rows from the table for output.
- Get the sort fields from the disk, sort them in the buffer, and then fetch the other fields from the disk. In simple terms, to fetch a batch of data, the disk must be scanned twice. I/O is very time-consuming, so after MySQL 4.1 a second, improved algorithm appeared: single-way sorting.
Single-way sort
- Read all the columns required by the query from the disk, sort them in the buffer by the ORDER BY columns, and then scan the sorted list for output. It is faster because it avoids reading the data a second time, and it turns random I/O into sequential I/O, but it uses more space because it keeps every full row in memory.
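One way to see which algorithm a given query actually used is the optimizer trace. The following is a minimal sketch, assuming a MySQL version that provides the optimizer trace (5.6 or later) and the tblA table from the simulation data. In the trace, a sort_mode that mentions rowid corresponds to the two-way approach, and one that mentions additional_fields corresponds to the single-way approach; the exact strings vary between versions.

```sql
-- Enable the optimizer trace for the current session only.
SET optimizer_trace = 'enabled=on';

-- A query that is expected to need filesort on tblA
-- (birth alone is not a leftmost prefix of idx_A_ageBirth).
SELECT * FROM tblA WHERE age > 20 ORDER BY birth;

-- Inspect the trace of the previous statement; look for the
-- "filesort_summary" section and its "sort_mode" entry.
SELECT TRACE FROM information_schema.OPTIMIZER_TRACE;

-- Turn the trace off again.
SET optimizer_trace = 'enabled=off';
```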
The problem
- In sort_buffer, single-way sorting takes up much more space than two-way sorting because it reads all the queried fields. The total size of the data read may therefore exceed the capacity of sort_buffer. In that case MySQL can only read as much data as sort_buffer holds at a time, sort it (creating a tmp file for a multi-way merge), then read the next sort_buffer-sized chunk and sort again, and so on, which results in multiple I/O operations. In other words, the attempt to save one I/O operation ends up causing many more, and the gain is not worth the cost.
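Whether sorts are spilling out of the sort buffer can be checked with the Sort_merge_passes status counter, which counts the merge passes the sort algorithm has had to do. A small sketch, assuming the tblA table from the simulation data; with only three rows the counter is not expected to move, so a meaningful check needs a realistically large table.

```sql
-- Counter value before the query (session scope).
SHOW SESSION STATUS LIKE 'Sort_merge_passes';

-- A query that is expected to sort with filesort.
SELECT * FROM tblA WHERE age > 20 ORDER BY birth;

-- If the value grew, the sort did not fit in the sort buffer
-- and had to merge temporary chunks.
SHOW SESSION STATUS LIKE 'Sort_merge_passes';
```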
How to optimize
- Increase the sort_buffer_size parameter setting. No matter which algorithm is used, increasing this parameter improves efficiency. It must be raised in line with the capacity of the system, because the buffer is allocated per connection; adjust it between 1M and 8M.
- Increase the max_length_for_sort_data parameter setting. MySQL uses single-way sorting only when the total size of the fields read for the sort is smaller than max_length_for_sort_data, so increasing this parameter raises the probability of using the improved algorithm. But if it is set too high, the total data size becomes more likely to exceed sort_buffer_size, resulting in high-frequency disk I/O and low processor utilization (adjust it between 1024 and 8192 bytes).
- Reduce the query fields after SELECT (avoid SELECT *). When fewer fields are queried, the buffer can hold more rows, which is equivalent to indirectly increasing sort_buffer_size.
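The three suggestions above can be tried per session, as in the sketch below. The values 2097152 and 4096 are illustrative picks inside the ranges mentioned in the text, not recommendations. Note also that max_length_for_sort_data is deprecated as of MySQL 8.0.20 and no longer has an effect there, so that part applies only to older versions.

```sql
-- Check the current values first.
SHOW VARIABLES LIKE 'sort_buffer_size';
SHOW VARIABLES LIKE 'max_length_for_sort_data';

-- Raise both for the current session only (illustrative values).
SET SESSION sort_buffer_size = 2097152;      -- 2 MB
SET SESSION max_length_for_sort_data = 4096; -- bytes

-- Select only the columns that are needed instead of SELECT *,
-- so that each row takes less room in the sort buffer.
SELECT age, birth FROM tblA WHERE age > 20 ORDER BY birth;
```

Setting the variables with SET SESSION keeps the change local to one connection, which makes it easy to compare execution before and after without affecting other clients.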
Summary A
Summary B
GROUP BY optimization
GROUP BY optimization is roughly similar to ORDER BY
- To sort by index and avoid Using filesort, you can use a covering index.
- The order of the fields after ORDER BY / GROUP BY should be exactly the same as the column order of the composite index.
- The index columns after ORDER BY / GROUP BY must appear in order from the left; trailing index columns may be omitted.
- The sort direction of the fields must be consistent: either all ascending or all descending, not a mix of the two.
- If a leading field of the composite index appears in the filter condition as a constant, the sort can start from the field immediately following it.
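These rules can be checked against tblA and idx_A_ageBirth from the simulation data. The statements below are a sketch; the comments give the outcome the rules predict, and the real Extra column may vary with MySQL version and data volume.

```sql
-- GROUP BY on the leftmost index column: the index order can be
-- used for grouping, so no temporary table is expected.
EXPLAIN SELECT age, COUNT(*) FROM tblA GROUP BY age;

-- GROUP BY that skips the leading column: expect Using temporary
-- (and, on older versions, also Using filesort).
EXPLAIN SELECT birth, COUNT(*) FROM tblA GROUP BY birth;

-- Same direction on every column: the index can be read backward,
-- so no filesort is expected.
EXPLAIN SELECT * FROM tblA ORDER BY age DESC, birth DESC;

-- Mixed directions on an all-ascending index: expect Using filesort.
EXPLAIN SELECT * FROM tblA ORDER BY age ASC, birth DESC;
```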