MySQL optimization practices you may not know (2)




Without further ado, let's get straight to it!


11. When using a composite index, pay attention to the order of the index columns and follow the leftmost-prefix matching principle.

Table structure (there is a composite index `idx_userid_age`: userId comes first, age second):

CREATE TABLE `user` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `userId` int(11) NOT NULL,
  `age` int(11) DEFAULT NULL,
  `name` varchar(255) NOT NULL,
  PRIMARY KEY (`id`),
  KEY `idx_userid_age` (`userId`,`age`) USING BTREE
) ENGINE=InnoDB AUTO_INCREMENT=2 DEFAULT CHARSET=utf8;

Counter example:

select * from user where age = 10;


Positive example:

// satisfies the leftmost-prefix principle
select * from user where userid = 10 and age = 10;
// satisfies the leftmost-prefix principle
select * from user where userid = 10;


reason:

  • When we create a composite index such as (k1, k2, k3), it is as if we had created three indexes: (k1), (k1, k2), and (k1, k2, k3). This is the leftmost-prefix matching principle.
  • If a query does not satisfy the leftmost-prefix principle, the composite index generally cannot be used, although this also depends on the MySQL optimizer.
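A minimal, runnable sketch of the leftmost-prefix behavior, using Python's standard `sqlite3` module. SQLite's planner is not MySQL's, but its leftmost-prefix rule for composite B-tree indexes is analogous; the table and index names mirror the article's, and the data is made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE user (id INTEGER PRIMARY KEY, userId INT, age INT, name TEXT)")
conn.execute("CREATE INDEX idx_userid_age ON user(userId, age)")

def plan(sql):
    """Return the query plan for a statement as a single string."""
    return " ".join(row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + sql))

# Leading column constrained -> the composite index can be used.
print(plan("SELECT * FROM user WHERE userId = 10 AND age = 10"))
# Only the second column constrained -> the index is unusable; full table scan.
print(plan("SELECT * FROM user WHERE age = 10"))
```

The first plan mentions `idx_userid_age`; the second does not, because `age` alone is not a leftmost prefix of `(userId, age)`.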

12. To optimize queries, consider building indexes on the columns involved in where and order by clauses, and try to avoid full table scans.

Counter example:

select * from user where address ='深圳' order by age ;


Positive example:

// add an index
alter table user add index idx_address_age (address, age);



13. When inserting a large amount of data, consider batch insertion.

Counter example:

for(User u : list) {
    INSERT into user(name,age) values(#name#,#age#)
}

Positive example:

// batch insert 500 rows at a time, in batches (MyBatis foreach)
insert into user(name,age) values
<foreach collection="list" item="user" index="index" separator=",">  
(#{user.name},#{user.age})
</foreach>

// or a plain multi-row insert:
insert into user(name,age) values("zs",20),("ls",21);

reason:

  • Batch insertion performs well and saves time.

An analogy: suppose you need to move 10,000 bricks to the top of a building, and the elevator can carry up to 500 bricks per trip. You can move one brick per trip or 500 bricks per trip. Which do you think takes more time?
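The batching idea can be sketched in runnable form with Python's standard `sqlite3` module (a stand-in for a MySQL client; the 500-row batch size mirrors the example above, and the sample data is made up). `executemany` sends each batch in one call instead of one round trip per row.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE user (id INTEGER PRIMARY KEY AUTOINCREMENT, name TEXT, age INT)")

rows = [("user%d" % i, 20 + i % 30) for i in range(1800)]  # made-up sample data

BATCH = 500  # "up to 500 bricks per elevator trip"
for start in range(0, len(rows), BATCH):
    # one batched statement per chunk instead of one statement per row
    conn.executemany("INSERT INTO user(name, age) VALUES (?, ?)", rows[start:start + BATCH])
    conn.commit()

print(conn.execute("SELECT COUNT(*) FROM user").fetchone()[0])  # 1800
```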


14. When appropriate, use covering indexes.

A covering index lets a SQL statement get all the required data from the index itself, without going back to the table for a second lookup, which greatly improves query efficiency.

Counter example:

// a like query with a leading wildcard does not use the index
select * from user where userid like '%123%';


Positive example:

// id is the primary key, so every secondary index already contains it: the covering index comes into play
select id,name from user where userid like '%123%';

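A small sketch of the covering-index effect, again using Python's standard `sqlite3` module (SQLite even labels the plan "COVERING INDEX"; MySQL shows "Using index" in EXPLAIN instead). Table and index names mirror the article's.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE user (id INTEGER PRIMARY KEY, userId INT, age INT, name TEXT)")
conn.execute("CREATE INDEX idx_userid_age ON user(userId, age)")

def plan(sql):
    return " ".join(row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + sql))

# All selected columns live in the index -> no back-to-table lookup.
covered = plan("SELECT userId, age FROM user WHERE userId = 10")
# Selecting name forces a lookup back into the table for each match.
not_covered = plan("SELECT name FROM user WHERE userId = 10")
print(covered)
print(not_covered)
```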


15. Use the distinct keyword with caution

The distinct keyword is generally used to filter out duplicate records and return only unique rows. When applied to one field or a few fields, it can optimize the query; when applied to many fields, however, it greatly reduces query efficiency.

Counter example:

SELECT DISTINCT * from  user;

Positive example:

select DISTINCT name from user;

reason:

  • A statement with distinct takes more CPU time and elapsed time than the same statement without it. When many fields are queried, the database engine must compare rows and filter out duplicates, and that comparison and filtering consumes system resources and CPU time.
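A tiny sketch (Python's standard `sqlite3`, made-up data) of why `DISTINCT *` is wasteful: when the table has a unique primary key, `DISTINCT *` can never drop a row, yet the engine still pays to compare every column, while `DISTINCT` on one column actually deduplicates.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE user (id INTEGER PRIMARY KEY, name TEXT, age INT)")
conn.executemany("INSERT INTO user(name, age) VALUES (?, ?)",
                 [("zs", 20), ("zs", 21), ("ls", 22)])

# id is unique, so DISTINCT * removes nothing; the comparison work is wasted.
all_rows = conn.execute("SELECT DISTINCT * FROM user").fetchall()
# DISTINCT on a single column actually filters duplicates.
names = conn.execute("SELECT DISTINCT name FROM user").fetchall()
print(len(all_rows), len(names))  # 3 2
```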

16. Delete redundant and duplicate indexes

Counter example:

  KEY `idx_userId` (`userId`)
  KEY `idx_userId_age` (`userId`,`age`)

Positive example:

  // drop the idx_userId index, because the composite index (A,B) is equivalent to creating indexes (A) and (A,B)
  KEY `idx_userId_age` (`userId`,`age`)

reason:

  • Duplicate indexes must be maintained, and the optimizer has to consider each of them when optimizing queries, which affects performance.

17. If the amount of data is large, optimize your update/delete statements.

Avoid updating or deleting too much data at once, because doing so causes high CPU utilization and affects other users' access to the database.

Counter example:

// delete 100,000 or 1,000,000+ rows at once?
delete from user where id < 100000;

// or a row-by-row loop: inefficient and slow
for(User user : list) {
    delete from user where id = #{user.id};
}

Positive example:

// delete in batches, e.g. 500 rows at a time
delete from user where id < 500;
delete from user where id >= 500 and id < 1000;

reason:

  • If you delete too much data in one statement, a lock wait timeout exceeded error may occur, so it is recommended to operate in batches.
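The batched-delete pattern can be sketched with Python's standard `sqlite3` module (made-up data; with MySQL you could also use `DELETE ... LIMIT` per batch). Each statement touches at most 500 rows, keeping any locks short-lived.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE user (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany("INSERT INTO user(id, name) VALUES (?, ?)",
                 [(i, "u%d" % i) for i in range(1, 2001)])

BATCH = 500
lo = 1
hi = conn.execute("SELECT MAX(id) FROM user").fetchone()[0]
while lo <= hi:
    # each DELETE touches at most one 500-id range
    conn.execute("DELETE FROM user WHERE id >= ? AND id < ?", (lo, lo + BATCH))
    conn.commit()
    lo += BATCH

print(conn.execute("SELECT COUNT(*) FROM user").fetchone()[0])  # 0
```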

18. Consider using the default value instead of null in the where clause.

Counter example:

select * from user where age is not null;


Positive example:

// with 0 set as the default value
select * from user where age > 0;


reason:

  • Using is null or is not null does not necessarily prevent index use; it depends on the MySQL version and on the query cost.

If the MySQL optimizer finds that using the index costs more than not using it, it will abandon the index. Conditions such as !=, >, is null, and is not null are often believed to invalidate the index; in fact, under normal circumstances the query cost with them is simply high, so the optimizer gives up the index on its own.

  • If you replace null values with a default value, the query can often use the index, and at the same time the meaning of the expression is clearer.

19. Do not join more than 5 tables

  • The more tables you join, the greater the compile time and overhead.
  • Breaking a large join into several smaller queries is more readable.
  • If you must join many tables to get the data, it usually indicates bad design.

20. Use exists and in appropriately

Suppose table A is a company's employee table and table B is the department table. To query all employees in all departments, it is easy to write the following SQL:

select * from A where deptId in (select deptId from B);

Writing this is equivalent to:

Query department table B first

select deptId from B

Then use the department deptId to query the employees of A

select * from A where A.deptId = B.deptId

It can be abstracted into a loop like this:

List<User> resultSet = new ArrayList<>();
// in: query B first, then for each department collect the matching employees in A
for (int i = 0; i < B.length; i++) {
    for (int j = 0; j < A.length; j++) {
        if (A[j].deptId == B[i].deptId) {
            resultSet.add(A[j]);
        }
    }
}

Obviously, in addition to using in, we can also use exists to achieve the same query function, as follows:

select * from A where exists (select 1 from B where A.deptId = B.deptId);

An exists query can be understood as follows: execute the main query first, then for each row obtained, run the subquery as a condition check; the result of that check (true or false) determines whether that row of the main query is kept.

So writing it this way is equivalent to:

select * from A: first loop over table A,

select * from B where A.deptId = B.deptId: then loop over table B.

Similarly, it can be abstracted into a loop like this:

List<User> resultSet = new ArrayList<>();
// exists: loop over A first; for each employee, check whether a matching department exists in B
for (int i = 0; i < A.length; i++) {
    for (int j = 0; j < B.length; j++) {
        if (A[i].deptId == B[j].deptId) {
            resultSet.add(A[i]);
            break;
        }
    }
}

The most expensive part for the database is establishing and releasing connections with the program. Suppose you connect twice and each time run a query over a data set of millions of rows, then leave when the query is done: you have only connected twice. In contrast, if you establish millions of connections, repeatedly requesting and releasing them, the system cannot bear it. This is the MySQL optimization principle: small tables drive large tables, and small data sets drive large data sets, which yields better performance.

Therefore, we should choose the smaller set for the outer loop: if table B's data set is smaller than A's, in is suitable; if B's data set is larger than A's, exists is the better choice.
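Whichever form the optimizer prefers, the two queries must return the same rows; the choice only affects which table drives the loop. A runnable check using Python's standard `sqlite3` module, with tables named A and B as in the text and made-up data:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE A (id INTEGER PRIMARY KEY, name TEXT, deptId INT);  -- employees
CREATE TABLE B (deptId INTEGER PRIMARY KEY, deptName TEXT);      -- departments
INSERT INTO B VALUES (1, 'dev'), (2, 'ops');
INSERT INTO A VALUES (1, 'zs', 1), (2, 'ls', 2), (3, 'ww', 99);  -- 'ww' has no department
""")

in_rows = conn.execute(
    "SELECT * FROM A WHERE deptId IN (SELECT deptId FROM B) ORDER BY id").fetchall()
exists_rows = conn.execute(
    "SELECT * FROM A WHERE EXISTS (SELECT 1 FROM B WHERE A.deptId = B.deptId) ORDER BY id").fetchall()

print(in_rows == exists_rows)  # True: both keep only 'zs' and 'ls'
```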





Origin blog.csdn.net/m0_60915009/article/details/131718857