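MySQL Review, Bonus Chapter: Optimization

11.1 Common optimization tools

show status: view how often each kind of statement runs

show processlist: find inefficient statements

explain: analyze a specific statement (the most commonly used tool)

show profiles: view the time taken by earlier statements

trace tool: view the optimizer's decisions when a specific SQL statement is executed

11.2 Index usage recommendations

Index failures and table lookups that follow from how indexes work internally

Do not compute on an indexed column, or the index will not be used

Omitting single quotes around a string value prevents index use

in uses the index, not in does not

11.3 SQL statement optimization

Bulk insert

insert statement optimization

order by statement optimization

group by statement optimization

Nested query optimization

limit optimization

SQL hints (use, ignore, force index)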

11.1 Common optimization tools

Analyzing a SQL statement: explain + select statement

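Column	Meaning	Details
id	Execution order	The order in which tables are accessed. Rows with the same id run from top to bottom. With different ids, the larger id runs first (3, then 2, then 1).
select_type	Kind of select (simple, subquery, union, ...)	See "select_type values" below.
type (important)	Access method, i.e. whether and how an index is used	See "type values" below.
possible_keys	Candidate indexes	One or more indexes that could be applied to this table.
key	Index actually used	NULL means no index was used.
key_len	Index bytes used	The number of bytes of the index that are actually used, which shows how many columns of a composite index took part. Without losing precision, shorter is better.
rows	Estimated number of rows scanned	Smaller is better.
Extra (important)	Additional information	See "Extra values" below.

select_type values (as a rule of thumb, efficiency decreases from top to bottom):
- SIMPLE: a simple select with no subquery or UNION
- PRIMARY: the outermost select of a query that contains a subquery
- SUBQUERY: a subquery inside the SELECT list or the WHERE clause (the counterpart of PRIMARY)
- DERIVED: a subquery inside the FROM clause; its result is placed in a temporary table
- UNION: e.g. in (SQL statement 1) union (SQL statement 2), statement 2 is marked UNION
- UNION RESULT: the select that reads the result of the UNION

type values (from best to worst):
- NULL: no table or index is accessed and the result is returned directly, e.g. select now() simply returns the time
- system: the table has only one row (for example a system table); rarely seen in practice
- const: the row is found with a single index lookup, by primary key or unique index, and at most one row matches. e.g. select * from table where id = '1'; looks up by primary key and returns one row. e.g. select * from table where name = 'james'; looks up by a unique index on name and returns one row
- eq_ref: a join in which exactly one row of this table matches each row of the previous table, through a primary key or unique index. e.g. select * from table1 t1, table2 t2 where t1.name = t2.name;
- ref: lookup through a non-unique index, returning one or more rows
- range: range query with between, >, <, and so on
- index: every entry is read, but from the index rather than the table, e.g. select id from table
- ALL: full table scan, e.g. select * from table

As a rule of thumb, a query should reach at least range, and ideally ref.

Extra values:
- Using filesort: the rows need an extra sorting pass, which is inefficient, e.g. an order by that no index can satisfy
- Using temporary: a temporary table holds intermediate results; common with order by and group by
- Using index: good performance; a covering index is used
- Using where: rows returned by the storage engine are filtered again by the server with the where condition
- Using index condition: the lookup uses an index and part of the condition is checked on the index entries first (index condition pushdown), but a table lookup is still needed to fetch the data
- Using index; Using where: the lookup uses an index and all required data is in the index columns, so no table lookup is needed
- Using MRR: an optimization of table lookups for eq_ref, ref, and range access: the keys found through the secondary index are sorted first, and the table is then read in that order
- Others ...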
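
As a minimal sketch of how these columns show up in practice, the statements below build a small table and run explain on three queries. The table name explain_demo, its columns, and the index name idx_name are hypothetical and exist only for illustration; the exact output depends on the MySQL version and on the data.

```sql
-- Hypothetical demo table; all names are illustrative only
CREATE TABLE explain_demo (
    id   INT PRIMARY KEY,
    name VARCHAR(50) NOT NULL,
    age  INT,
    KEY idx_name (name)
) ENGINE=InnoDB;

INSERT INTO explain_demo VALUES (1, 'james', 30), (2, 'kobe', 41), (3, 'curry', 33);

-- Lookup by primary key: look for type = const and key = PRIMARY
EXPLAIN SELECT * FROM explain_demo WHERE id = 1;

-- Lookup through the non-unique index idx_name: typically type = ref
EXPLAIN SELECT * FROM explain_demo WHERE name = 'james';

-- No index on age: look for type = ALL and key = NULL (full table scan)
EXPLAIN SELECT * FROM explain_demo WHERE age > 30;
```

Comparing the type, key, rows, and Extra columns across the three results is usually enough to tell whether a query is using the index you expect.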

Viewing statement frequencies
- To be updated
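
A brief sketch of what this step can look like, based on the show status command named in the outline. The Com_xxx counters record how many times each kind of statement has run, and the Innodb_rows_xxx counters record row operations on InnoDB tables; which counters matter most depends on the workload.

```sql
-- Statement counters for the current session (Com_select, Com_insert, ...)
SHOW SESSION STATUS LIKE 'Com_%';

-- The same counters accumulated since the server started
SHOW GLOBAL STATUS LIKE 'Com_%';

-- Row-level counters for InnoDB tables (rows read, inserted, updated, deleted)
SHOW GLOBAL STATUS LIKE 'Innodb_rows_%';
```

A high Com_select relative to Com_insert and Com_update suggests a read-heavy workload, where index tuning tends to pay off most.
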
Locating inefficient SQL statements
- To be updated
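
A brief sketch of this step, using the show processlist command named in the outline. The slow query log shown alongside it is a complementary tool that the outline does not mention, and the 2-second threshold is an arbitrary example value.

```sql
-- Threads currently running on the server, with their state and current statement
SHOW FULL PROCESSLIST;

-- Check whether the slow query log is enabled and what its threshold is
SHOW VARIABLES LIKE 'slow_query_log%';
SHOW VARIABLES LIKE 'long_query_time';

-- Enable the slow query log and record statements slower than 2 seconds
-- (requires sufficient privileges; the new threshold applies to new connections)
SET GLOBAL slow_query_log = 'ON';
SET GLOBAL long_query_time = 2;
```

show processlist shows what is slow right now, while the slow query log records slow statements after they finish.
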
Viewing the time taken by earlier statements
- To be updated
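
A brief sketch of this step, using the show profiles command named in the outline. Profiling is a per-session setting, and the query id 1 below is a placeholder for an id taken from the show profiles output. These statements are deprecated in newer MySQL versions in favor of the Performance Schema, so treat this as a quick diagnostic rather than a long-term tool.

```sql
-- Enable profiling for the current session
SET profiling = 1;

-- Run the statements to be measured, for example:
SELECT NOW();

-- List recent statements with their Query_ID and duration
SHOW PROFILES;

-- Break down the time spent in each stage of one statement
-- (replace 1 with a Query_ID from the SHOW PROFILES output)
SHOW PROFILE FOR QUERY 1;
```
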
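Viewing the optimizer's decisions when a specific SQL statement is executed
- MySQL 5.6 introduced an optimizer trace for SQL statements. The trace helps explain why the optimizer chose plan A rather than plan B
- Usage
  - 1. Enable the trace, set the output format to JSON, and raise the maximum memory the trace may use, so that the output is not truncated because the default limit is too small
    - ```
      SET optimizer_trace="enabled=on", end_markers_in_json=on;
      SET optimizer_trace_max_mem_size=1000000;
      ```
  - 2. Run the SQL statement, e.g. select * from ....
  - 3. Query the system table information_schema.optimizer_trace to see how MySQL executed the SQL internally
    - ```
      select * from information_schema.optimizer_trace\G
      ```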

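11.2 Index usage recommendations

Summary (once you understand how indexes work internally, there is no need to memorize these; see Chapter 5: https://blog.csdn.net/qq_41157876/article/details/109187544)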

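No.	Recommendation
1	Prefer composite indexes over single-column indexes
2	Prefer full-value matching (=). With a range condition (>, <) on a composite index, the columns to the right of the range column do not use the index, e.g. in where a='1' and b>'2' and c='3', c does not use the index
3	Leftmost-prefix rule: the where conditions must match a leftmost prefix of the index
4	Cases where the index is not used: computing on an indexed column; omitting single quotes around a string value; like with a leading wildcard (a trailing wildcard still uses the index); not in (in still uses the index)
5	Aim for covering indexes. The Extra column of explain tells the two cases apart: Using index condition (table lookup) versus Using index; Using where (covering index)
6	If the condition after or has no index, the index on the condition before or is not used either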
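1. Prefer full-value matching: give a concrete value for every column in the index

-- Example: create a composite index on columns a, b, c (table is a placeholder for the real table name)
create index index123 on table(a,b,c);

-- Full-value match: the query filters on all three columns a, b, c, e.g.
select * from table where a='1' and b='2' and c='3';

2. Leftmost-prefix rule: a query must start from the leftmost column of the index and must not skip columns in the index

-- Example: create a composite index on columns a, b, c
create index index123 on table(a,b,c);

-- Conditions on a, b, c (in any order) -> the whole index is used
select * from table where a='1' and b='2' and c='3';
select * from table where b='2' and c='3' and a='1';

-- Condition on a -> only the a part of the index is used
select * from table where a='1';

-- Conditions on a, c -> only the a part of the index is used (b is skipped)
select * from table where a='1' and c='3';

-- Conditions on a, b -> only the a and b parts of the index are used
select * from table where a='1' and b='2';

-- Conditions on b, c -> the index is not used (a is skipped)
select * from table where b='2' and c='3';

-- Condition on c -> the index is not used (a and b are skipped)
select * from table where c='3';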
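
One way to check the leftmost-prefix behavior described above is to compare the key and key_len columns of explain as more leading columns are constrained. The sketch below uses a hypothetical table prefix_demo and index idx_abc; the extra column d is there only so that select * cannot be answered from the index alone. The actual key_len values depend on the column types and character set, so none are listed here.

```sql
-- Hypothetical demo table; all names are illustrative only
CREATE TABLE prefix_demo (
    id INT PRIMARY KEY,
    a  VARCHAR(10) NOT NULL,
    b  VARCHAR(10) NOT NULL,
    c  VARCHAR(10) NOT NULL,
    d  VARCHAR(10),
    KEY idx_abc (a, b, c)
) ENGINE=InnoDB;

-- key_len should grow as more leading columns of idx_abc are used
EXPLAIN SELECT * FROM prefix_demo WHERE a = '1';
EXPLAIN SELECT * FROM prefix_demo WHERE a = '1' AND b = '2';
EXPLAIN SELECT * FROM prefix_demo WHERE a = '1' AND b = '2' AND c = '3';

-- a is present but b is skipped: key_len should match the a-only case
EXPLAIN SELECT * FROM prefix_demo WHERE a = '1' AND c = '3';

-- No condition on a: idx_abc cannot be used for a lookup
EXPLAIN SELECT * FROM prefix_demo WHERE b = '2' AND c = '3';
```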

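3. Columns to the right of a range condition (<, >) do not use the index

-- Example: create a composite index on columns a, b, c
create index index123 on table(a,b,c);

-- Conditions on a, b, c -> only the a and b parts of the index are used (c, which comes after the range column b, is not)
select * from table where a='1' and b>'2' and c='3';

4. Do not compute on an indexed column, or the index will not be used

-- Example: create a composite index on columns a, b, c
create index index123 on table(a,b,c);

-- The index is not used, because substring is applied to the indexed column a
select * from table where substring(a,2,3) = '111';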

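5. Omitting single quotes around a string value prevents index use (MySQL has to convert the values of the string column implicitly in order to compare them with a number, which amounts to computing on the indexed column)

-- Example: create a composite index on columns a, b, c
create index index123 on table(a,b,c);

-- Note: a, b, c are of type varchar

-- Only the a part of the index is used
select * from table where a='1' and b=2;

-- The index is not used at all, because neither value is quoted
select * from table where a=1 and b=2;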

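6. Prefer covering indexes and avoid select * (so that the query reads only the index)

-- Example: create a composite index on columns a, b, c
create index index123 on table(a,b,c);

-- Uses the a part of the index, but a table lookup is needed because not every selected column is in the index
-- Extra in the author's example: Using index condition (the exact text varies by MySQL version)
select * from table where a='1';
select a,b,c,d from table where a='1';

-- Uses the a part of the index and needs no table lookup
-- Extra in the author's example: Using index; Using where
select a from table where a='1';
select a, b from table where a='1';
select a, b, c from table where a='1';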

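7. For conditions separated by or: if the column before or has an index but the column after or does not, the index is not used

-- Example: create a composite index on columns a, b, c
create index index123 on table(a,b,c);

-- The index is not used, because column d after or has no index
select * from table where a='1' or d='2';

-- Only the a part of the index is used
select * from table where a='1' and d='2';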

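8. Fuzzy matching (like %): a trailing wildcard still uses the index; a leading wildcard does not

-- Example: create a composite index on columns a, b, c
create index index123 on table(a,b,c);

-- Uses the index, because % is at the end
select * from table where a like '1%';

-- Does not use the index
select * from table where a like '%1';
select * from table where a like '%1%';

-- Workaround: a covering index (select only columns in the index instead of select *)
-- All of these read the index instead of the table, because the index covers them
-- (id is the primary key, which every secondary index already contains)
select a from table where a like '%1';
select a,b,c from table where a like '%1%';
select id,a,b,c from table where a like '%1%';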

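9. If MySQL estimates that a full table scan is faster than using the index, it will not use the index
- Reason: this depends on the data. If, say, 90% of the values in the indexed column are identical, the optimizer judges a full table scan to be cheaper than the index. This involves the concept of Cardinality; see Chapter 5: https://blog.csdn.net/qq_41157876/article/details/109187544
10. is NULL and is NOT NULL sometimes do not use the index
- This depends on the data. If, say, 90% of the values in the indexed column are null, a full table scan is cheaper than the index for an is NULL query
- Conversely, if 90% are not null, a full table scan is cheaper than the index for an is NOT NULL query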
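
To see the statistics behind these decisions, the Cardinality value of each index can be inspected directly. The sketch below assumes a hypothetical table cardinality_demo with an indexed column a; Cardinality is an estimate, so the numbers shown by MySQL may differ slightly from the exact distinct counts.

```sql
-- Hypothetical demo table; all names are illustrative only
CREATE TABLE cardinality_demo (
    id INT AUTO_INCREMENT PRIMARY KEY,
    a  VARCHAR(10),
    KEY idx_a (a)
) ENGINE=InnoDB;

-- Refresh the index statistics used by the optimizer
ANALYZE TABLE cardinality_demo;

-- The Cardinality column estimates the number of distinct values in each index
SHOW INDEX FROM cardinality_demo;

-- Rough selectivity of column a: near 1 means highly selective, near 0 means
-- many repeated values (the result is NULL while the table is empty)
SELECT COUNT(DISTINCT a) / COUNT(*) AS selectivity FROM cardinality_demo;
```
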
11. in uses the index; not in typically does not

-- Example: create a composite index on columns a, b, c
create index index123 on table(a,b,c);

-- Uses the index: in
select * from table where a in ('1','2','3');

-- Does not use the index: not in
select * from table where a not in ('1','2','3');

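11.3 SQL statement optimization

1. Bulk-inserting data
- 1) Recommended: import the data with the load data command
- 2) For InnoDB tables, the following measures speed up the import
  - Insert in primary key order
  - Turn off uniqueness checks (SET UNIQUE_CHECKS=0)
  - Commit transactions manually (SET AUTOCOMMIT=0)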
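
Put together, the measures above might look like the following sketch. The file path, the table name bulk_demo, and the field and line delimiters are placeholders, and the LOCAL variant assumes that local_infile is enabled on both client and server. The file is assumed to be sorted by primary key already.

```sql
-- Relax checks for the duration of the import (session scope)
SET unique_checks = 0;
SET autocommit = 0;

-- Load a CSV file that is already sorted by primary key
-- (path, table name, and delimiters are placeholders)
LOAD DATA LOCAL INFILE '/path/to/data.csv'
INTO TABLE bulk_demo
FIELDS TERMINATED BY ','
LINES TERMINATED BY '\n';

COMMIT;

-- Restore the defaults afterwards
SET unique_checks = 1;
SET autocommit = 1;
```

Turning uniqueness checks off is only safe when the input is known to contain no duplicate keys.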

2. Optimizing insert statements

(1) Merge multiple insert statements into one
(2) Commit the transaction manually instead of automatically
(3) Insert rows in order

-- (1)
-- Before
insert into table values(1,'a');
insert into table values(2,'b');
insert into table values(3,'c');
-- After
insert into table values(1,'a'),(2,'b'),(3,'c');

-- (2)
-- After
start transaction;
insert into table values(1,'a');
insert into table values(2,'b');
insert into table values(3,'c');
commit;

-- (3)
-- Before
insert into table values(2,'b');
insert into table values(1,'a');
insert into table values(3,'c');
-- After
insert into table values(1,'a');
insert into table values(2,'b');
insert into table values(3,'c');

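3. order by statement optimization

Extra shows one of two sorting methods: (1) Using filesort; (2) Using index, where rows are read in order by scanning a sorted index
Summary: the best way to order by is a covering index plus order by index column 1, index column 2, ...

-- idx_ab is an example index name
create index idx_ab on table(a,b);

-- filesort: the query is not covered by the index
select * from table order by a;
select * from table order by b;
select * from table order by a,b;

-- Using index: order by follows the index columns and the query is covered by the index -> the index is used
select a from table order by a;
select a,b from table order by a,b;
select a,b,id from table order by a,b;

-- filesort, exception 1: order by several columns with one ascending and one descending -> the index is not used
select a from table order by a, b desc;

-- filesort, exception 2: order by several columns in a different order from the index definition (leftmost prefix) -> the index is not used
select a from table order by b,a;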

4. group by statement optimization
- group by behaves like order by, with an extra grouping step after the sort.
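
Because grouping builds on sorting, two common adjustments follow: skip the sort when the order does not matter, and index the grouped column. The sketch below uses a hypothetical table group_demo. The implicit sort applies to MySQL 5.7 and earlier; MySQL 8.0 no longer sorts group by results implicitly, so order by null makes no difference there.

```sql
-- Hypothetical demo table; all names are illustrative only
CREATE TABLE group_demo (
    id     INT PRIMARY KEY,
    age    INT,
    salary INT
) ENGINE=InnoDB;

-- Without an index on age, grouping typically needs a temporary table
-- (and, in MySQL 5.7 and earlier, an implicit sort on age as well)
EXPLAIN SELECT age, COUNT(*) FROM group_demo GROUP BY age;

-- If the sorted order is not needed, order by null skips the implicit sort
EXPLAIN SELECT age, COUNT(*) FROM group_demo GROUP BY age ORDER BY NULL;

-- With an index on the grouped column, rows can be read in index order
CREATE INDEX idx_age ON group_demo (age);
EXPLAIN SELECT age, COUNT(*) FROM group_demo GROUP BY age;
```
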
5. Nested query optimization
- Prefer multi-table queries (JOIN) over subqueries
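
As an illustration of this rewrite, the sketch below expresses the same question ("which users have at least one role") first as a subquery and then as a join. The tables demo_user and demo_user_role are hypothetical. Newer MySQL versions often perform a similar transformation on their own, so it is worth comparing the two plans with explain before rewriting.

```sql
-- Hypothetical demo tables; all names are illustrative only
CREATE TABLE demo_user (
    id   INT PRIMARY KEY,
    name VARCHAR(50)
) ENGINE=InnoDB;

CREATE TABLE demo_user_role (
    user_id INT,
    role_id INT,
    KEY idx_user (user_id)
) ENGINE=InnoDB;

-- Subquery form
SELECT * FROM demo_user
WHERE id IN (SELECT user_id FROM demo_user_role);

-- Join form; DISTINCT keeps one row per user when a user has several roles
SELECT DISTINCT u.*
FROM demo_user u
JOIN demo_user_role ur ON u.id = ur.user_id;
```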

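6. or statement optimization

Every condition joined by or must be able to use an index, and that index cannot be a composite index.
For this reason, consider replacing or with union.

-- Before
select * from table where id=1 or id=2;

-- After
select * from table where id=1 UNION select * from table where id=2;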

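7. limit optimization

In a paginated query such as limit 20000000,10, MySQL has to sort the first 20,000,010 rows, discard the first 20,000,000, and return only the last 10, which is very expensive.

-- Fetch page 1 (10 rows) -> very fast
select * from table limit 0,10;

-- Fetch 10 rows starting at offset 1,000,000 -> very slow
select * from table limit 1000000,10;

Approach 1:

-- Do the pagination on the index first, then use the ids found to fetch the other columns
select * from table t, (select id from table order by id limit 2000000,10) temp where t.id=temp.id;

Approach 2:

-- When the primary key of table auto-increments (consecutive values, with no gaps allowed),
select * from table where id>2000000 order by id limit 10;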
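
A common variant of approach 2 that does not depend on gap-free ids is to remember the last id of the previous page and continue from there. The sketch below uses a hypothetical table page_demo, and the value 10 stands for whatever id the previous page ended with. The trade-off is that pages can only be walked in sequence; jumping straight to an arbitrary page number is not possible.

```sql
-- Hypothetical demo table; all names are illustrative only
CREATE TABLE page_demo (
    id    INT AUTO_INCREMENT PRIMARY KEY,
    title VARCHAR(100)
) ENGINE=InnoDB;

-- First page
SELECT * FROM page_demo ORDER BY id LIMIT 10;

-- Next page: continue after the largest id of the previous page
-- (10 is a placeholder for that id)
SELECT * FROM page_demo WHERE id > 10 ORDER BY id LIMIT 10;
```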

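8. SQL hints (use, ignore, force index)

-- Create a composite index on a, b, c
-- (idx_abc is an example name; a single-column index idx_a on a is assumed to exist as well)
create index idx_abc on table(a,b,c);

-- 1. use index: suggest an index to the database
select * from table where a='1';
-- For the statement above, the database may use the index on a or the composite index on a, b, c.
select * from table use index(idx_a) where a='1';

-- 2. ignore index: the opposite of 1, e.g.
select * from table ignore index(idx_abc) where a='1';

-- 3. force index: force the database to use the index, e.g.
select * from table force index(idx_abc) where a='1';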
