Lessons from a SQL rewrite: the cost of EXISTS correlated subqueries, and a clever choice by the MySQL optimizer

Copyright notice: this article is original; please credit the source when reposting. https://blog.csdn.net/weixin_39004901/article/details/89137852

I recently rewrote a SQL statement and got a first-hand feel for the downside of EXISTS correlated subqueries, and for how the optimizer smartly used the statistics to pick the best execution plan. The original SQL:

SELECT DISTINCT bp_code AS userCode
FROM tab
WHERE (
        (primary_type = 'person' AND detail_type = 'inventory')
     OR (primary_type = 'org' AND detail_type = 'AS')
      )
  AND (end_date < NOW() OR del_flag = '1')
ORDER BY create_time DESC
LIMIT 327000, 1000;
1000 rows in set (1.66 sec)
 
+----+-------------+-------------+------+----------------------------------------------------------+------+---------+------+---------+-----------------------------+
| id | select_type | table       | type | possible_keys                                            | key  | key_len | ref  | rows    | Extra                       |
+----+-------------+-------------+------+----------------------------------------------------------+------+---------+------+---------+-----------------------------+
|  1 | SIMPLE      | tab         | ALL  | PRIMARY,idx_tab_01,idx_changed_time_bc,idx_ccode         | NULL | NULL    | NULL | 1318443 | Using where; Using filesort |
+----+-------------+-------------+------+----------------------------------------------------------+------+---------+------+---------+-----------------------------+
1 row in set (0.01 sec)  

Let's walk through the SQL first. MySQL extracts the result set matching the WHERE condition, ordered by create_time descending. It then builds an internal in-memory temporary table with a single column carrying a unique index, which implements DISTINCT: the bp_code values from the result set are inserted one by one; a duplicate bp_code is rejected by the unique index and skipped; this continues until 1000 unique bp_code values past the 327000th have been produced.
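The dedup-and-page procedure described above can be sketched in Python (a simulation of the described behavior, not of MySQL internals; all names are illustrative):

```python
def distinct_limit(codes, offset, count):
    """Simulate DISTINCT + LIMIT offset,count over an already-sorted stream.

    `codes` stands in for bp_code values read in create_time DESC order;
    the set plays the role of the unique index on the in-memory temp table.
    """
    seen = set()
    unique = []
    for code in codes:
        if code in seen:          # duplicate bp_code: rejected by the unique index
            continue
        seen.add(code)
        unique.append(code)
        if len(unique) == offset + count:  # enough rows to satisfy the page
            break
    return unique[offset:offset + count]

# A page of 2 starting at offset 1, over a stream with one duplicate:
print(distinct_limit(["u3", "u2", "u3", "u1", "u0"], 1, 2))  # → ['u2', 'u1']
```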
Yet the execution plan shows no temporary-table step, only "Using where; Using filesort". Why is that step missing?
The temporary table exists only to eliminate duplicate bp_code values. On inspection, it turns out that bp_code is the primary key of tab, so it is unique by construction: every bp_code fetched is already distinct, and the temporary-table filtering step can be skipped.

From the tuning angle, the WHERE clause is too messy to design a good index for; further optimization would require the developers to help split the SQL.

The whole execution is: full table scan, sort the result set by create_time, then page out the requested slice of bp_code values.

So, can the full scan be avoided? I tried the following rewrite:

SELECT a.bp_code
FROM tab a
WHERE EXISTS (
        SELECT 1
        FROM tab b
        WHERE a.bp_code = b.bp_code
          AND (
                (b.primary_type = 'person' AND b.detail_type = 'inventory')
             OR (b.primary_type = 'org' AND b.detail_type = 'AS')
              )
          AND (b.end_date < NOW() OR b.del_flag = '1')
      )
ORDER BY a.create_time DESC
LIMIT 327000, 1000;

together with a new index idx_create_time on create_time. The execution plan:

+----+--------------------+-------+--------+----------------------------+-----------------+---------+----------------------+--------+--------------------------+
| id | select_type        | table | type   | possible_keys              | key             | key_len | ref                  | rows   | Extra                    |
+----+--------------------+-------+--------+----------------------------+-----------------+---------+----------------------+--------+--------------------------+
|  1 | PRIMARY            | a     | index  | NULL                       | idx_create_time | 6       | NULL                 | 328000 | Using where; Using index |
|  2 | DEPENDENT SUBQUERY | b     | eq_ref | PRIMARY,idx_tab_01         | PRIMARY         | 152     | a.bp_code            |      1 | Using where              |
+----+--------------------+-------+--------+----------------------------+-----------------+---------+----------------------+--------+--------------------------+
2 rows in set (0.00 sec)
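As a sanity check that the rewrite is semantically equivalent, here is a toy reproduction with Python's sqlite3 (the schema and rows are made up, and SQLite's planner differs from MySQL's, so only the result equivalence carries over, not plans or timings):

```python
import sqlite3

# Toy stand-in for tab: bp_code is the primary key, as in the article.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE tab (
    bp_code      TEXT PRIMARY KEY,
    primary_type TEXT, detail_type TEXT,
    end_date     TEXT, del_flag TEXT,
    create_time  TEXT
);
INSERT INTO tab VALUES
 ('u1','person','inventory','2000-01-01','0','2019-01-03'),
 ('u2','org','AS','2999-01-01','1','2019-01-02'),
 ('u3','org','other','2000-01-01','0','2019-01-01');
""")

original = conn.execute("""
    SELECT DISTINCT bp_code FROM tab
    WHERE ((primary_type='person' AND detail_type='inventory')
        OR (primary_type='org' AND detail_type='AS'))
      AND (end_date < datetime('now') OR del_flag='1')
    ORDER BY create_time DESC LIMIT 0, 1000
""").fetchall()

rewritten = conn.execute("""
    SELECT a.bp_code FROM tab a
    WHERE EXISTS (SELECT 1 FROM tab b
                  WHERE a.bp_code = b.bp_code
                    AND ((b.primary_type='person' AND b.detail_type='inventory')
                      OR (b.primary_type='org' AND b.detail_type='AS'))
                    AND (b.end_date < datetime('now') OR b.del_flag='1'))
    ORDER BY a.create_time DESC LIMIT 0, 1000
""").fetchall()

print(original == rewritten, [r[0] for r in original])
```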

Since the result set must be ordered by create_time, and I created an index on it, the SQL now executes as follows:
1. Start at the rightmost end of idx_create_time, i.e. at the largest create_time, and read bp_code values leftward in order;
2. Feed each bp_code into the subquery to check whether it satisfies the WHERE condition; since the correlation is on the primary key, the subquery uses the primary key index to locate exactly one row and then evaluates the condition;
3. If the correlated subquery holds for a bp_code, it is a target row; otherwise it is skipped. This repeats until enough bp_code values have been collected to satisfy LIMIT 327000,1000.

The benefit of the rewrite is that no full table scan is needed: the scan can stop as soon as LIMIT 327000,1000 is satisfied.

At first glance the rewrite looks better. But the actual result was: 1000 rows in set (4.43 sec), slower than the original SQL.

I used profiling to look at the rewritten query's execution:

+--------------------+----------+
| Status             | Duration |
+--------------------+----------+
| executing          | 0.000009 |
| Sending data       | 0.000012 |
| executing          | 0.000009 |
| Sending data       | 0.000011 |
| executing          | 0.000009 |
| Sending data       | 0.000012 |
  ... (the executing / Sending data pair repeats dozens of times, elided) ...
| executing          | 0.000009 |
| Sending data       | 0.000019 |
| end                | 0.000012 |
| query end          | 0.000011 |
| closing tables     | 0.000015 |
| freeing items      | 0.000039 |
| logging slow query | 0.000041 |
| cleaning up        | 0.000016 |
+--------------------+----------+
100 rows in set, 1 warning (0.00 sec)   

The profile shows why the rewrite is slow: for every bp_code read, the correlated subquery must go back to the table through the primary key index. And because the LIMIT offset is so large, given the data distribution, finding the bp_code values that satisfy both the WHERE condition and LIMIT 327000,1000 means traversing almost all of idx_create_time. That is roughly a full index scan of idx_create_time followed by a full scan of the primary key index. Worse, the primary-key side is not a sequential walk along linked leaf pages: each bp_code taken from idx_create_time is matched separately, i.e. every probe is a full root-to-leaf descent of the primary key index, which is far slower.
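A back-of-the-envelope model makes the gap concrete. All constants below (B+tree fan-out, rows per leaf page) are made-up assumptions for illustration, not InnoDB measurements:

```python
import math

def btree_levels(rows, fanout=300):
    """Approximate B+tree height for `rows` entries at a given fan-out
    (the fan-out is an assumption; real values depend on key and page size)."""
    return max(1, math.ceil(math.log(rows, fanout)))

TABLE_ROWS = 1_318_443            # rows estimate from the original EXPLAIN
PROBES = 327_000 + 1_000          # index entries scanned to satisfy the page

levels = btree_levels(TABLE_ROWS)       # pages touched per root-to-leaf descent
probe_page_reads = PROBES * levels      # one full descent per bp_code probe
seq_leaf_pages = TABLE_ROWS // 100      # assume ~100 rows per leaf page

print(levels, probe_page_reads, seq_leaf_pages)
```

Even with these rough numbers, repeated descents touch vastly more pages than a single sequential pass over the leaves, which lines up with the 4.43 s result.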

The analysis above explains why the rewrite is slow. Still, the rewrite avoids the full table scan and the sort. If the LIMIT offset were small, few bp_code values would be needed, and the scan of idx_create_time would stop early; so with a small offset, shouldn't the rewrite beat the original SQL?

Let's compare the execution times:
1. The original SQL:

SELECT DISTINCT bp_code AS userCode
FROM tab
WHERE (
        (primary_type = 'person' AND detail_type = 'inventory')
     OR (primary_type = 'org' AND detail_type = 'AS')
      )
  AND (end_date < NOW() OR del_flag = '1')
ORDER BY create_time DESC
LIMIT 0, 1000;
1000 rows in set (0.02 sec)

2. The rewritten SQL:

SELECT a.bp_code
FROM tab a
WHERE EXISTS (
        SELECT 1
        FROM tab b
        WHERE a.bp_code = b.bp_code
          AND (
                (b.primary_type = 'person' AND b.detail_type = 'inventory')
             OR (b.primary_type = 'org' AND b.detail_type = 'AS')
              )
          AND (b.end_date < NOW() OR b.del_flag = '1')
      )
ORDER BY a.create_time DESC
LIMIT 0, 1000;
1000 rows in set (0.04 sec)

The rewrite did get faster, as expected, but it is still slower than the original SQL. And, surprisingly, wasn't the original supposed to do a full table scan? How did it become this fast?
Let's look at the execution plans:
1. The original SQL:

+----+-------------+-------------+-------+--------------------------------------------------------------------------+-----------------+---------+------+------+-------------+
| id | select_type | table       | type  | possible_keys                                                            | key             | key_len | ref  | rows | Extra       |
+----+-------------+-------------+-------+--------------------------------------------------------------------------+-----------------+---------+------+------+-------------+
|  1 | SIMPLE      | tab         | index | PRIMARY,idx_tab_01,idx_changed_time_bc,idx_ccode,idx_create_time         | idx_create_time | 6       | NULL | 1853 | Using where |
+----+-------------+-------------+-------+--------------------------------------------------------------------------+-----------------+---------+------+------+-------------+
1 row in set (0.01 sec)

2. The rewritten SQL:

+----+--------------------+-------+--------+----------------------------+-----------------+---------+----------------------+------+--------------------------+
| id | select_type        | table | type   | possible_keys              | key             | key_len | ref                  | rows | Extra                    |
+----+--------------------+-------+--------+----------------------------+-----------------+---------+----------------------+------+--------------------------+
|  1 | PRIMARY            | a     | index  | NULL                       | idx_create_time | 6       | NULL                 | 1000 | Using where; Using index |
|  2 | DEPENDENT SUBQUERY | b     | eq_ref | PRIMARY,idx_tab_01         | PRIMARY         | 152     | a.bp_code            |    1 | Using where              |
+----+--------------------+-------+--------+----------------------------+-----------------+---------+----------------------+------+--------------------------+
2 rows in set (0.00 sec)

It turns out the original SQL now uses the new index idx_create_time instead of a full table scan. In this case the execution goes like this:
read bp_code values from the rightmost end of idx_create_time leftward in order, go back through the primary key to evaluate the WHERE condition, and stop once LIMIT 0,1000 is satisfied. The MySQL optimizer is smart enough to see that with a low offset, only a small fraction of the million-plus rows will likely need to be visited, so it abandons the full scan in favor of idx_create_time, which avoids the sort and reads only what is needed.
But as the LIMIT offset grows, the cost of the back-to-table lookups grows too, eventually exceeding a plain full scan. That is why, even with idx_create_time available, the original SQL with LIMIT 327000,1000 still went with the full table scan. This does not mean the full scan is bad; rather, the optimizer picked the best available plan, and that plan happens to be the full scan.
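The optimizer's offset-dependent choice can be caricatured as a crossover between two cost estimates. Every constant here is a made-up assumption for illustration; real MySQL costing is far more detailed:

```python
# Toy crossover model for the plan choice between the ordered-index plan
# (index read + random primary-key lookup per row) and the full scan + filesort.
def choose_plan(offset, limit, table_rows,
                probe_cost=6.0,       # assumed cost of one ordered-index probe + PK lookup
                scan_cost=1.0,        # assumed cost of one sequential row read
                sort_overhead=0.2):   # assumed filesort overhead on top of the scan
    index_plan = (offset + limit) * probe_cost
    full_scan = table_rows * scan_cost * (1 + sort_overhead)
    return "index" if index_plan < full_scan else "full_scan"

print(choose_plan(0, 1000, 1_318_443))       # small offset → index wins
print(choose_plan(327000, 1000, 1_318_443))  # large offset → full scan wins
```

With these toy numbers the crossover lands between the two offsets from the article, mirroring the two EXPLAIN outputs above.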
So the index idx_create_time only speeds up the early pages of the pagination.
For more stable execution, the developers probably need to adjust the WHERE condition so that a proper index can be designed; alternatively, the query could be optimized from the pagination-bookmark angle, though I haven't found a really good way along that line yet and need to keep digging.
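One common direction for the "pagination bookmark" idea is keyset (seek) pagination: instead of counting past an OFFSET, remember the last (create_time, bp_code) of the previous page and seek past it. A minimal sqlite3 sketch of the pattern (toy schema; combining this with the original DISTINCT and OR conditions would still take extra work, which matches the caveat above):

```python
import sqlite3

# Toy stand-in table: bp_code is the primary key, create_time is not unique.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE tab (bp_code TEXT PRIMARY KEY, create_time TEXT);
INSERT INTO tab VALUES ('u1','2019-01-03'), ('u2','2019-01-02'),
                       ('u3','2019-01-02'), ('u4','2019-01-01');
""")

# First page: no cursor yet.
page1 = conn.execute("""
    SELECT bp_code, create_time FROM tab
    ORDER BY create_time DESC, bp_code DESC
    LIMIT 2
""").fetchall()

# Subsequent pages: seek past the last row of the previous page instead of
# re-counting OFFSET rows; bp_code breaks ties on equal create_time values.
last_code, last_time = page1[-1]
page2 = conn.execute("""
    SELECT bp_code, create_time FROM tab
    WHERE create_time < :t OR (create_time = :t AND bp_code < :c)
    ORDER BY create_time DESC, bp_code DESC
    LIMIT 2
""", {"t": last_time, "c": last_code}).fetchall()

print([r[0] for r in page1], [r[0] for r in page2])  # → ['u1', 'u3'] ['u2', 'u4']
```

With a composite index on (create_time, bp_code), each page then starts with an index seek rather than scanning and discarding OFFSET rows, so the cost no longer grows with the page number.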

One more thing: with a low LIMIT offset, the original and rewritten SQL seem to follow the same execution logic, so why is the original slightly faster? Because the rewrite performs one extra WHERE evaluation per row, which can be seen in the execution plans.
