Lessons from a SQL rewrite: the cost of EXISTS correlated subqueries, and a clever choice by the MySQL optimizer

Copyright notice: this article is original; please credit the source when reposting. https://blog.csdn.net/weixin_39004901/article/details/89137852

I recently rewrote a SQL statement and got a first-hand feel for the downside of EXISTS correlated subqueries, and for how the optimizer smartly used the statistics to pick the best execution plan. The original SQL:

SELECT DISTINCT bp_code AS userCode
FROM tab
WHERE (
        (primary_type = 'person' AND detail_type = 'inventory')
     OR (primary_type = 'org' AND detail_type = 'AS')
      )
  AND (end_date < NOW() OR del_flag = '1')
ORDER BY create_time DESC
LIMIT 327000, 1000;
1000 rows in set (1.66 sec)
 
+----+-------------+-------------+------+----------------------------------------------------------+------+---------+------+---------+-----------------------------+
| id | select_type | table       | type | possible_keys                                            | key  | key_len | ref  | rows    | Extra                       |
+----+-------------+-------------+------+----------------------------------------------------------+------+---------+------+---------+-----------------------------+
|  1 | SIMPLE      | tab         | ALL  | PRIMARY,idx_tab_01,idx_changed_time_bc,idx_ccode         | NULL | NULL    | NULL | 1318443 | Using where; Using filesort |
+----+-------------+-------------+------+----------------------------------------------------------+------+---------+------+---------+-----------------------------+
1 row in set (0.01 sec)  

Let's walk through the SQL first. MySQL extracts the result set matching the WHERE condition, ordered by create_time descending. It then builds an internal in-memory temporary table with a single column carrying a unique index, which implements DISTINCT: the bp_code values from the result set are inserted one by one; a duplicate bp_code is rejected by the unique index and skipped; this continues until 1000 unique bp_code values past the 327000th have been produced.
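The dedup-and-page procedure described above can be sketched in Python (a simulation of the described behavior, not of MySQL internals; all names are illustrative):

```python
def distinct_limit(codes, offset, count):
    """Simulate DISTINCT + LIMIT offset,count over an already-sorted stream.

    `codes` stands in for bp_code values read in create_time DESC order;
    the set plays the role of the unique index on the in-memory temp table.
    """
    seen = set()
    unique = []
    for code in codes:
        if code in seen:          # duplicate bp_code: rejected by the unique index
            continue
        seen.add(code)
        unique.append(code)
        if len(unique) == offset + count:  # enough rows to satisfy the page
            break
    return unique[offset:offset + count]

# A page of 2 starting at offset 1, over a stream with one duplicate:
print(distinct_limit(["u3", "u2", "u3", "u1", "u0"], 1, 2))  # → ['u2', 'u1']
```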
Yet the execution plan shows no temporary-table step, only "Using where; Using filesort". Why is that step missing?
The temporary table exists only to eliminate duplicate bp_code values. On inspection, it turns out that bp_code is the primary key of tab, so it is unique by construction: every bp_code fetched is already distinct, and the temporary-table filtering step can be skipped.

From the tuning angle, the WHERE clause is too messy to design a good index for; further optimization would require the developers to help split the SQL.

The whole execution is: full table scan, sort the result set by create_time, then page out the requested slice of bp_code values.

So, can the full scan be avoided? I tried the following rewrite:

SELECT a.bp_code
FROM tab a
WHERE EXISTS (
        SELECT 1
        FROM tab b
        WHERE a.bp_code = b.bp_code
          AND (
                (b.primary_type = 'person' AND b.detail_type = 'inventory')
             OR (b.primary_type = 'org' AND b.detail_type = 'AS')
              )
          AND (b.end_date < NOW() OR b.del_flag = '1')
      )
ORDER BY a.create_time DESC
LIMIT 327000, 1000;

together with a new index idx_create_time on create_time. The execution plan:

+----+--------------------+-------+--------+----------------------------+-----------------+---------+----------------------+--------+--------------------------+
| id | select_type        | table | type   | possible_keys              | key             | key_len | ref                  | rows   | Extra                    |
+----+--------------------+-------+--------+----------------------------+-----------------+---------+----------------------+--------+--------------------------+
|  1 | PRIMARY            | a     | index  | NULL                       | idx_create_time | 6       | NULL                 | 328000 | Using where; Using index |
|  2 | DEPENDENT SUBQUERY | b     | eq_ref | PRIMARY,idx_tab_01         | PRIMARY         | 152     | a.bp_code            |      1 | Using where              |
+----+--------------------+-------+--------+----------------------------+-----------------+---------+----------------------+--------+--------------------------+
2 rows in set (0.00 sec)
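As a sanity check that the rewrite is semantically equivalent, here is a toy reproduction with Python's sqlite3 (the schema and rows are made up, and SQLite's planner differs from MySQL's, so only the result equivalence carries over, not plans or timings):

```python
import sqlite3

# Toy stand-in for tab: bp_code is the primary key, as in the article.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE tab (
    bp_code      TEXT PRIMARY KEY,
    primary_type TEXT, detail_type TEXT,
    end_date     TEXT, del_flag TEXT,
    create_time  TEXT
);
INSERT INTO tab VALUES
 ('u1','person','inventory','2000-01-01','0','2019-01-03'),
 ('u2','org','AS','2999-01-01','1','2019-01-02'),
 ('u3','org','other','2000-01-01','0','2019-01-01');
""")

original = conn.execute("""
    SELECT DISTINCT bp_code FROM tab
    WHERE ((primary_type='person' AND detail_type='inventory')
        OR (primary_type='org' AND detail_type='AS'))
      AND (end_date < datetime('now') OR del_flag='1')
    ORDER BY create_time DESC LIMIT 0, 1000
""").fetchall()

rewritten = conn.execute("""
    SELECT a.bp_code FROM tab a
    WHERE EXISTS (SELECT 1 FROM tab b
                  WHERE a.bp_code = b.bp_code
                    AND ((b.primary_type='person' AND b.detail_type='inventory')
                      OR (b.primary_type='org' AND b.detail_type='AS'))
                    AND (b.end_date < datetime('now') OR b.del_flag='1'))
    ORDER BY a.create_time DESC LIMIT 0, 1000
""").fetchall()

print(original == rewritten, [r[0] for r in original])
```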

Since the result set must be ordered by create_time, and I created an index on it, the SQL now executes as follows:
1. Start at the rightmost end of idx_create_time, i.e. at the largest create_time, and read bp_code values leftward in order;
2. Feed each bp_code into the subquery to check whether it satisfies the WHERE condition; since the correlation is on the primary key, the subquery uses the primary key index to locate exactly one row and then evaluates the condition;
3. If the correlated subquery holds for a bp_code, it is a target row; otherwise it is skipped. This repeats until enough bp_code values have been collected to satisfy LIMIT 327000,1000.

The benefit of the rewrite is that no full table scan is needed: the scan can stop as soon as LIMIT 327000,1000 is satisfied.

At first glance the rewrite looks better. But the actual result was: 1000 rows in set (4.43 sec), slower than the original SQL.

I used profiling to look at the rewritten query's execution:

+--------------------+----------+
| Status             | Duration |
+--------------------+----------+
| executing          | 0.000009 |
| Sending data       | 0.000012 |
| executing          | 0.000009 |
| Sending data       | 0.000011 |
| executing          | 0.000009 |
| Sending data       | 0.000012 |
  ... (the executing / Sending data pair repeats dozens of times, elided) ...
| executing          | 0.000009 |
| Sending data       | 0.000019 |
| end                | 0.000012 |
| query end          | 0.000011 |
| closing tables     | 0.000015 |
| freeing items      | 0.000039 |
| logging slow query | 0.000041 |
| cleaning up        | 0.000016 |
+--------------------+----------+
100 rows in set, 1 warning (0.00 sec)   

The profile shows why the rewrite is slow: for every bp_code read, the correlated subquery must go back to the table through the primary key index. And because the LIMIT offset is so large, given the data distribution, finding the bp_code values that satisfy both the WHERE condition and LIMIT 327000,1000 means traversing almost all of idx_create_time. That is roughly a full index scan of idx_create_time followed by a full scan of the primary key index. Worse, the primary-key side is not a sequential walk along linked leaf pages: each bp_code taken from idx_create_time is matched separately, i.e. every probe is a full root-to-leaf descent of the primary key index, which is far slower.
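A back-of-the-envelope model makes the gap concrete. All constants below (B+tree fan-out, rows per leaf page) are made-up assumptions for illustration, not InnoDB measurements:

```python
import math

def btree_levels(rows, fanout=300):
    """Approximate B+tree height for `rows` entries at a given fan-out
    (the fan-out is an assumption; real values depend on key and page size)."""
    return max(1, math.ceil(math.log(rows, fanout)))

TABLE_ROWS = 1_318_443            # rows estimate from the original EXPLAIN
PROBES = 327_000 + 1_000          # index entries scanned to satisfy the page

levels = btree_levels(TABLE_ROWS)       # pages touched per root-to-leaf descent
probe_page_reads = PROBES * levels      # one full descent per bp_code probe
seq_leaf_pages = TABLE_ROWS // 100      # assume ~100 rows per leaf page

print(levels, probe_page_reads, seq_leaf_pages)
```

Even with these rough numbers, repeated descents touch vastly more pages than a single sequential pass over the leaves, which lines up with the 4.43 s result.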

The analysis above explains why the rewrite is slow. Still, the rewrite avoids the full table scan and the sort. If the LIMIT offset were small, few bp_code values would be needed, and the scan of idx_create_time would stop early; so with a small offset, shouldn't the rewrite beat the original SQL?

Let's compare the execution times:
1. The original SQL:

SELECT DISTINCT bp_code AS userCode
FROM tab
WHERE (
        (primary_type = 'person' AND detail_type = 'inventory')
     OR (primary_type = 'org' AND detail_type = 'AS')
      )
  AND (end_date < NOW() OR del_flag = '1')
ORDER BY create_time DESC
LIMIT 0, 1000;
1000 rows in set (0.02 sec)

2. The rewritten SQL:

SELECT a.bp_code
FROM tab a
WHERE EXISTS (
        SELECT 1
        FROM tab b
        WHERE a.bp_code = b.bp_code
          AND (
                (b.primary_type = 'person' AND b.detail_type = 'inventory')
             OR (b.primary_type = 'org' AND b.detail_type = 'AS')
              )
          AND (b.end_date < NOW() OR b.del_flag = '1')
      )
ORDER BY a.create_time DESC
LIMIT 0, 1000;
1000 rows in set (0.04 sec)

The rewrite did get faster, as expected, but it is still slower than the original SQL. And, surprisingly, wasn't the original supposed to do a full table scan? How did it become this fast?
Let's look at the execution plans:
1. The original SQL:

+----+-------------+-------------+-------+--------------------------------------------------------------------------+-----------------+---------+------+------+-------------+
| id | select_type | table       | type  | possible_keys                                                            | key             | key_len | ref  | rows | Extra       |
+----+-------------+-------------+-------+--------------------------------------------------------------------------+-----------------+---------+------+------+-------------+
|  1 | SIMPLE      | tab         | index | PRIMARY,idx_tab_01,idx_changed_time_bc,idx_ccode,idx_create_time         | idx_create_time | 6       | NULL | 1853 | Using where |
+----+-------------+-------------+-------+--------------------------------------------------------------------------+-----------------+---------+------+------+-------------+
1 row in set (0.01 sec)

2. The rewritten SQL:

+----+--------------------+-------+--------+----------------------------+-----------------+---------+----------------------+------+--------------------------+
| id | select_type        | table | type   | possible_keys              | key             | key_len | ref                  | rows | Extra                    |
+----+--------------------+-------+--------+----------------------------+-----------------+---------+----------------------+------+--------------------------+
|  1 | PRIMARY            | a     | index  | NULL                       | idx_create_time | 6       | NULL                 | 1000 | Using where; Using index |
|  2 | DEPENDENT SUBQUERY | b     | eq_ref | PRIMARY,idx_tab_01         | PRIMARY         | 152     | a.bp_code            |    1 | Using where              |
+----+--------------------+-------+--------+----------------------------+-----------------+---------+----------------------+------+--------------------------+
2 rows in set (0.00 sec)

It turns out the original SQL now uses the new index idx_create_time instead of a full table scan. In this case the execution goes like this:
read bp_code values from the rightmost end of idx_create_time leftward in order, go back through the primary key to evaluate the WHERE condition, and stop once LIMIT 0,1000 is satisfied. The MySQL optimizer is smart enough to see that with a low offset, only a small fraction of the million-plus rows will likely need to be visited, so it abandons the full scan in favor of idx_create_time, which avoids the sort and reads only what is needed.
But as the LIMIT offset grows, the cost of the back-to-table lookups grows too, eventually exceeding a plain full scan. That is why, even with idx_create_time available, the original SQL with LIMIT 327000,1000 still went with the full table scan. This does not mean the full scan is bad; rather, the optimizer picked the best available plan, and that plan happens to be the full scan.
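The optimizer's offset-dependent choice can be caricatured as a crossover between two cost estimates. Every constant here is a made-up assumption for illustration; real MySQL costing is far more detailed:

```python
# Toy crossover model for the plan choice between the ordered-index plan
# (index read + random primary-key lookup per row) and the full scan + filesort.
def choose_plan(offset, limit, table_rows,
                probe_cost=6.0,       # assumed cost of one ordered-index probe + PK lookup
                scan_cost=1.0,        # assumed cost of one sequential row read
                sort_overhead=0.2):   # assumed filesort overhead on top of the scan
    index_plan = (offset + limit) * probe_cost
    full_scan = table_rows * scan_cost * (1 + sort_overhead)
    return "index" if index_plan < full_scan else "full_scan"

print(choose_plan(0, 1000, 1_318_443))       # small offset → index wins
print(choose_plan(327000, 1000, 1_318_443))  # large offset → full scan wins
```

With these toy numbers the crossover lands between the two offsets from the article, mirroring the two EXPLAIN outputs above.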
So the index idx_create_time only speeds up the early pages of the pagination.
For more stable execution, the developers probably need to adjust the WHERE condition so that a proper index can be designed; alternatively, the query could be optimized from the pagination-bookmark angle, though I haven't found a really good way along that line yet and need to keep digging.
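One common direction for the "pagination bookmark" idea is keyset (seek) pagination: instead of counting past an OFFSET, remember the last (create_time, bp_code) of the previous page and seek past it. A minimal sqlite3 sketch of the pattern (toy schema; combining this with the original DISTINCT and OR conditions would still take extra work, which matches the caveat above):

```python
import sqlite3

# Toy stand-in table: bp_code is the primary key, create_time is not unique.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE tab (bp_code TEXT PRIMARY KEY, create_time TEXT);
INSERT INTO tab VALUES ('u1','2019-01-03'), ('u2','2019-01-02'),
                       ('u3','2019-01-02'), ('u4','2019-01-01');
""")

# First page: no cursor yet.
page1 = conn.execute("""
    SELECT bp_code, create_time FROM tab
    ORDER BY create_time DESC, bp_code DESC
    LIMIT 2
""").fetchall()

# Subsequent pages: seek past the last row of the previous page instead of
# re-counting OFFSET rows; bp_code breaks ties on equal create_time values.
last_code, last_time = page1[-1]
page2 = conn.execute("""
    SELECT bp_code, create_time FROM tab
    WHERE create_time < :t OR (create_time = :t AND bp_code < :c)
    ORDER BY create_time DESC, bp_code DESC
    LIMIT 2
""", {"t": last_time, "c": last_code}).fetchall()

print([r[0] for r in page1], [r[0] for r in page2])  # → ['u1', 'u3'] ['u2', 'u4']
```

With a composite index on (create_time, bp_code), each page then starts with an index seek rather than scanning and discarding OFFSET rows, so the cost no longer grows with the page number.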

One more thing: with a low LIMIT offset, the original and rewritten SQL seem to follow the same execution logic, so why is the original slightly faster? Because the rewrite performs one extra WHERE evaluation per row, which can be seen in the execution plans.
