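Predicate pushing is a term many people have probably heard but only half understand: it feels familiar, yet it is hard to explain when asked.
Let me try to explain it.
First, I divide predicate pushing into two kinds:
1. Constant-value predicate pushing
2. Join-key predicate pushing
Let's go through them one at a time.
What is constant-value predicate pushing?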
SQL> create view v_predicate_push as
select e.* , d.DNAME,d.LOC
from emp e inner join dept d
on e.DEPTNO = d.DEPTNO;
View created.
SQL> explain plan for
select * from v_predicate_push where EMPNO=7499;
Explained.
SQL> set linesize 100
SQL> SELECT * FROM TABLE(dbms_xplan.display);
PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------------------------
Plan hash value: 2385808155
----------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
----------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 58 | 2 (0)| 00:00:01 |
| 1 | NESTED LOOPS | | 1 | 58 | 2 (0)| 00:00:01 |
| 2 | TABLE ACCESS BY INDEX ROWID| EMP | 1 | 38 | 1 (0)| 00:00:01 |
|* 3 | INDEX UNIQUE SCAN | PK_EMP | 1 | | 0 (0)| 00:00:01 |
| 4 | TABLE ACCESS BY INDEX ROWID| DEPT | 4 | 80 | 1 (0)| 00:00:01 |
|* 5 | INDEX UNIQUE SCAN | PK_DEPT | 1 | | 0 (0)| 00:00:01 |
----------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
3 - access("E"."EMPNO"=7499)
5 - access("E"."DEPTNO"="D"."DEPTNO")
18 rows selected.
SQL> create view v_predicate_push_1 as
select e.* , d.DNAME,d.LOC
from emp e inner join dept d
on e.DEPTNO = d.DEPTNO
union all
select e.* , d.DNAME,d.LOC
from emp e inner join dept d
on e.DEPTNO = d.DEPTNO;
View created.
SQL> explain plan for
select * from v_predicate_push_1 where EMPNO=7499;
Explained.
SQL> set linesize 400
SQL> SELECT * FROM TABLE(dbms_xplan.display);
PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Plan hash value: 3403934086
-----------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-----------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 2 | 208 | 4 (0)| 00:00:01 |
| 1 | VIEW | V_PREDICATE_PUSH_1 | 2 | 208 | 4 (0)| 00:00:01 |
| 2 | UNION-ALL | | | | | |
| 3 | NESTED LOOPS | | 1 | 58 | 2 (0)| 00:00:01 |
| 4 | TABLE ACCESS BY INDEX ROWID| EMP | 1 | 38 | 1 (0)| 00:00:01 |
|* 5 | INDEX UNIQUE SCAN | PK_EMP | 1 | | 0 (0)| 00:00:01 |
| 6 | TABLE ACCESS BY INDEX ROWID| DEPT | 4 | 80 | 1 (0)| 00:00:01 |
|* 7 | INDEX UNIQUE SCAN | PK_DEPT | 1 | | 0 (0)| 00:00:01 |
| 8 | NESTED LOOPS | | 1 | 58 | 2 (0)| 00:00:01 |
| 9 | TABLE ACCESS BY INDEX ROWID| EMP | 1 | 38 | 1 (0)| 00:00:01 |
|* 10 | INDEX UNIQUE SCAN | PK_EMP | 1 | | 0 (0)| 00:00:01 |
| 11 | TABLE ACCESS BY INDEX ROWID| DEPT | 4 | 80 | 1 (0)| 00:00:01 |
|* 12 | INDEX UNIQUE SCAN | PK_DEPT | 1 | | 0 (0)| 00:00:01 |
-----------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
5 - access("E"."EMPNO"=7499)
7 - access("E"."DEPTNO"="D"."DEPTNO")
10 - access("E"."EMPNO"=7499)
12 - access("E"."DEPTNO"="D"."DEPTNO")
27 rows selected.
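Look closely at the difference between the two examples above. The view in the first one is just an ordinary view. For the view in the second one, I simply copied the same query and joined the two copies with UNION ALL. The UNION ALL is the key here: as I have said more than once in earlier posts, UNION ALL prevents view merging, so the view is executed as a unit.
So one precondition for predicate pushing needs emphasizing: there must be a view that cannot be merged.
In the execution plan of the first SQL statement there is no VIEW operation, which means the optimizer has already expanded (merged) the view.
The second plan does contain a VIEW operation, and the constant-value predicate pushing described above has taken place. How can you tell?
In principle, where EMPNO=7499 is applied outside the view, so the plan should execute the view first and filter afterwards. In that case ID 1 would carry a * and a filter predicate. The plan above shows nothing of the sort. Instead, EMPNO=7499 appears as the access predicate of the PK_EMP index scans at IDs 5 and 10 (IDs 7 and 12 are the join to DEPT), which means the condition was pushed into the tables inside the view and applied there first. That is constant-value predicate pushing.
Constant-value predicate pushing is generally good for performance, because the base table's rows are filtered early and only then joined to the other table or result set. Without it here, EMP could not be reached through PK_EMP on EMPNO inside the view; each branch would have to produce its full join result before the filter was applied.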
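To make this concrete, here is a hand-written sketch of what the pushed plan corresponds to logically, assuming the same emp and dept tables as above. The optimizer does not literally rewrite the SQL text like this; the sketch only shows where the EMPNO=7499 filter ends up.

```sql
-- Sketch only: the query against v_predicate_push_1, with the filter
-- written by hand inside each UNION ALL branch instead of outside the view.
select *
from (
  select e.* , d.DNAME, d.LOC
  from emp e inner join dept d
  on e.DEPTNO = d.DEPTNO
  where e.EMPNO = 7499   -- filter applied inside branch 1
  union all
  select e.* , d.DNAME, d.LOC
  from emp e inner join dept d
  on e.DEPTNO = d.DEPTNO
  where e.EMPNO = 7499   -- filter applied inside branch 2
);
```

Both forms should return the same rows. The point is that each branch can look up EMP by EMPNO on its own, which is what the index unique scans at IDs 5 and 10 show.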
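What is join-key predicate pushing?
This is today's main topic. Constant-value predicate pushing is generally good for performance, but join-key predicate pushing is not always, and it can bite you.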
-- Set up the environment
create table t4 as select * from emp;
create table t2 as select * from emp;
create index idx14 on t4(empno);
create index idx12 on t2(empno);
create view v_normal as -- ordinary view
select * from t2;
create view v_union_all as -- view that cannot be merged
select * from t2
union all
select * from t4;
SQL> explain plan for
select emp.empno
from emp inner join v_union_all
on emp.empno = v_union_all.empno
and emp.ename='SMITH';
Explained.
SQL> SELECT * FROM TABLE(dbms_xplan.display);
PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Plan hash value: 2174267700
---------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
---------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 2 | 24 | 5 (0)| 00:00:01 |
| 1 | NESTED LOOPS | | 2 | 24 | 5 (0)| 00:00:01 |
|* 2 | TABLE ACCESS FULL | EMP | 1 | 10 | 3 (0)| 00:00:01 |
| 3 | VIEW | V_UNION_ALL | 1 | 2 | 2 (0)| 00:00:01 |
| 4 | UNION ALL PUSHED PREDICATE | | | | | |
|* 5 | INDEX RANGE SCAN | IDX12 | 1 | 13 | 1 (0)| 00:00:01 |
|* 6 | INDEX RANGE SCAN | IDX14 | 1 | 13 | 1 (0)| 00:00:01 |
---------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - filter("EMP"."ENAME"='SMITH')
5 - access("EMPNO"="EMP"."EMPNO")
6 - access("EMPNO"="EMP"."EMPNO")
Note
-----
- dynamic sampling used for this statement (level=2)
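Look at the plan above: IDs 5 and 6 use index access with the predicate "EMPNO"="EMP"."EMPNO". That condition is passed into the view from the outer emp table through the join condition emp.empno = v_union_all.empno; without it, the two tables inside the view could only be read with full table scans.
Some may say: isn't that great? Both tables use their indexes, so performance must be good. But you may have overlooked one point: the join method between the table and the view is now NESTED LOOPS. Why? Because the outer table's join-key values are passed into the view to filter, and that only works with nested loops. Now imagine that one day the optimizer's estimate is wrong: the outer emp table really has 10 million rows, but the optimizer thinks it has 100. It will certainly pick the predicate-pushing plan. If nothing filters emp, the view is then executed 10 million times, and even with indexes the performance will probably be poor.
That is the performance pitfall of join-key predicate pushing I wanted to point out. I hope the above makes clear what predicate pushing is and how it can be classified.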
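One way to check whether this is actually hurting you is to run the statement with runtime statistics and look at the Starts column, which shows how many times each plan step was executed. This is a minimal sketch, assuming you can really execute the query (not just explain it) and that your user can read the V$ views that dbms_xplan.display_cursor relies on:

```sql
-- display_cursor(null, null, ...) reports the last statement run in this session,
-- so turn serveroutput off first; otherwise SQL*Plus's own dbms_output call gets in the way.
set serveroutput off

select /*+ gather_plan_statistics */ emp.empno
from emp inner join v_union_all
on emp.empno = v_union_all.empno
and emp.ename='SMITH';

-- Starts = how many times each step ran; A-Rows = actual rows; E-Rows = estimated rows.
select * from table(dbms_xplan.display_cursor(null, null, 'ALLSTATS LAST'));
```

If A-Rows for the outer table is far above its E-Rows and Starts on the VIEW step is correspondingly large, the nested loops chosen for the pushed predicate are a likely suspect.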
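One last point: not every SQL statement can use predicate pushing (the join-key kind shown above). The view has to meet one of the following conditions:
(1) The view definition contains UNION ALL or UNION
(2) The view definition contains DISTINCT
(3) The view definition contains GROUP BY
(4) The view is outer-joined to the outer query
(5) The view is anti-joined to the outer query
(6) The view is semi-joined to the outer query
My example above is the case where the view contains UNION ALL. Here is an example where pushing cannot happen: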
SQL> explain plan for
select /*+ no_merge(v_normal) push_pred(v_normal) */emp.empno
from emp inner join v_normal
on emp.empno = v_normal.empno
and emp.ename='SMITH';
Explained.
SQL> SELECT * FROM TABLE(dbms_xplan.display);
PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Plan hash value: 2819220473
-----------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-----------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 23 | 6 (17)| 00:00:01 |
| 1 | MERGE JOIN | | 1 | 23 | 6 (17)| 00:00:01 |
|* 2 | TABLE ACCESS BY INDEX ROWID| EMP | 1 | 10 | 2 (0)| 00:00:01 |
| 3 | INDEX FULL SCAN | PK_EMP | 14 | | 1 (0)| 00:00:01 |
|* 4 | SORT JOIN | | 15 | 195 | 4 (25)| 00:00:01 |
| 5 | VIEW | V_NORMAL | 15 | 195 | 3 (0)| 00:00:01 |
| 6 | TABLE ACCESS FULL | T2 | 15 | 195 | 3 (0)| 00:00:01 |
-----------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - filter("EMP"."ENAME"='SMITH')
4 - access("EMP"."EMPNO"="V_NORMAL"."EMPNO")
filter("EMP"."EMPNO"="V_NORMAL"."EMPNO")
Note
-----
- dynamic sampling used for this statement (level=2)
24 rows selected.
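In the SQL above I added /*+ no_merge(v_normal) push_pred(v_normal) */. no_merge keeps the view from being merged (and the VIEW indeed was not merged), and push_pred is the hint that asks for predicate pushing.
Still, the statement did not use predicate pushing, because it does not meet any of the conditions above.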
SQL> explain plan for
select /*+ no_merge(v_normal) push_pred(v_normal) */emp.empno
from emp left join v_normal
on emp.empno = v_normal.empno
and emp.ename='SMITH';
Explained.
SQL> SELECT * FROM TABLE(dbms_xplan.display);
PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Plan hash value: 3405351543
------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 14 | 168 | 17 (0)| 00:00:01 |
| 1 | NESTED LOOPS OUTER | | 14 | 168 | 17 (0)| 00:00:01 |
| 2 | TABLE ACCESS FULL | EMP | 14 | 140 | 3 (0)| 00:00:01 |
| 3 | VIEW PUSHED PREDICATE | V_NORMAL | 1 | 2 | 1 (0)| 00:00:01 |
|* 4 | INDEX RANGE SCAN | IDX12 | 1 | 13 | 1 (0)| 00:00:01 |
------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
4 - access("EMPNO"="EMP"."EMPNO")
filter(CASE WHEN "EMPNO" IS NOT NULL THEN 'SMITH' ELSE 'SMITH' END
="EMP"."ENAME")
Note
-----
- dynamic sampling used for this statement (level=2)
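After changing the join between emp and the view to a left join, it works right away. Surprising? I hope this saves you from wondering next time why your SQL just won't use predicate pushing.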
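Summary:
- Constant-value predicate pushing is usually nothing to worry about and is good for performance.
- When "VIEW PUSHED PREDICATE" (or "UNION ALL PUSHED PREDICATE") shows up in an execution plan, pay attention and check how fast the SQL runs. If it is fast, fine; if it is slow, check whether the NESTED LOOPS join that comes with predicate pushing is the cause.
- Look up the usage of the push_pred hint yourself.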
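As a starting point, push_pred has a counterpart, no_push_pred, which asks the optimizer not to push the join predicate into the named view. Here is a sketch using the v_union_all example from earlier. The resulting plan is not shown because it depends on your data and version, so run it and compare it with the pushed plan yourself:

```sql
-- Ask the optimizer not to push the join predicate into v_union_all,
-- then display the plan to see which join method it picks instead.
explain plan for
select /*+ no_push_pred(v_union_all) */ emp.empno
from emp inner join v_union_all
on emp.empno = v_union_all.empno
and emp.ename='SMITH';

select * from table(dbms_xplan.display);
```

If the plan still shows PUSHED PREDICATE, check the hint spelling and make sure the name in the hint matches the view name or alias used in the query.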
That's all for today. These are my own views; corrections are welcome.