今天,某省的同事来告诉我,表重组后,他用于统计的一个sql脚本运行变慢了,之前只需要17、8分钟能出来的结果,现在1小时40分钟左右才能出来结果。
我们一起来看看脚本中的一个sql:
SQL
>
explain
plan
for
2 select a . startdate , b . subsid from tab_1 a , tab_2 b where
3 a . servid = ' 025001003681 ' and a . status != ' C ' and a . mid = b . mid ;
Explained .
Elapsed : 00 : 00 : 00.03
SQL > select * from table ( dbms_xplan . display )
SQL > /
PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ --
-------------------------------------------------------------------------------------------------------------- --
| Id | Operation | Name | Rows | Bytes | Cost | Pstart| Pstop |
-------------------------------------------------------------------------------------------------------------- --
| 0 | SELECT STATEMENT | | 369 | 23985 | 980 | | |
| 1 | NESTED LOOPS | | 369 | 23985 | 980 | | |
| 2 | PARTITION HASH ALL | | | | | 1 | 4 |
|* 3 | TABLE ACCESS BY LOCAL INDEX ROWID | tab_1 | 369 | 14022 | 242 | 1 | 4 |
|* 4 | INDEX RANGE SCAN | IDX_tab_1_SERVID | 492 | | 10 | 1 | 4 |
| 5 | PARTITION HASH ITERATOR | | | | | KEY | KEY |
| 6 | TABLE ACCESS BY LOCAL INDEX ROWID | tab_2 | 1 | 27 | 2 | KEY | KEY |
|* 7 | INDEX UNIQUE SCAN | PK_tab_2_MID | 1 | | 1 | KEY | KEY |
-------------------------------------------------------------------------------------------------------------- --
Predicate Information ( identified by operation id ) :
------------------------------------------------- --
3 - filter ( " A " . " STATUS " <> ' C ' )
4 - access ( " A " . " SERVID " = ' 025001003681 ' )
7 - access ( " A " . " MID " = " B " . " MID " )
Note : cpu costing is off
22 rows selected .
Elapsed : 00 : 00 : 00.56
2 select a . startdate , b . subsid from tab_1 a , tab_2 b where
3 a . servid = ' 025001003681 ' and a . status != ' C ' and a . mid = b . mid ;
Explained .
Elapsed : 00 : 00 : 00.03
SQL > select * from table ( dbms_xplan . display )
SQL > /
PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ --
-------------------------------------------------------------------------------------------------------------- --
| Id | Operation | Name | Rows | Bytes | Cost | Pstart| Pstop |
-------------------------------------------------------------------------------------------------------------- --
| 0 | SELECT STATEMENT | | 369 | 23985 | 980 | | |
| 1 | NESTED LOOPS | | 369 | 23985 | 980 | | |
| 2 | PARTITION HASH ALL | | | | | 1 | 4 |
|* 3 | TABLE ACCESS BY LOCAL INDEX ROWID | tab_1 | 369 | 14022 | 242 | 1 | 4 |
|* 4 | INDEX RANGE SCAN | IDX_tab_1_SERVID | 492 | | 10 | 1 | 4 |
| 5 | PARTITION HASH ITERATOR | | | | | KEY | KEY |
| 6 | TABLE ACCESS BY LOCAL INDEX ROWID | tab_2 | 1 | 27 | 2 | KEY | KEY |
|* 7 | INDEX UNIQUE SCAN | PK_tab_2_MID | 1 | | 1 | KEY | KEY |
-------------------------------------------------------------------------------------------------------------- --
Predicate Information ( identified by operation id ) :
------------------------------------------------- --
3 - filter ( " A " . " STATUS " <> ' C ' )
4 - access ( " A " . " SERVID " = ' 025001003681 ' )
7 - access ( " A " . " MID " = " B " . " MID " )
Note : cpu costing is off
22 rows selected .
Elapsed : 00 : 00 : 00.56
我们看到这个sql是通过索引后在走nested loops,我们做一个sqltrace来观察一下它的执行时间和consistent gets:
SQL
>
set
timing
on
SQL > set autotrace traceonly
SQL > select a . startdate , b . subsid from tab_1 a , tab_2 b where
2 a . servid = ' 025001003681 ' and a . status != ' C ' and a . mid = b . mid ;
627965 rows selected .
Elapsed : 00 : 36 : 13.04
Execution Plan
-------------------------------------------------------- --
0 SELECT STATEMENT Optimizer=CHOOSE (Cost=980 Card=369 Bytes=2
3985 )
1 0 NESTED LOOPS ( Cost = 980 Card = 369 Bytes = 23985 )
2 1 PARTITION HASH ( ALL )
3 2 TABLE ACCESS ( BY LOCAL INDEX ROWID ) OF ' tab_1 '
( Cost = 242 Card = 369 Bytes = 14022 )
4 3 INDEX ( RANGE SCAN ) OF ' IDX_tab_1_SERVID ' ( NON
- UNIQUE ) ( Cost = 10 Card = 492 )
5 1 PARTITION HASH ( ITERATOR )
6 5 TABLE ACCESS ( BY LOCAL INDEX ROWID ) OF ' tab_2 ' ( C
ost = 2 Card = 1 Bytes = 27 )
7 6 INDEX ( UNIQUE SCAN ) OF ' PK_tab_2_MID ' ( UNIQUE )
( Cost = 1 Card = 1 )
Statistics
-------------------------------------------------------- --
0 recursive calls
0 db block gets
2725638 consistent gets
406711 physical reads
1120 redo size
20890925 bytes sent via SQL * Net to client
461160 bytes received via SQL * Net from client
41866 SQL * Net roundtrips to / from client
0 sorts ( memory )
0 sorts ( disk )
627965 rows processed
SQL >
SQL > set autotrace traceonly
SQL > select a . startdate , b . subsid from tab_1 a , tab_2 b where
2 a . servid = ' 025001003681 ' and a . status != ' C ' and a . mid = b . mid ;
627965 rows selected .
Elapsed : 00 : 36 : 13.04
Execution Plan
-------------------------------------------------------- --
0 SELECT STATEMENT Optimizer=CHOOSE (Cost=980 Card=369 Bytes=2
3985 )
1 0 NESTED LOOPS ( Cost = 980 Card = 369 Bytes = 23985 )
2 1 PARTITION HASH ( ALL )
3 2 TABLE ACCESS ( BY LOCAL INDEX ROWID ) OF ' tab_1 '
( Cost = 242 Card = 369 Bytes = 14022 )
4 3 INDEX ( RANGE SCAN ) OF ' IDX_tab_1_SERVID ' ( NON
- UNIQUE ) ( Cost = 10 Card = 492 )
5 1 PARTITION HASH ( ITERATOR )
6 5 TABLE ACCESS ( BY LOCAL INDEX ROWID ) OF ' tab_2 ' ( C
ost = 2 Card = 1 Bytes = 27 )
7 6 INDEX ( UNIQUE SCAN ) OF ' PK_tab_2_MID ' ( UNIQUE )
( Cost = 1 Card = 1 )
Statistics
-------------------------------------------------------- --
0 recursive calls
0 db block gets
2725638 consistent gets
406711 physical reads
1120 redo size
20890925 bytes sent via SQL * Net to client
461160 bytes received via SQL * Net from client
41866 SQL * Net roundtrips to / from client
0 sorts ( memory )
0 sorts ( disk )
627965 rows processed
SQL >
我们看到有270多万个consistent gets,并且这个sql也确实要36分钟才能运行出来,那为什么同事说之前10多分钟就能跑出来?我们来试试hash连接:
SQL
>
select
/*+ use_hash(a,b) */
a
.
startdate
,
b
.
subsid
from
tab_1
a
,
tab_2
b
where
a . servid = ' 025001003681 ' and a . status != ' C ' and a . mid = b . mid ; 2
628514 rows selected .
Elapsed : 00 : 04 : 21.90
Execution Plan
-------------------------------------------------------- --
0 SELECT STATEMENT Optimizer=CHOOSE (Cost=20594 Card=369 Bytes
= 23985 )
1 0 HASH JOIN ( Cost = 20594 Card = 369 Bytes = 23985 )
2 1 PARTITION HASH ( ALL )
3 2 TABLE ACCESS ( BY LOCAL INDEX ROWID ) OF ' tab_1 '
( Cost = 242 Card = 369 Bytes = 14022 )
4 3 INDEX ( RANGE SCAN ) OF ' IDX_tab_1_SERVID ' ( NON
- UNIQUE ) ( Cost = 10 Card = 492 )
5 1 PARTITION HASH ( ALL )
6 5 TABLE ACCESS ( FULL ) OF ' tab_2 ' ( Cost = 20251 Card = 7
199100 Bytes = 194375700 )
Statistics
-------------------------------------------------------- --
0 recursive calls
0 db block gets
400061 consistent gets
308119 physical reads
560 redo size
20909078 bytes sent via SQL * Net to client
461556 bytes received via SQL * Net from client
41902 SQL * Net roundtrips to / from client
0 sorts ( memory )
0 sorts ( disk )
628514 rows processed
SQL >
a . servid = ' 025001003681 ' and a . status != ' C ' and a . mid = b . mid ; 2
628514 rows selected .
Elapsed : 00 : 04 : 21.90
Execution Plan
-------------------------------------------------------- --
0 SELECT STATEMENT Optimizer=CHOOSE (Cost=20594 Card=369 Bytes
= 23985 )
1 0 HASH JOIN ( Cost = 20594 Card = 369 Bytes = 23985 )
2 1 PARTITION HASH ( ALL )
3 2 TABLE ACCESS ( BY LOCAL INDEX ROWID ) OF ' tab_1 '
( Cost = 242 Card = 369 Bytes = 14022 )
4 3 INDEX ( RANGE SCAN ) OF ' IDX_tab_1_SERVID ' ( NON
- UNIQUE ) ( Cost = 10 Card = 492 )
5 1 PARTITION HASH ( ALL )
6 5 TABLE ACCESS ( FULL ) OF ' tab_2 ' ( Cost = 20251 Card = 7
199100 Bytes = 194375700 )
Statistics
-------------------------------------------------------- --
0 recursive calls
0 db block gets
400061 consistent gets
308119 physical reads
560 redo size
20909078 bytes sent via SQL * Net to client
461556 bytes received via SQL * Net from client
41902 SQL * Net roundtrips to / from client
0 sorts ( memory )
0 sorts ( disk )
628514 rows processed
SQL >
呀,走hash连接的话,只要4分多钟就能跑出来,且consistent gets也仅仅只有40万,物理读也小了不少。看来oracle确实没有选择合适的执行计划了。
可oracle为什么没有选择合适的执行计划?难道是统计信息的问题?
SQL
>
select
TABLE_NAME
,
LAST_ANALYZED
from
user_tables
where
table_name
in
(
'
tab_2
'
,
'
tab_1
'
)
;
TABLE_NAME LAST_ANAL
---------------------------- -- ---------
tab_2 29 - JAN - 08
tab_1 11 - JUN - 08
TABLE_NAME LAST_ANAL
---------------------------- -- ---------
tab_2 29 - JAN - 08
tab_1 11 - JUN - 08
难道是tab_2的统计信息太久了导致执行计划出错?尝试备份统计信息并重新分析tab_2之后,再次查看执行计划 :
SQL
>
exec
dbms_stats
.
gather_table_stats
(
OWNNAME
=>
'
MISC
'
,
tabname
=>
'
tab_2
'
,
cascade
=>
TRUE
,
estimate_percent
=>
50
)
;
PL / SQL procedure successfully completed .
Elapsed : 00 : 18 : 37.95
SQL >
SQL >
SQL >
SQL > explain plan for
2 select a . startdate , b . subsid from tab_1 a , tab_2 b where
3 a . servid = ' 025001003681 ' and a . status != ' C ' and a . mid = b . mid ;
Explained .
Elapsed : 00 : 00 : 00.00
SQL > select * from table ( dbms_xplan . display ) ;
PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ --
-------------------------------------------------------------------------------------------------------------- --
| Id | Operation | Name | Rows | Bytes | Cost | Pstart| Pstop |
-------------------------------------------------------------------------------------------------------------- --
| 0 | SELECT STATEMENT | | 369 | 23985 | 980 | | |
| 1 | NESTED LOOPS | | 369 | 23985 | 980 | | |
| 2 | PARTITION HASH ALL | | | | | 1 | 4 |
|* 3 | TABLE ACCESS BY LOCAL INDEX ROWID | tab_1 | 369 | 14022 | 242 | 1 | 4 |
|* 4 | INDEX RANGE SCAN | IDX_tab_1_SERVID | 492 | | 10 | 1 | 4 |
| 5 | PARTITION HASH ITERATOR | | | | | KEY | KEY |
| 6 | TABLE ACCESS BY LOCAL INDEX ROWID | tab_2 | 1 | 27 | 2 | KEY | KEY |
|* 7 | INDEX UNIQUE SCAN | PK_tab_2_MID | 1 | | 1 | KEY | KEY |
-------------------------------------------------------------------------------------------------------------- --
Predicate Information ( identified by operation id ) :
------------------------------------------------- --
3 - filter ( " A " . " STATUS " <> ' C ' )
4 - access ( " A " . " SERVID " = ' 025001003681 ' )
7 - access ( " A " . " MID " = " B " . " MID " )
Note : cpu costing is off
22 rows selected .
Elapsed : 00 : 00 : 00.02
PL / SQL procedure successfully completed .
Elapsed : 00 : 18 : 37.95
SQL >
SQL >
SQL >
SQL > explain plan for
2 select a . startdate , b . subsid from tab_1 a , tab_2 b where
3 a . servid = ' 025001003681 ' and a . status != ' C ' and a . mid = b . mid ;
Explained .
Elapsed : 00 : 00 : 00.00
SQL > select * from table ( dbms_xplan . display ) ;
PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ --
-------------------------------------------------------------------------------------------------------------- --
| Id | Operation | Name | Rows | Bytes | Cost | Pstart| Pstop |
-------------------------------------------------------------------------------------------------------------- --
| 0 | SELECT STATEMENT | | 369 | 23985 | 980 | | |
| 1 | NESTED LOOPS | | 369 | 23985 | 980 | | |
| 2 | PARTITION HASH ALL | | | | | 1 | 4 |
|* 3 | TABLE ACCESS BY LOCAL INDEX ROWID | tab_1 | 369 | 14022 | 242 | 1 | 4 |
|* 4 | INDEX RANGE SCAN | IDX_tab_1_SERVID | 492 | | 10 | 1 | 4 |
| 5 | PARTITION HASH ITERATOR | | | | | KEY | KEY |
| 6 | TABLE ACCESS BY LOCAL INDEX ROWID | tab_2 | 1 | 27 | 2 | KEY | KEY |
|* 7 | INDEX UNIQUE SCAN | PK_tab_2_MID | 1 | | 1 | KEY | KEY |
-------------------------------------------------------------------------------------------------------------- --
Predicate Information ( identified by operation id ) :
------------------------------------------------- --
3 - filter ( " A " . " STATUS " <> ' C ' )
4 - access ( " A " . " SERVID " = ' 025001003681 ' )
7 - access ( " A " . " MID " = " B " . " MID " )
Note : cpu costing is off
22 rows selected .
Elapsed : 00 : 00 : 00.02
天,竟然还是走nl!!
难道是因为servid中的倾斜度的问题?再次查了一下tab_1表中servid=’025001003681′
SQL
>
select
count
(
*
)
from
tab_1
a
where
a
.
servid
=
'
025001003681
'
2 /
COUNT ( * )
-------- --
1564299
#### 而其他的servid的值:
SQL > select servid from tab_1 where rownum <= 10 ;
SERVID
---------- --
010001000983
010001001058
010001001327
010001001327
010001001327
010001001327
010001001327
010001001327
010001001327
010001001327
10 rows selected .
Elapsed : 00 : 00 : 00.02
SQL > select count ( * ) from tab_1 a where a . servid = ' 010001001327 ' ;
COUNT ( * )
-------- --
33
Elapsed : 00 : 00 : 00.06
SQL > select count ( * ) from tab_1 a where a . servid = ' 010001000983 ' ;
COUNT ( * )
-------- --
2
Elapsed : 00 : 00 : 00.00
SQL > select count ( * ) from tab_1 a where a . servid = ' 010001000983 ' ;
COUNT ( * )
-------- --
2
2 /
COUNT ( * )
-------- --
1564299
#### 而其他的servid的值:
SQL > select servid from tab_1 where rownum <= 10 ;
SERVID
---------- --
010001000983
010001001058
010001001327
010001001327
010001001327
010001001327
010001001327
010001001327
010001001327
010001001327
10 rows selected .
Elapsed : 00 : 00 : 00.02
SQL > select count ( * ) from tab_1 a where a . servid = ' 010001001327 ' ;
COUNT ( * )
-------- --
33
Elapsed : 00 : 00 : 00.06
SQL > select count ( * ) from tab_1 a where a . servid = ' 010001000983 ' ;
COUNT ( * )
-------- --
2
Elapsed : 00 : 00 : 00.00
SQL > select count ( * ) from tab_1 a where a . servid = ' 010001000983 ' ;
COUNT ( * )
-------- --
2
可以看到其他值的返回数据量确实很小,在返回数据量小的情况下,我们走索引确实是一种高效的查询方式,但是当返回数据很多时,由于nl的比较要花大量的时间,因此时间也就基本消耗在这里了。
对于某一列中的不同数值有不同的倾斜度(skew),要选择不同的执行计划,我们需要在收集直方图:
SQL
>
exec
dbms_stats
.
gather_table_stats
(
OWNNAME
=>
'
MISC
'
,
tabname
=>
'
tab_1
'
,
degree
=>
4
,
cascade
=>
TRUE
,
estimate_percent
=>
50
,
method_opt
=>
'
FOR COLUMNS servid SIZE auto
'
)
;
PL / SQL procedure successfully completed .
Elapsed : 00 : 12 : 48.22
PL / SQL procedure successfully completed .
Elapsed : 00 : 12 : 48.22
此时,我们再次检查其执行计划的时候,看到已经走了hash连接了:
扫描二维码关注公众号,回复:
1138279 查看本文章
SQL
>
explain
plan
for
2 select a . startdate , b . subsid from tab_1 a , tab_2 b where
3 a . servid = ' 025001003681 ' and a . status != ' C ' and a . mid = b . mid ;
Explained .
Elapsed : 00 : 00 : 00.01
SQL > select * from table ( dbms_xplan . display ) ;
PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ --
-------------------------------------------------------------------------------------------- --
| Id | Operation | Name | Rows | Bytes |TempSpc| Cost | Pstart| Pstop |
-------------------------------------------------------------------------------------------- --
| 0 | SELECT STATEMENT | | 1172K| 72M| | 39071 | | |
|* 1 | HASH JOIN | | 1172 K | 72 M | 55 M | 39071 | | |
| 2 | PARTITION HASH ALL | | | | | | 1 | 4 |
|* 3 | TABLE ACCESS FULL | tab_1 | 1172 K | 42 M | | 13663 | 1 | 4 |
| 4 | PARTITION HASH ALL | | | | | | 1 | 4 |
| 5 | TABLE ACCESS FULL | tab_2 | 7199 K | 185 M | | 20251 | 1 | 4 |
-------------------------------------------------------------------------------------------- --
Predicate Information ( identified by operation id ) :
------------------------------------------------- --
1 - access ( " A " . " MID " = " B " . " MID " )
3 - filter ( " A " . " SERVID " = ' 025001003681 ' AND " A " . " STATUS " <> ' C ' )
Note : cpu costing is off
19 rows selected .
Elapsed : 00 : 00 : 00.02
SQL >
SQL >
2 select a . startdate , b . subsid from tab_1 a , tab_2 b where
3 a . servid = ' 025001003681 ' and a . status != ' C ' and a . mid = b . mid ;
Explained .
Elapsed : 00 : 00 : 00.01
SQL > select * from table ( dbms_xplan . display ) ;
PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ --
-------------------------------------------------------------------------------------------- --
| Id | Operation | Name | Rows | Bytes |TempSpc| Cost | Pstart| Pstop |
-------------------------------------------------------------------------------------------- --
| 0 | SELECT STATEMENT | | 1172K| 72M| | 39071 | | |
|* 1 | HASH JOIN | | 1172 K | 72 M | 55 M | 39071 | | |
| 2 | PARTITION HASH ALL | | | | | | 1 | 4 |
|* 3 | TABLE ACCESS FULL | tab_1 | 1172 K | 42 M | | 13663 | 1 | 4 |
| 4 | PARTITION HASH ALL | | | | | | 1 | 4 |
| 5 | TABLE ACCESS FULL | tab_2 | 7199 K | 185 M | | 20251 | 1 | 4 |
-------------------------------------------------------------------------------------------- --
Predicate Information ( identified by operation id ) :
------------------------------------------------- --
1 - access ( " A " . " MID " = " B " . " MID " )
3 - filter ( " A " . " SERVID " = ' 025001003681 ' AND " A " . " STATUS " <> ' C ' )
Note : cpu costing is off
19 rows selected .
Elapsed : 00 : 00 : 00.02
SQL >
SQL >
这个时候,我们不加hints,oracle已经正确的选择了执行计划。至此,不走hash join是因为索引列的倾斜度的问题,收集直方图后,问题搞定。