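Adding an index is one of the most common ways to tune SQL, but adding indexes carelessly causes problems of its own: write I/O amplification, extra disk space, and slower writes. Building an index can also block DML while it runs, although PostgreSQL can avoid that by building the index concurrently (CREATE INDEX CONCURRENTLY).
So how do we decide whether an index is worth adding?
Virtual indexes are a useful tool here. A virtual index exists only as a definition and is never built, so creating one has no side effects on the table. Once it exists, EXPLAIN shows the estimated plan as if the index were real, so we can check whether the index would lower the cost.
Oracle supports virtual indexes. A virtual index is a fake index that is defined in the data dictionary but has no index segment behind it.
Example:
Oracle: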
SQL> create table test as select * from dba_objects;
Table created.
SQL> set autotrace traceonly explain;
SQL> select * from test where object_id=60;
Execution Plan
----------------------------------------------------------
Plan hash value: 1357081020
--------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 25 | 12025 | 619 (1)| 00:00:01 |
|* 1 | TABLE ACCESS FULL| TEST | 25 | 12025 | 619 (1)| 00:00:01 |
--------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter("OBJECT_ID"=60)
Note
-----
- dynamic statistics used: dynamic sampling (level=2)
With no index on the table, the query does a full table scan. Let's add a virtual index and try again.
SQL> alter session set "_USE_NOSEGMENT_INDEXES"=true;
Session altered.
SQL> set autotrace off;
SQL> create index idx_test_virtual on test(object_id) nosegment;
Index created.
SQL> set autotrace traceonly explain;
SQL> select * from test where object_id=60;
Execution Plan
----------------------------------------------------------
Plan hash value: 1226113969
--------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 25 | 12025 | 5 (0)| 00:00:01 |
| 1 | TABLE ACCESS BY INDEX ROWID BATCHED| TEST | 25 | 12025 | 5 (0)| 00:00:01 |
|* 2 | INDEX RANGE SCAN | IDX_TEST_VIRTUAL | 464 | | 1 (0)| 00:00:01 |
--------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - access("OBJECT_ID"=60)
Note
-----
- dynamic statistics used: dynamic sampling (level=2)
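The estimated plan now uses the index. The actual execution plan, however, is still a full table scan, because the virtual index has no segment to read from: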
SQL> alter session set statistics_level=all ;
Session altered.
SQL> select * from test where object_id=60;
SQL> select * from table(dbms_xplan.display_cursor(null,null,'allstats last'));
SQL_ID 6t76zuzdgc4d9, child number 1
-------------------------------------
select * from test where object_id=60
Plan hash value: 1357081020
------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers |
------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | | 1 |00:00:00.01 | 2301 |
|* 1 | TABLE ACCESS FULL| TEST | 1 | 25 | 1 |00:00:00.01 | 2301 |
------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter("OBJECT_ID"=60)
Note
-----
- dynamic statistics used: dynamic sampling (level=2)
22 rows selected.
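Because a virtual index is only a dictionary entry, it is easy to leave one behind. The following is a minimal cleanup sketch for the session above. The dictionary query assumes the commonly described behaviour that a nosegment index appears in USER_OBJECTS but has no row in USER_INDEXES; that is an assumption worth checking on your Oracle version.

```sql
-- List indexes that exist as objects but have no row in USER_INDEXES
-- (assumption: this is how nosegment indexes show up in the dictionary).
select object_name
  from user_objects
 where object_type = 'INDEX'
minus
select index_name
  from user_indexes;

-- Drop the virtual index created above once the test is finished.
drop index idx_test_virtual;

-- Stop the optimizer from considering nosegment indexes in this session.
alter session set "_USE_NOSEGMENT_INDEXES"=false;
```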
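PostgreSQL:
How do we get the equivalent of Oracle's virtual indexes in PostgreSQL? PostgreSQL needs the HypoPG extension for this, which calls them hypothetical indexes:
1. Install the extension
https://github.com/HypoPG/hypopg
2. Create the extension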
bill@bill=>create extension hypopg ;
CREATE EXTENSION
bill@bill=>\dx+ hypopg
Objects in extension "hypopg"
Object description
------------------------------------
function hypopg()
function hypopg_create_index(text)
function hypopg_drop_index(oid)
function hypopg_get_indexdef(oid)
function hypopg_list_indexes()
function hypopg_relation_size(oid)
function hypopg_reset()
function hypopg_reset_index()
(8 rows)
3. Create a test table
bill@bill=>CREATE TABLE hypo AS SELECT id, 'line ' || id AS val FROM generate_series(1,10000) id;
SELECT 10000
4. Check the plan without an index: a sequential scan
bill@bill=>EXPLAIN SELECT * FROM hypo WHERE id = 1;
QUERY PLAN
--------------------------------------------------------
Seq Scan on hypo (cost=0.00..142.31 rows=35 width=36)
Filter: (id = 1)
(2 rows)
5. Create a hypothetical index
bill@bill=>SELECT * FROM hypopg_create_index('CREATE INDEX ON hypo (id)');
indexrelid | indexname
------------+-----------------------
143351 | <143351>btree_hypo_id
(1 row)
6. List the hypothetical indexes that exist
bill@bill=>SELECT * FROM hypopg_list_indexes();
indexrelid | indexname | nspname | relname | amname
------------+-----------------------+---------+---------+--------
143351 | <143351>btree_hypo_id | bill | hypo | btree
(1 row)
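Disk space was one of the costs named at the start, and the function list above includes hypopg_relation_size(oid). A small sketch of how it might be used to estimate how large the index would be if it were really built; it assumes the function form hypopg_list_indexes() shown in this transcript (newer HypoPG releases may expose the list differently), and the result is only an estimate:

```sql
-- Estimated on-disk size of every hypothetical index in this session.
-- estimated_size is an alias chosen for this example.
SELECT indexname,
       pg_size_pretty(hypopg_relation_size(indexrelid)) AS estimated_size
  FROM hypopg_list_indexes();
```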
7. Check the plan after creating the hypothetical index
The planner now chooses an index scan.
bill@bill=>EXPLAIN SELECT * FROM hypo WHERE id = 1;
QUERY PLAN
-----------------------------------------------------------------------------------
Index Scan using <143351>btree_hypo_id on hypo (cost=0.04..2.65 rows=1 width=13)
Index Cond: (id = 1)
(2 rows)
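The same approach can be used to compare several candidate indexes before building any of them. A hedged sketch, reusing the hypo table from step 3; the second index and the query predicate are made up for illustration, and which index the planner prefers depends on your data and statistics:

```sql
-- Two candidate indexes, neither of which is actually built.
SELECT * FROM hypopg_create_index('CREATE INDEX ON hypo (id)');
SELECT * FROM hypopg_create_index('CREATE INDEX ON hypo (val)');

-- Plain EXPLAIN (no ANALYZE) shows which candidate, if any, the planner picks
-- and at what estimated cost.
EXPLAIN SELECT * FROM hypo WHERE id = 1 AND val = 'line 1';

-- Remove all hypothetical indexes again.
SELECT hypopg_reset();
```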
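8. Check the actual execution plan
EXPLAIN ANALYZE really executes the query, and a hypothetical index cannot be used for execution, so the plan is still a sequential scan. The cost and row estimates differ from step 4 because the table had no statistics at that point; they appear to have been collected before step 7.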
bill@bill=>EXPLAIN ANALYZE SELECT * FROM hypo WHERE id = 1;
QUERY PLAN
-------------------------------------------------------------------------------------------------
Seq Scan on hypo (cost=0.00..180.00 rows=1 width=13) (actual time=0.016..0.849 rows=1 loops=1)
Filter: (id = 1)
Rows Removed by Filter: 9999
Planning Time: 0.054 ms
Execution Time: 0.866 ms
(5 rows)
9. Drop the hypothetical index
Call hypopg_drop_index(indexrelid) to remove a single hypothetical index.
bill@bill=>select hypopg_drop_index(143351);
hypopg_drop_index
-------------------
t
(1 row)
Alternatively, call hypopg_reset() to remove all hypothetical indexes at once.
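If the estimated plan looks good, the last step is to build the real index. A minimal sketch, assuming the hypo table from above; the index name idx_hypo_id is hypothetical. CREATE INDEX CONCURRENTLY avoids blocking DML, but it cannot run inside a transaction block and usually takes longer than a plain CREATE INDEX.

```sql
-- Clear any leftover hypothetical indexes so they do not affect EXPLAIN.
SELECT hypopg_reset();

-- Build the real index without blocking writes (idx_hypo_id is an example name).
CREATE INDEX CONCURRENTLY idx_hypo_id ON hypo (id);

-- Refresh statistics, then confirm that the executed plan uses the index.
ANALYZE hypo;
EXPLAIN ANALYZE SELECT * FROM hypo WHERE id = 1;
```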