HiveQL DDL: Indexes

Overview

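  Hive introduced indexing in version 0.7 and removed it in 3.0 (see HIVE-18448). Materialized views, available in Hive 3.0, offer a similar way to speed up queries. An index exists to make lookups on certain columns of a table faster. Without one, a query with a predicate such as 'WHERE tab1.col1 = 10' loads and processes every record in the table or partition. With an index on col1, only part of the files has to be loaded and processed. Columnar storage formats (Parquet, ORC) rest on the same logic: read only the data a query needs.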

Creating an Index

CREATE INDEX index_name
ON TABLE base_table_name (col_name, ...)
AS 'index.handler.class.name'
[WITH DEFERRED REBUILD]
[IDXPROPERTIES (property_name=property_value, ...)]
[IN TABLE index_table_name]
[PARTITIONED BY (col_name, ...)]
[
   [ ROW FORMAT ...] STORED AS ...
   | STORED BY ...
]
[LOCATION hdfs_path]
[TBLPROPERTIES (...)]
[COMMENT "index comment"]

  Example

> CREATE INDEX test_index ON TABLE test_hive (name) AS 'COMPACT' WITH DEFERRED REBUILD;
-- Create an index stored in the RCFile format
> CREATE INDEX test_index2 ON TABLE test_hive (name) AS 'COMPACT' WITH DEFERRED REBUILD STORED AS RCFILE;
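
  The other optional clauses in the syntax above combine in the same way. The statements below are a sketch rather than tested output: they follow the forms given in the Hive LanguageManual, and the age and city columns, the index names, the target table name, and the property key are all hypothetical.

```sql
-- Hypothetical: assumes test_hive also has columns age and city.

-- A bitmap index, an index type aimed at columns with few distinct values.
CREATE INDEX test_index_bitmap ON TABLE test_hive (age)
AS 'BITMAP' WITH DEFERRED REBUILD;

-- Store the index data in an explicitly named table and attach a
-- free-form property and a comment (all names are placeholders).
CREATE INDEX test_index_city ON TABLE test_hive (city)
AS 'COMPACT' WITH DEFERRED REBUILD
IDXPROPERTIES ("owner"="etl")
IN TABLE test_hive_city_idx
COMMENT 'Index on city';
```

  Both statements use WITH DEFERRED REBUILD, so the indexes hold no data until ALTER INDEX ... REBUILD is run.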

Viewing Indexes

SHOW [FORMATTED] (INDEX|INDEXES) ON table_with_index [(FROM|IN) db_name];

  Example

> SHOW INDEX ON test_hive;
+-----------------------+-----------------------+-----------------------+-----------------------------------+-----------------------+----------+--+
|       idx_name        |       tab_name        |       col_names       |           idx_tab_name            |       idx_type        | comment  |
+-----------------------+-----------------------+-----------------------+-----------------------------------+-----------------------+----------+--+
| test_index            | test_hive             | name                  | default__test_hive_test_index__   | compact               |          |
| test_index2           | test_hive             | name                  | default__test_hive_test_index2__  | compact               |          |
+-----------------------+-----------------------+-----------------------+-----------------------------------+-----------------------+----------+--+
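
  The FORMATTED keyword and the database qualifier from the syntax above can be used as in the sketch below. It assumes that test_hive lives in the default database, as the idx_tab_name prefix in the output suggests.

```sql
-- FORMATTED asks Hive to include column titles in the output.
SHOW FORMATTED INDEXES ON test_hive IN default;
```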

Altering an Index

ALTER INDEX index_name ON table_name [PARTITION partition_spec] REBUILD;

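  ALTER INDEX ... REBUILD builds an index that was created with the WITH DEFERRED REBUILD clause, or rebuilds an index that was built earlier. If PARTITION is specified, only that partition is rebuilt.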

  Example

> ALTER INDEX test_index ON test_hive REBUILD;
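
  A partition-level rebuild, followed by the session settings that let the optimizer use a compact index, might look like the sketch below. The table test_hive_part, its dt partition column, and the literal values are hypothetical. The two hive.optimize.index.* properties come from the Hive configuration documentation for pre-3.0 releases, not from the examples in this article, so check them against the version in use.

```sql
-- Hypothetical partitioned table used only for this illustration.
-- Assumes the partition dt='2019-10-01' already exists and holds data.
CREATE TABLE test_hive_part (name STRING) PARTITIONED BY (dt STRING);

CREATE INDEX test_part_index ON TABLE test_hive_part (name)
AS 'COMPACT' WITH DEFERRED REBUILD;

-- Rebuild the index for a single partition only.
ALTER INDEX test_part_index ON test_hive_part
PARTITION (dt='2019-10-01') REBUILD;

-- Allow automatic use of indexes, and lower the minimum input size
-- at which a compact index is considered.
SET hive.optimize.index.filter=true;
SET hive.optimize.index.filter.compact.minsize=0;

SELECT * FROM test_hive_part WHERE dt='2019-10-01' AND name='some_name';
```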

Dropping an Index

DROP INDEX [IF EXISTS] index_name ON table_name;

  Example

> DROP INDEX test_index2 ON test_hive;
> SHOW INDEX ON test_hive;
+-----------------------+-----------------------+-----------------------+----------------------------------+-----------------------+----------+--+
|       idx_name        |       tab_name        |       col_names       |           idx_tab_name           |       idx_type        | comment  |
+-----------------------+-----------------------+-----------------------+----------------------------------+-----------------------+----------+--+
| test_index            | test_hive             | name                  | default__test_hive_test_index__  | compact               |          |
+-----------------------+-----------------------+-----------------------+----------------------------------+-----------------------+----------+--+

References
https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DDL#LanguageManualDDL-Create/Drop/AlterIndex
https://cwiki.apache.org/confluence/display/Hive/IndexDev
https://cwiki.apache.org/confluence/display/Hive/LanguageManual+Indexing

Reposted from blog.csdn.net/CPP_MAYIBO/article/details/102152487