6.hive参数配置方法与动态分区

6.hive参数与动态分区

6.1 Hive参数

6.1.1 命名规则

hive当中的参数、变量,都是以命名空间开头
在这里插入图片描述
通过${}方式进行引用,其中system、env下的变量必须以前缀开头。

6.1.2 hive 参数设置方式

  1. 修改配置文件 ${HIVE_HOME}/conf/hive-site.xml
  2. 启动hive cli时,通过–hiveconf key=value的方式进行设置
hive --hiveconf hive.cli.print.header=true # 查询时显示表头

hive> select * from wc_count;
OK
wc_count.word	wc_count.count
andy	3
hello	5
joy	3
mark	1
rose	2
tom	2
Time taken: 3.631 seconds, Fetched: 6 row(s)

这种方式设置的参数的有效期是直到连接断开。

  1. 进入cli之后,通过使用set命令设置
    注意:2和3两种方式设置的参数只在当前会话有效。

6.1.3 hive set命令

hive> [root@node4 ~]# hive
hive> select word,count from wc_count;
OK
andy 3
hello 5
joy 3
mark 1
rose 2
tom 2
Time taken: 3.562 seconds, Fetched: 6 row(s)

在hive cli控制台上通过set命令对hive中的参数进行查询或设置
set查询:

hive> set hive.cli.print.header;
hive.cli.print.header=false
hive> set;
#查询所有的参数

set设置参数

hive> set hive.cli.print.header=true;
hive> set hive.cli.print.header;
hive.cli.print.header=true

hive的历史操作的命令:

[root@node4 ~]# ll -a
-rw-r--r--   1 root root 11590 11月 19 11:07
.hivehistory

在当前用户的家目录/root下有一个.hivehistory文件,该文件记录了执行的hive命令:

[root@node4 ~]# vim .hivehistory
create table psn(id int,age int);
insert into psn values(1,18);
create database hivedb1;
......
select * from wc_count;
select word,count from wc_count;

hive参数的初始化配置:
在当前用户的家目录的.hiverc文件,如果没有可以创建一个,添加参数的配置:

[root@node4 ~]# vim .hiverc
set hive.cli.print.header=true;

将hive客户端的连接断开,重新连接,参数就生效了,也将一直对当前用户有效:

[root@node4 ~]# hive
hive> select id,name,likes from person;
OK
id name likes
1 小明1 ["lol","book","movie"]
2 小明2 ["lol","book","movie"]
3 小明3 ["lol","book","movie"]
4 小明4 ["lol","book","movie"]
5 小明5 ["lol","movie"]
6 小明6 ["lol","book","movie"]
7 小明7 ["lol","book"]
8 小明8 ["lol","book"]

6.2 动态分区

开启支持动态分区

set hive.exec.dynamic.partition=true;

默认:true #默认支持动态分区

hive> set hive.exec.dynamic.partition.mode;
hive.exec.dynamic.partition.mode=strict
#修改为非严格模式
hive>set
hive.exec.dynamic.partition.mode=nostrict;

默认:strict严格模式(比如订单表以秒为单位创建分区,将会导致特别多的分区,严格模式一般不允许,但是非严格模式允许)。
nostrict:非严格模式
案例演示:
创建原始数据表

create table person21(
id int,
name string,
age int,
gender string,
likes array<string>,
address map<string,string>
)
row format delimited
fields terminated by ','
collection items terminated by '-'
map keys terminated by ':';

将软件/data/person21.txt上传到node4上的/root/data目录下加载数据

hive> load data local inpath
'/root/data/person21.txt' into table person21;
Loading data to table default.person21
OK
Time taken: 0.786 seconds
hive> select * from person21;
OK
person21.id person21.name person21.age 
person21.gender person21.likes 
person21.address
1 tuhao1 32 man ["lol","book","movie"] 
{
    
    "beijing":"xisanqi","shanghai":"pudong"}
2 tuhao2 32 man ["lol","book","movie"] 
{
    
    "beijing":"xisanqi","shanghai":"pudong"}
3 tuhao3 12 boy ["lol","book","movie"] 
{
    
    "beijing":"xisanqi","shanghai":"pudong"}
4 tuhao4 32 man ["lol","book","movie"] 
{
    
    "beijing":"xisanqi","shanghai":"pudong"}
5 tuhao5 12 boy ["lol","movie"]
{
    
    "beijing":"xisanqi","shanghai":"pudong"}
6 tuhao6 32 man ["lol","book","movie"] 
{
    
    "beijing":"xisanqi","shanghai":"pudong"}
7 tuhao7 32 man ["lol","book"] 
{
    
    "beijing":"xisanqi","shanghai":"pudong"}
8 tuhao8 12 boy ["lol","book"] 
{
    
    "beijing":"xisanqi","shanghai":"pudong"}
9 tuhao9 32 man ["lol","book","movie"] 
{
    
    "beijing":"xisanqi","shanghai":"pudong"}

创建分区表:

create table person22(
id int,
name string,
likes array<string>,
address map<string,string>
)
partitioned by(age int,gender string)
row format delimited
fields terminated by ','
collection items terminated by '-'
map keys terminated by ':';

将person21中的数据加载到person22表中:

from person21
insert overwrite table person22 partition(age,gender)
select id,name,likes,address,age,gender distribute by age,gender

简写为

from person21
insert overwrite table person22 partition(age,gender)
select id,name,likes,address,age,gender;

注意:分区字段一定要写到最后面,否则容易出错。
遇到报错:FAILED: SemanticException [Error 10096]: Dynamic partition strict mode requires at least one static partition column. To turn this off set hive.exec.dynamic.partition.mode=nonstrict
解决方法:

hive> set hive.exec.dynamic.partition=true;
hive> set hive.exec.dynamic.partition.mode=nonstrict;
show partitions person22;#查询某表上的 分区

查看表的分区:

hive> show partitions person22;
OK
partition
age=12/gender=boy
age=32/gender=man
Time taken: 0.331 seconds, Fetched: 2 row(s)

相关参数

set hive.exec.max.dynamic.partitions.pernode;
每一个执行mr节点上,允许创建的动态分区的最大数量(100)
set hive.exec.max.dynamic.partitions;
所有执行mr节点上,允许创建的所有动态分区的最大数量
(1000)
set hive.exec.max.created.files;
所有的mr job允许创建的文件的最大数量(100000)

猜你喜欢

转载自blog.csdn.net/m0_63953077/article/details/130473041