Hive: Dynamic Partition Operations

Reference: https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DML#LanguageManualDML-Loadingfilesintotables

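Related parameters:

-- Enable dynamic partitioning (default: false in Hive 0.8.1 and earlier, true since Hive 0.9.0).
set hive.exec.dynamic.partition=true;
-- Partition mode (default: strict). In strict mode at least one partition must be static, which guards against accidentally overwriting all partitions; in nonstrict mode every partition may be dynamic.
set hive.exec.dynamic.partition.mode=nonstrict;
-- Maximum number of dynamic partitions that each mapper/reducer node may create (default: 100).
set hive.exec.max.dynamic.partitions.pernode=100;
-- Maximum number of dynamic partitions that may be created in total (default: 1000).
set hive.exec.max.dynamic.partitions=1000;
-- Maximum number of HDFS files that all mappers/reducers in one MapReduce job may create (default: 100000).
set hive.exec.max.created.files=100000;
-- Whether to throw an exception when a dynamic partition insert produces an empty result (default: false).
set hive.error.on.empty.partition=false;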
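
The difference between strict and nonstrict mode is easiest to see with a mixed insert. The following is a minimal sketch, not taken from the original post: the tables orders_src and orders_part and the date value are hypothetical and exist only for illustration. In strict mode the statement is accepted because biz_date is static; replacing the clause with partition(biz_date, etl_date) would be rejected until the mode is set to nonstrict.

```sql
-- Hypothetical tables for illustration only (assumptions, not from the original post).
create table if not exists orders_src (
  order_no string,
  total_amt string,
  biz_date string,
  etl_date string);

create table if not exists orders_part (
  order_no string,
  total_amt string)
partitioned by (biz_date string, etl_date string)
stored as parquet;

set hive.exec.dynamic.partition=true;
set hive.exec.dynamic.partition.mode=strict;

-- biz_date is static (fixed in the PARTITION clause) and must precede the dynamic column.
-- etl_date is dynamic: its value is taken from the last column of the SELECT list.
insert into table orders_part partition(biz_date='2020-06-01', etl_date)
select order_no, total_amt, etl_date
from orders_src
where biz_date = '2020-06-01';
```

Static partition columns always come before dynamic ones in the PARTITION clause, so a table partitioned by (biz_date, etl_date) can fix biz_date and leave etl_date dynamic, but not the other way round.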

Demo:

drop table if exists ${table_name};
create table if not exists ${table_name}(
order_no string,
zip string,
mobile string,
total_amt string)
partitioned by (
biz_date string,
etl_date string)
stored as parquet;

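-- Submit the job to a specific YARN queue (queue_name is a placeholder).
set mapreduce.job.queuename=queue_name;
-- Enable dynamic partitioning and allow every partition column to be dynamic.
set hive.exec.dynamic.partition=true;
set hive.exec.dynamic.partition.mode=nonstrict;
-- Raise the partition limits well above their defaults (100 per node, 1000 in total) for large loads.
set hive.exec.max.dynamic.partitions.pernode=100000;
set hive.exec.max.dynamic.partitions=100000;
set hive.exec.max.created.files=100000;
-- The dynamic partition columns must come last in the SELECT list, in the same order as in the PARTITION clause.
-- The source must be a different table from the target created above, for example the same table name in another database.
insert into table ${table_name} partition(biz_date,etl_date)
select
order_no,
zip,
mobile,
total_amt,
biz_date,
etl_date
from ${db_name}.${table_name};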
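
When a load like the demo hits the per-node partition limit or produces many small files, a common workaround is to route all rows of one partition to the same reducer. The sketch below is a variant of the demo insert under that assumption; it reuses the demo's ${table_name} and ${db_name} placeholders, and the distribute by clause and the final check are additions that do not appear in the original post.

```sql
set hive.exec.dynamic.partition=true;
set hive.exec.dynamic.partition.mode=nonstrict;

-- distribute by sends all rows with the same (biz_date, etl_date) to one reducer,
-- so each partition is written by a single task instead of by every task.
insert into table ${table_name} partition(biz_date, etl_date)
select order_no, zip, mobile, total_amt, biz_date, etl_date
from ${db_name}.${table_name}
distribute by biz_date, etl_date;

-- List the partitions that were created, one per line, as biz_date=.../etl_date=...
show partitions ${table_name};
```

This trades parallelism for fewer files: if one partition holds most of the data, the reducer that receives it becomes a bottleneck, so the approach fits best when partitions are roughly similar in size.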

Reposted from blog.csdn.net/qq_24256877/article/details/106501104