Installing Hive 2.3
1) Upload apache-hive-2.3.6-bin.tar.gz to the /export/software directory, then extract it to /export/servers
tar -zxvf apache-hive-2.3.6-bin.tar.gz -C /export/servers/
2) Rename apache-hive-2.3.6-bin to hive
mv apache-hive-2.3.6-bin hive
3) Copy the MySQL driver mysql-connector-java-5.1.27-bin.jar to /export/servers/hive/lib/
cp /export/software/mysql-libs/mysql-connector-java-5.1.27/mysql-connector-java-5.1.27-bin.jar /export/servers/hive/lib/
4) In the /export/servers/hive/conf directory, create a hive-site.xml file
vim hive-site.xml
Add the following content:
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
<property>
<name>javax.jdo.option.ConnectionURL</name>
<value>jdbc:mysql://hadoop102:3306/hive?createDatabaseIfNotExist=true</value>
<description>JDBC connect string for a JDBC metastore</description>
</property>
<property>
<name>javax.jdo.option.ConnectionDriverName</name>
<value>com.mysql.jdbc.Driver</value>
<description>Driver class name for a JDBC metastore</description>
</property>
<property>
<name>javax.jdo.option.ConnectionUserName</name>
<value>root</value>
<description>username to use against metastore database</description>
</property>
<property>
<name>javax.jdo.option.ConnectionPassword</name>
<value>000000</value>
<description>password to use against metastore database</description>
</property>
<property>
<name>hive.metastore.warehouse.dir</name>
<value>/user/hive/warehouse</value>
<description>location of default database for the warehouse</description>
</property>
<property>
<name>hive.cli.print.header</name>
<value>true</value>
</property>
<property>
<name>hive.cli.print.current.db</name>
<value>true</value>
</property>
<property>
<name>hive.metastore.schema.verification</name>
<value>false</value>
</property>
<property>
<name>datanucleus.schema.autoCreateAll</name>
<value>true</value>
</property>
<property>
<name>hive.metastore.uris</name>
<value>thrift://hadoop102:9083</value>
</property>
<property>
<name>hive.execution.engine</name>
<value>tez</value>
</property>
</configuration>
5) After the metastore service has started, start Hive
bin/hive
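Starting the metastore service mentioned in step 5 is not shown above. Because hive.metastore.uris is set in hive-site.xml, the metastore must be running before bin/hive can connect. A typical sequence (a sketch assuming the directory layout used in this guide, and that the schema has not been initialized yet) is:

```shell
cd /export/servers/hive
# Initialize the metastore schema in MySQL (first run only)
bin/schematool -dbType mysql -initSchema
# Start the metastore service in the background, then the CLI
nohup bin/hive --service metastore > metastore.log 2>&1 &
bin/hive
```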
Integrating Hive with the Tez Engine
1) Download the Tez package from http://tez.apache.org
2) Copy apache-tez-0.9.1-bin.tar.gz to the /export/software directory on hadoop102
3) Upload apache-tez-0.9.1-bin.tar.gz to the /tez directory on HDFS
hadoop fs -mkdir /tez
hadoop fs -put /export/software/apache-tez-0.9.1-bin.tar.gz /tez
4) Extract the package locally
tar -zxvf apache-tez-0.9.1-bin.tar.gz -C /export/servers
5) Rename the directory
mv apache-tez-0.9.1-bin/ tez-0.9.1
Integrating Tez
1) Go to Hive's configuration directory: /export/servers/hive/conf
2) In /export/servers/hive/conf, create a tez-site.xml file
vim tez-site.xml
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
<property>
<name>tez.lib.uris</name>
<value>${fs.defaultFS}/tez/apache-tez-0.9.1-bin.tar.gz</value>
</property>
<property>
<name>tez.use.cluster.hadoop-libs</name>
<value>true</value>
</property>
<property>
<name>tez.history.logging.service.class</name>
<value>org.apache.tez.dag.history.logging.ats.ATSHistoryLoggingService</value>
</property>
</configuration>
3) In hive-env.sh, add the Tez environment variable and dependency-jar configuration
mv hive-env.sh.template hive-env.sh
vim hive-env.sh
Append the following at the end:
export TEZ_HOME=/export/servers/tez-0.9.1    # your Tez extraction directory
export TEZ_JARS=""
for jar in `ls $TEZ_HOME | grep jar`; do
export TEZ_JARS=$TEZ_JARS:$TEZ_HOME/$jar
done
for jar in `ls $TEZ_HOME/lib`; do
export TEZ_JARS=$TEZ_JARS:$TEZ_HOME/lib/$jar
done
# $TEZ_JARS already begins with ":", so no extra separator is needed here
export HIVE_AUX_JARS_PATH=/export/servers/hadoop-2.7.7/share/hadoop/common/hadoop-lzo-0.4.20.jar$TEZ_JARS
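The jar-collection loop above can be checked on its own against a throwaway directory (the jar names below are made up purely for the demonstration):

```shell
#!/bin/sh
# Build a fake Tez layout in a temp directory (hypothetical jar names)
TEZ_HOME=$(mktemp -d)
mkdir -p "$TEZ_HOME/lib"
touch "$TEZ_HOME/tez-api-0.9.1.jar" "$TEZ_HOME/lib/commons-io-2.4.jar"

# Same collection logic as hive-env.sh: top-level jars, then lib/ jars,
# each prefixed with ":" so the list can be appended directly to a classpath
TEZ_JARS=""
for jar in "$TEZ_HOME"/*.jar; do
TEZ_JARS=$TEZ_JARS:$jar
done
for jar in "$TEZ_HOME"/lib/*.jar; do
TEZ_JARS=$TEZ_JARS:$jar
done
echo "$TEZ_JARS"
rm -rf "$TEZ_HOME"
```

Note that the resulting string starts with ":", which is why hive-env.sh can concatenate it directly after the LZO jar path without writing an extra separator.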
4) In hive-site.xml, set Hive's execution engine to Tez by adding the following (the hive-site.xml shown earlier already contains this property; verify it is present):
<property>
<name>hive.execution.engine</name>
<value>tez</value>
</property>
Testing
1) Start Hive
bin/hive
2) Create a table
create table student(
id int,
name string);
3) Insert data into the table
insert into student values(1,"zhangsan");
4) If no errors occur, the integration succeeded
hive (default)> select * from student;
1 zhangsan
Notes
1) When running Tez, the job may fail because a Container is detected using too much memory and is killed by the NodeManager:
Caused by: org.apache.tez.dag.api.SessionNotRunning: TezSession has already shutdown. Application application_1546781144082_0005 failed 2 times due to AM Container for appattempt_1546781144082_0005_000002 exited with exitCode: -103
For more detailed output, check application tracking page: http://hadoop102:8088/cluster/app/application_1546781144082_0005 Then, click on links to logs of each attempt.
Diagnostics: Container [pid=11116,containerID=container_1546781144082_0005_02_000001] is running beyond virtual memory limits. Current usage: 216.3 MB of 1 GB physical memory used; 2.6 GB of 2.1 GB virtual memory used. Killing container.
This happens when a Container running on a worker node tries to use too much memory and is killed by the NodeManager.
2) Solution:
(1) Disable the virtual memory check by modifying yarn-site.xml:
<property>
<name>yarn.nodemanager.vmem-check-enabled</name>
<value>false</value>
</property>
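Alternatively (an option not mentioned in the original text), instead of disabling the check entirely you can raise YARN's virtual-to-physical memory ratio, which defaults to 2.1, so the 2.6 GB of virtual memory in the error above stays within the limit:

```xml
<property>
<name>yarn.nodemanager.vmem-pmem-ratio</name>
<value>4</value>
</property>
```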
(2) After making the change, be sure to distribute the file and restart the Hadoop cluster.
xsync yarn-site.xml