Installing Hive 2.x with Tez

Installing Hive 2.3
1) Upload apache-hive-2.3.6-bin.tar.gz to /export/software, then extract it to /export/servers

tar -zxvf apache-hive-2.3.6-bin.tar.gz -C /export/servers/

2) Rename apache-hive-2.3.6-bin to hive

mv apache-hive-2.3.6-bin hive

3) Copy the MySQL driver mysql-connector-java-5.1.27-bin.jar to /export/servers/hive/lib/

cp /export/software/mysql-libs/mysql-connector-java-5.1.27/mysql-connector-java-5.1.27-bin.jar /export/servers/hive/lib/
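The extract-and-rename steps above can be wrapped in a small function so that a missing tarball or an existing target directory is caught before anything is moved. This is only a sketch: `unpack_and_rename` is a hypothetical helper name, not something shipped with Hive.

```shell
# Hypothetical helper (not part of Hive): extract a tarball into a target
# directory and rename its top-level directory.
# Usage: unpack_and_rename <tarball> <dest_dir> <extracted_name> <new_name>
unpack_and_rename() (
  tarball=$1 dest=$2 extracted=$3 renamed=$4

  # Refuse to continue if the input is missing or the target already exists.
  [ -f "$tarball" ] || { echo "missing tarball: $tarball" >&2; exit 1; }
  [ -e "$dest/$renamed" ] && { echo "already exists: $dest/$renamed" >&2; exit 1; }

  mkdir -p "$dest" &&
  tar -zxf "$tarball" -C "$dest" &&
  mv "$dest/$extracted" "$dest/$renamed"
)

# Example, using the paths from the steps above:
#   unpack_and_rename /export/software/apache-hive-2.3.6-bin.tar.gz \
#       /export/servers apache-hive-2.3.6-bin hive
```

The function body runs in a subshell, so its variables do not leak into the calling shell and `exit 1` only ends the function.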

4) In /export/servers/hive/conf, create the hive-site.xml file

vim hive-site.xml

Add the following content:

<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
  <property>
    <name>javax.jdo.option.ConnectionURL</name>
    <value>jdbc:mysql://hadoop12:3306/hive?createDatabaseIfNotExist=true</value>
    <description>JDBC connect string for a JDBC metastore</description>
  </property>

  <property>
    <name>javax.jdo.option.ConnectionDriverName</name>
    <value>com.mysql.jdbc.Driver</value>
    <description>Driver class name for a JDBC metastore</description>
  </property>

  <property>
    <name>javax.jdo.option.ConnectionUserName</name>
    <value>root</value>
    <description>username to use against metastore database</description>
  </property>

  <property>
    <name>javax.jdo.option.ConnectionPassword</name>
    <value>000000</value>
    <description>password to use against metastore database</description>
  </property>

  <property>
    <name>hive.metastore.warehouse.dir</name>
    <value>/user/hive/warehouse</value>
    <description>location of default database for the warehouse</description>
  </property>

  <property>
    <name>hive.cli.print.header</name>
    <value>true</value>
  </property>

  <property>
    <name>hive.cli.print.current.db</name>
    <value>true</value>
  </property>

  <property>
    <name>hive.metastore.schema.verification</name>
    <value>false</value>
  </property>

  <property>
    <name>datanucleus.schema.autoCreateAll</name>
    <value>true</value>
  </property>

  <property>
    <name>hive.metastore.uris</name>
    <value>thrift://hadoop12:9083</value>
  </property>

  <property>
    <name>hive.execution.engine</name>
    <value>tez</value>
  </property>
</configuration>
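A quick way to confirm what a given property is set to is to read it back from the file. The function below is a rough sketch with a hypothetical name (`get_property`); it assumes every `<name>` and `<value>` element sits on a single line, as in the file above, and is not a general XML parser.

```shell
# Hypothetical helper: print the <value> of a named property from a
# Hadoop-style XML configuration file.
# Usage: get_property <file> <property-name>
get_property() {
  awk -v want="$2" '
    /<name>/ {
      # Strip everything around the property name on this line.
      n = $0
      sub(/.*<name>[ \t]*/, "", n)
      sub(/[ \t]*<\/name>.*/, "", n)
      found = (n == want)
    }
    found && /<value>/ {
      # First <value> after the matching <name>: print it and stop.
      v = $0
      sub(/.*<value>[ \t]*/, "", v)
      sub(/[ \t]*<\/value>.*/, "", v)
      print v
      exit
    }
  ' "$1"
}

# Example:
#   get_property /export/servers/hive/conf/hive-site.xml hive.metastore.uris
```

If the property is absent, the function prints nothing, which makes it easy to use in a check such as `[ -n "$(get_property ...)" ]`.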

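5) Once the services Hive depends on have finished starting (Hadoop, MySQL, and the metastore service that hive.metastore.uris points to), start Hive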

bin/hive

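Integrating the Tez Engine with Hive
1) Download the Tez package: http://tez.apache.org
2) Copy apache-tez-0.9.1-bin.tar.gz to the /export/software directory on hadoop102
3) Upload apache-tez-0.9.1-bin.tar.gz to the /tez directory on HDFS

hadoop fs -mkdir /tez
hadoop fs -put /export/software/apache-tez-0.9.1-bin.tar.gz /tez

4) Extract apache-tez-0.9.1-bin.tar.gz

tar -zxvf apache-tez-0.9.1-bin.tar.gz -C /export/servers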

5) Rename the extracted directory

mv apache-tez-0.9.1-bin/ tez-0.9.1

Integrating Tez
1) Go to Hive's configuration directory: /export/servers/hive/conf
2) Create a tez-site.xml file in /export/servers/hive/conf

vim tez-site.xml

Add the following content:

<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
  <property>
    <name>tez.lib.uris</name>
    <value>${fs.defaultFS}/tez/apache-tez-0.9.1-bin.tar.gz</value>
  </property>
  <property>
    <name>tez.use.cluster.hadoop-libs</name>
    <value>true</value>
  </property>
  <property>
    <name>tez.history.logging.service.class</name>
    <value>org.apache.tez.dag.history.logging.ats.ATSHistoryLoggingService</value>
  </property>
</configuration>
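Because tez.lib.uris points at the tarball's location on HDFS, the file has to exist at exactly that path before Hive can launch a Tez session. The sketch below checks for it and uploads it only when it is missing. The function names and the `HADOOP_CMD` override variable are hypothetical (the override exists only so the logic can be exercised without a cluster); `hadoop fs -test -e`, `-mkdir -p`, and `-put` are the standard HDFS shell commands.

```shell
# HDFS path that tez.lib.uris must resolve to, relative to fs.defaultFS.
# Usage: tez_hdfs_path <local-tarball>
tez_hdfs_path() {
  echo "/tez/$(basename "$1")"
}

# Upload the local Tez tarball to HDFS unless it is already there.
# HADOOP_CMD is a hypothetical override for the hadoop executable.
# Usage: upload_tez_tarball <local-tarball>
upload_tez_tarball() (
  target=$(tez_hdfs_path "$1")
  if ${HADOOP_CMD:-hadoop} fs -test -e "$target"; then
    echo "already present: $target"
  else
    ${HADOOP_CMD:-hadoop} fs -mkdir -p /tez &&
    ${HADOOP_CMD:-hadoop} fs -put "$1" "$target"
  fi
)

# Example:
#   upload_tez_tarball /export/software/apache-tez-0.9.1-bin.tar.gz
```

Deriving the HDFS path from the tarball's file name keeps the upload location and the tez.lib.uris value from drifting apart when the Tez version changes.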

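3) Add the Tez environment variables and the dependency jar variables to hive-env.sh

mv hive-env.sh.template hive-env.sh
vim hive-env.sh

Append the following configuration at the end:

export TEZ_HOME=/export/servers/tez-0.9.1    # your Tez extraction directory
export TEZ_JARS=""
# Collect the jars in the top level of TEZ_HOME
for jar in `ls $TEZ_HOME |grep jar`; do
    export TEZ_JARS=$TEZ_JARS:$TEZ_HOME/$jar
done
# Collect everything under TEZ_HOME/lib
for jar in `ls $TEZ_HOME/lib`; do
    export TEZ_JARS=$TEZ_JARS:$TEZ_HOME/lib/$jar
done

# TEZ_JARS starts with ":", so it is appended directly after the hadoop-lzo jar path
export HIVE_AUX_JARS_PATH=/export/servers/hadoop-2.7.7/share/hadoop/common/hadoop-lzo-0.4.20.jar$TEZ_JARS

4) In hive-site.xml, make sure the following property is set so that Hive uses Tez as its execution engine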

<property>
<name>hive.execution.engine</name>
<value>tez</value>
</property>
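The jar-collecting loops in hive-env.sh parse the output of `ls`, which breaks on unusual file names and, for the lib directory, picks up every file rather than only jars. A glob-based variant is sketched below; `build_tez_jars` is a hypothetical helper name, and the result is similar to, not byte-for-byte identical with, what the original loops produce.

```shell
# Hypothetical helper: print a colon-prefixed, colon-separated list of every
# .jar file in <tez_home> and <tez_home>/lib.
# Usage: build_tez_jars <tez_home>
build_tez_jars() (
  tez_home=$1
  jars=""
  for jar in "$tez_home"/*.jar "$tez_home"/lib/*.jar; do
    # An unmatched glob stays literal, so skip anything that is not a file.
    [ -f "$jar" ] && jars="$jars:$jar"
  done
  echo "$jars"
)

# Example for hive-env.sh:
#   export TEZ_HOME=/export/servers/tez-0.9.1
#   export TEZ_JARS=$(build_tez_jars "$TEZ_HOME")
```

The leading colon is kept on purpose, so the value can still be appended directly after another jar path when building HIVE_AUX_JARS_PATH.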

Testing
1) Start Hive

bin/hive

2) Create a table

create table student(
id int,
name string);

3) Insert data into the table

insert into student values(1,"zhangsan");

4) If the query returns the row without errors, the setup works

hive (default)> select * from student;
1      zhangsan

Notes
1) When running Tez, a container may be killed by the NodeManager for exceeding its memory limit:

Caused by: org.apache.tez.dag.api.SessionNotRunning: TezSession has already shutdown. Application application_1546781144082_0005 failed 2 times due to AM Container for appattempt_1546781144082_0005_000002 exited with  exitCode: -103
For more detailed output, check application tracking page:http://hadoop102:8088/cluster/app/application_1546781144082_0005Then, click on links to logs of each attempt.
Diagnostics: Container [pid=11116,containerID=container_1546781144082_0005_02_000001] is running beyond virtual memory limits. Current usage: 216.3 MB of 1 GB physical memory used; 2.6 GB of 2.1 GB virtual memory used. Killing container.

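This error means that a Container running on a worker node exceeded its memory limit (here 2.6 GB of virtual memory against a 2.1 GB limit) and was killed by the NodeManager.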
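The 2.1 GB figure in the log is the container's physical allocation (1 GB) multiplied by yarn.nodemanager.vmem-pmem-ratio, whose default is 2.1. The arithmetic can be sketched as follows; both function names are hypothetical and exist only to illustrate the check.

```shell
# Virtual memory limit in MB for a container, given its physical memory
# allocation in MB and the vmem-pmem ratio (YARN's default ratio is 2.1).
# Usage: vmem_limit_mb <pmem_mb> [ratio]
vmem_limit_mb() {
  awk -v pmem="$1" -v ratio="${2:-2.1}" 'BEGIN { printf "%.1f\n", pmem * ratio }'
}

# Succeeds (exit status 0) if the virtual memory in use is over that limit.
# Usage: exceeds_vmem_limit <used_vmem_mb> <pmem_mb> [ratio]
exceeds_vmem_limit() {
  awk -v used="$1" -v pmem="$2" -v ratio="${3:-2.1}" \
    'BEGIN { exit !(used > pmem * ratio) }'
}

# Example with the numbers from the log: 2.6 GB = 2662.4 MB used, 1024 MB physical.
#   vmem_limit_mb 1024
#   exceeds_vmem_limit 2662.4 1024 && echo "container would be killed"
```

Raising yarn.nodemanager.vmem-pmem-ratio is another way to give containers more headroom while leaving the check enabled.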
2) Solution:
(1) Disable the virtual memory check by editing yarn-site.xml

<property>
<name>yarn.nodemanager.vmem-check-enabled</name>
<value>false</value>
</property>

(2) After making the change, be sure to distribute the file to every node and restart the Hadoop cluster.

xsync yarn-site.xml
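xsync is not a standard Hadoop command; it appears to be a site-specific distribution script. Where no such script is available, a rough stand-in built on rsync might look like the sketch below. The function name, the `DRY_RUN` switch, and the host names in the example are all hypothetical, and the sketch assumes passwordless SSH to each host and an absolute file path.

```shell
# Hypothetical stand-in for xsync: copy one file to the same absolute path on
# each listed host. Set DRY_RUN=1 to print the rsync commands instead of
# running them.
# Usage: sync_to_hosts <absolute-file-path> <host>...
sync_to_hosts() (
  file=$1
  shift
  for host in "$@"; do
    if [ "${DRY_RUN:-0}" = 1 ]; then
      echo "rsync -av $file $host:$file"
    else
      rsync -av "$file" "$host:$file" || exit 1
    fi
  done
)

# Example (host names are placeholders for the other cluster nodes):
#   sync_to_hosts /export/servers/hadoop-2.7.7/etc/hadoop/yarn-site.xml hadoop103 hadoop104
# Then restart YARN so the NodeManagers pick up the change, for example with
# Hadoop's stop-yarn.sh and start-yarn.sh scripts.
```

Running it once with `DRY_RUN=1` shows exactly which files would be overwritten on which hosts before anything is copied.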

Reposted from blog.csdn.net/qq_46548855/article/details/107702949