How to install Hive 3 in pseudo-distributed mode on CentOS 6.5

Table of contents

1. Installation instructions

2. Configure metadata management database on MySQL side

3. Prepare hive environment

1. Extract the hive installation package to /home/hadoop. After the decompression is complete, rename the folder to hive3.

 2. Prepare the driver package for the corresponding version of MySQL, and copy the Java driver to the jar package folder of hive (lib)

1) Check the MySQL version

2) Upload the driver’s hive jar package library

3) Inspection and viewing

4) The guava.jar shipped with hive and the guava.jar shipped with hadoop are different versions, which causes a version conflict. Here, the lower version is replaced with the higher version.

5) Configure environment variables (my environment uses user-level environment variables in ~/.bash_profile; if you use system-level environment variables, configure /etc/profile instead)

4. Configure hive files

1. Enter the configuration file directory under the hive installation directory, and then modify the configuration file (cd /home/hadoop/software/hive3.1.2/conf)

2. Then configure hive-env.sh and hive-default.xml in the changed directory

3. Configure hive-env.sh, source it and refresh it after modification.

4. Configure hive-default.xml file

5. Test installation

1. Start our Hadoop cluster

2. Initialize the metadata database (schematool -dbType mysql -initSchema)

3. Log in to hive, nice, done!

4. Log in to beeline


1. Installation instructions

This guide assumes that a working Hadoop environment already exists (I currently need pseudo-distributed mode, so my demonstration environment is pseudo-distributed) and that the Hive version is compatible with it. My Hadoop is version 3, so I chose Hive 3 as well; version compatibility always deserves attention in a big data environment. Hive also needs a host that manages its metadata. Here MySQL stores the Hive metadata. Related links:

Hadoop distributed installation

MySQL installation

2. Configure metadata management database on MySQL side

1. Create a MySQL user; here it is named myhive

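In MySQL 5, you can create a user with the following steps:

1. Connect to the MySQL database server with the following command:
   ```
   mysql -u root -p
   ```
   You will be prompted for the password of the MySQL root user.

2. Run the following SQL statement to create a new user:
   ```
   CREATE USER 'username'@'localhost' IDENTIFIED BY 'password';
   ```
   Replace `username` with the user name you want to create and `password` with that user's password. `localhost` means the user can connect only from the local machine; to allow remote connections, change `localhost` to the permitted host name or IP address.

3. Grant the user appropriate privileges. For example, to give the user all privileges on a specific database:
   ```
   GRANT ALL PRIVILEGES ON database_name.* TO 'username'@'localhost';
   ```
   Replace `database_name` with the name of the database you want to grant privileges on.

4. Finally, reload the privileges so that the changes take effect:
   ```
   FLUSH PRIVILEGES;
   ```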



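In MySQL 8, the steps to create a user and grant it privileges for remote access are as follows:

1. Connect to the MySQL database server with the following command:
   ```
   mysql -u root -p
   ```
   You will be prompted for the password of the MySQL root user.

2. Run the following SQL statement to create a new user:
   ```
   CREATE USER 'username'@'%' IDENTIFIED BY 'password';
   ```
   Replace `username` with the user name you want to create and `password` with that user's password. `%` means the user can connect from any host.

3. Grant the user appropriate privileges. For example, to give the user all privileges on all databases:
   ```
   GRANT ALL PRIVILEGES ON *.* TO 'username'@'%';
   ```
   To give the user privileges on a specific database only:
   ```
   GRANT ALL PRIVILEGES ON `database_name`.* TO 'username'@'%';
   ```
   Replace `database_name` with the name of the database you want to grant privileges on.

4. Finally, reload the privileges so that the changes take effect:
   ```
   FLUSH PRIVILEGES;
   ```

Note that a fresh MySQL 8 installation may not accept remote connections by default, so some additional configuration may be needed before the server accepts them. If so, consult the official MySQL documentation or the documentation for your operating system.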


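To allow remote connections in MySQL 8, perform the following steps:

1. Connect to the MySQL database server with the following command:
   ```
   mysql -u root -p
   ```
   You will be prompted for the password of the MySQL root user.

2. Run the following SQL statements to create a new user and grant it remote access:
   ```
   CREATE USER 'username'@'%' IDENTIFIED BY 'password';
   GRANT ALL PRIVILEGES ON *.* TO 'username'@'%' WITH GRANT OPTION;
   ```
   Replace `'username'` with the user name you want to create and `'password'` with that user's password.

3. Update the bind address in the configuration file (if needed):
   - Open the MySQL configuration file. On Linux it is often `/etc/mysql/mysql.conf.d/mysqld.cnf` (on CentOS it is typically `/etc/my.cnf`); on Windows it is usually `C:\ProgramData\MySQL\MySQL Server 8.0\my.ini`.
   - Find the `bind-address` parameter and set it to the IP address of the database server, or to `0.0.0.0` to accept connections on all addresses. For example: `bind-address = 0.0.0.0`.
   - Save and close the configuration file.

4. Restart the MySQL server so that the configuration change takes effect. On Linux you can restart it with:
   ```
   sudo service mysql restart
   ```
   On CentOS the service is usually named `mysqld` rather than `mysql`. On Windows, restart the `MySQL80` service from the service manager.

After these steps, MySQL 8 accepts remote connections. Make sure the firewall allows remote hosts to reach the MySQL server port (3306 by default). For security, keep the user's privileges as narrow as possible and review the grants carefully.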


The steps in my environment are as follows

Check that the user was created:

Create a database dedicated to storing metadata

Create database:
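The two MySQL steps above can be sketched as a small shell helper that only prints the SQL, so it can be reviewed before anything is run. The names myhive, 123456 and hive_metadata are the values that appear in this article's configuration file; the `'%'` host pattern and the helper name `metastore_sql` are assumptions made for illustration, not the exact statements from the original screenshots.

```shell
# Sketch: build the SQL that prepares the Hive metastore account and database.
# Assumption: the user may connect from any host ('%'); narrow this if you can.
metastore_sql() {
    local user="$1" pass="$2" db="$3"
    cat <<EOF
CREATE USER '${user}'@'%' IDENTIFIED BY '${pass}';
CREATE DATABASE IF NOT EXISTS ${db};
GRANT ALL PRIVILEGES ON ${db}.* TO '${user}'@'%';
FLUSH PRIVILEGES;
EOF
}

# Print the statements for review.
metastore_sql myhive 123456 hive_metadata

# To apply them, pipe the output into the mysql client as root, for example:
#   metastore_sql myhive 123456 hive_metadata | mysql -u root -p
```

Printing first and piping second keeps the sketch free of side effects until you decide to run it against the server.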

3. Prepare hive environment

1. Extract the hive installation package to /home/hadoop. After the decompression is complete, rename the folder to hive3.

Extract: tar -zxvf <installation package>

Rename: mv <old folder name> <new folder name>
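As a rough sketch, the extract-and-rename step could be wrapped in one function. The function name `install_hive` and the tarball name in the example call are assumptions; use the file name of the package you actually downloaded.

```shell
# Sketch: extract a Hive tarball into a target directory and rename the result.
install_hive() {
    local tarball="$1" dest="$2" new_name="$3"
    local top
    # The first path component inside the archive is the extracted folder name.
    top="$(tar -tzf "$tarball" | head -n 1 | cut -d/ -f1)"
    [ -n "$top" ] || return 1
    tar -zxf "$tarball" -C "$dest" || return 1
    mv "$dest/$top" "$dest/$new_name"
}

# Example call (commented out so pasting the sketch changes nothing):
#   install_hive ~/apache-hive-3.1.2-bin.tar.gz /home/hadoop hive3
```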

 2. Prepare the driver package for the corresponding version of MySQL, and copy the Java driver to the jar package folder of hive (lib)

You can upload it directly to the folder

1) Check the MySQL version

2) Upload the driver jar to Hive's lib folder

3) Check that the jar is in place

4) The guava.jar shipped with hive and the guava.jar shipped with hadoop are different versions, which causes a version conflict. Here, the lower version is replaced with the higher version.

Check the guava version on the Hive side (in Hive's lib folder)

Check the guava version on the Hadoop side (/home/hadoop/software/hadoop-3.3.0/share/hadoop/common/lib)

Delete the lower version on the Hive side

Copy the jar from the Hadoop side to the Hive side (cp guava-27.0-jre.jar /home/hadoop/software/hive3/lib)

Check the result
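The guava swap above can be sketched as one function that removes whatever guava jar Hive ships and copies in the one found on the Hadoop side. The function name `sync_guava` is hypothetical; the two paths in the example call are the ones mentioned in this article and may differ on your machine.

```shell
# Sketch: replace Hive's bundled guava jar with the one Hadoop ships.
sync_guava() {
    local hadoop_lib="$1" hive_lib="$2"
    local jar
    # Pick Hadoop's guava jar and stop if there is none.
    jar="$(ls "$hadoop_lib"/guava-*.jar 2>/dev/null | head -n 1)"
    [ -n "$jar" ] || { echo "no guava jar in $hadoop_lib" >&2; return 1; }
    # Remove Hive's copy, then bring over Hadoop's version.
    rm -f "$hive_lib"/guava-*.jar
    cp "$jar" "$hive_lib"/
}

# Example call (commented out so pasting the sketch changes nothing):
#   sync_guava /home/hadoop/software/hadoop-3.3.0/share/hadoop/common/lib \
#              /home/hadoop/software/hive3/lib
```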

5) Configure environment variables (my environment uses user-level environment variables in ~/.bash_profile; if you use system-level environment variables, configure /etc/profile instead)

Leave insert mode, save and quit with :wq, then reload the environment variables: source ~/.bash_profile
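The lines added to ~/.bash_profile are typically of the following kind. This is a minimal sketch: the HIVE_HOME value is an assumption derived from the lib path used earlier in this article, so adjust it to wherever you extracted Hive.

```shell
# Sketch of the entries for ~/.bash_profile.
# Assumption: Hive lives in /home/hadoop/software/hive3.
export HIVE_HOME=/home/hadoop/software/hive3
export PATH="$PATH:$HIVE_HOME/bin"
```

After sourcing the profile, `echo $HIVE_HOME` should print the directory you configured.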

4. Configure hive files

1. Enter the configuration file directory under the hive installation directory, and then modify the configuration file (cd /home/hadoop/software/hive3.1.2/conf)

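2. Then configure hive-env.sh and hive-default.xml in that directory. Note that Hive reads its site settings from a file named hive-site.xml (the file name that appears in the error message in section 5); hive-default.xml.template in the same directory only documents the defaults.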

3. Configure hive-env.sh, source it and refresh it after modification.
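A hive-env.sh for this kind of setup usually needs only a few exports. The sketch below is an assumption about what such a file might contain, not a copy of the author's file: the Hadoop path is the one mentioned earlier in this article, and the Hive paths fall back to /home/hadoop/software/hive3 when HIVE_HOME is not set.

```shell
# Sketch of hive-env.sh entries (the file is usually created by copying
# hive-env.sh.template). All paths are assumptions; match them to your layout.
export HADOOP_HOME=/home/hadoop/software/hadoop-3.3.0
export HIVE_CONF_DIR="${HIVE_HOME:-/home/hadoop/software/hive3}/conf"
export HIVE_AUX_JARS_PATH="${HIVE_HOME:-/home/hadoop/software/hive3}/lib"
```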

4. Configure hive-default.xml file

1) Configure the management method of metadata

2) Configure the mysql driver

Note:

For MySQL 5 (Connector/J 5.x) the driver class is: com.mysql.jdbc.Driver

For MySQL 8 (Connector/J 8.x) the driver class is: com.mysql.cj.jdbc.Driver

3) Configure the username to connect to mysql

4) Configure the password to connect to mysql

5) Configure mysql login password

6) Configure hive mode

7) Configure whether to display the header information of the current table

8) Configure whether to display the current database name

Overall configuration file:

<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
    <!-- Select the current mode -->
    <property>
        <name>hive.metastore.local</name>
        <value>true</value>
    </property>

    <!-- MySQL settings for storing the metadata -->
    <property>
        <name>javax.jdo.option.ConnectionURL</name>
        <value>jdbc:mysql://hadooptest:3306/hive_metadata?createDatabaseIfNotExist=true&amp;useSSL=false&amp;useUnicode=true&amp;characterEncoding=UTF-8</value>
    </property>

    <!-- useSSL=false versus useSSL=true:
         SSL (Secure Sockets Layer) encrypts the connection to MySQL. With newer
         MySQL versions the driver attempts or warns about SSL unless useSSL=false
         is set explicitly, so it is added here; with older versions it can usually
         be omitted. useSSL=true secures the connection and typically requires
         certificates to be configured, while useSSL=false connects with just the
         account and password. useSSL=false is the usual choice for a setup like
         this one, especially when deploying on Linux. -->

    <!-- MySQL driver class -->
    <property>
        <name>javax.jdo.option.ConnectionDriverName</name>
        <value>com.mysql.cj.jdbc.Driver</value>
    </property>

    <!-- MySQL user name -->
    <property>
        <name>javax.jdo.option.ConnectionUserName</name>
        <value>myhive</value>
    </property>

    <!-- MySQL login password -->
    <property>
        <name>javax.jdo.option.ConnectionPassword</name>
        <value>123456</value>
    </property>

    <!-- Run Hive in local mode -->
    <property>
        <name>hive.exec.mode.local.auto</name>
        <value>true</value>
        <description>Let Hive determine whether to run in local mode automatically</description>
    </property>

    <!-- Whether to display the header of the current table -->
    <property>
        <name>hive.cli.print.header</name>
        <value>true</value>
    </property>

    <!-- Whether to display the name of the current database -->
    <property>
        <name>hive.cli.print.current.db</name>
        <value>true</value>
    </property>
</configuration> 

5. Test installation

1. Start our Hadoop cluster

2. Initialize the metadata database (schematool -dbType mysql -initSchema)

Initialize the metadata
cd /usr/local/bigdata/apache-hive-3.1.2-bin
bin/schematool -initSchema -dbType mysql -verbose
# How to tell whether initialization succeeded: a successful run creates 74 tables in MySQL

Error reported:

Caused by: com.ctc.wstx.exc.WstxParsingException: Illegal character entity: expansion character (code 0x8).

According to the information I found, the way to solve this problem is to open the hive-site.xml file, go to the line reported in the error, and delete the special character "&#8;" in it.

Here I comment it out instead (and show how to quickly locate it)
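One way to locate the offending entity quickly is to search the file for `&#8;` with line numbers. The sketch below shows that, plus an optional in-place removal that keeps a backup. The function names are hypothetical, and removing the entity is an alternative to commenting the line out by hand.

```shell
# Sketch: find and optionally strip the illegal character entity &#8;.
find_bad_entity() {
    # Prints "line_number:line_content" for every line containing &#8;
    grep -n '&#8;' "$1"
}

strip_bad_entity() {
    # Deletes every &#8; in place and keeps the original as <file>.bak
    sed -i.bak 's/&#8;//g' "$1"
}

# Example calls (commented out so pasting the sketch changes nothing):
#   find_bad_entity "$HIVE_HOME/conf/hive-site.xml"
#   strip_bad_entity "$HIVE_HOME/conf/hive-site.xml"
```

In vi you can then jump straight to a reported line with `:<line number>`.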

Run the initialization again!

Result:

3. Log in to hive, nice, done!

4. Log in to beeline
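A beeline login might look like the sketch below. It assumes HiveServer2 is already running on this machine on its default port 10000 and that the connecting user is hadoop; both are assumptions, since the original screenshots are not available. The guard lets the snippet run harmlessly on a machine where beeline is not installed.

```shell
# Sketch: connect to HiveServer2 with beeline.
# Assumptions: HiveServer2 listens on localhost:10000 and the user is hadoop.
HS2_URL="jdbc:hive2://localhost:10000"

if command -v beeline >/dev/null 2>&1; then
    # HiveServer2 must be started first, for example with: nohup hiveserver2 &
    beeline -u "$HS2_URL" -n hadoop -e 'show databases;'
else
    echo "beeline not found on PATH; would connect to $HS2_URL"
fi
```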


Origin blog.csdn.net/qq_57492774/article/details/133091807