Installing Hadoop, MySQL, and Hive on a Mac: a tutorial

Before installing Hadoop, MySQL, and Hive, first make sure the JDK is installed on your computer.

I. Configure the JDK

1. Download the JDK

http://www.oracle.com/technetwork/java/javase/downloads/jdk8-downloads-2133151.html

2. Configure Environment Variables

(1) In the terminal, run sudo su to switch to the root user;

(2) Open the profile file with the vim /etc/profile command, press "i" to enter insert mode, and add the following lines to the file:

       Note: JAVA_HOME should point to your own JDK installation path.

JAVA_HOME="/Library/Java/JavaVirtualMachines/jdk1.8.0_151.jdk/Contents/Home"
CLASS_PATH="$JAVA_HOME/lib"
PATH=".:$PATH:$JAVA_HOME/bin"
export JAVA_HOME CLASS_PATH PATH

(3) Press "esc" to leave insert mode, then type ":wq" and press Enter to save and exit.

(4) Close and reopen the terminal, then run the java -version command to check the JDK configuration.
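A quick way to confirm the JDK is visible; the version string in the comment is only an example and will match whichever JDK you installed:

java -version
# e.g. java version "1.8.0_151"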

II. Configure Hadoop

1. Download Hadoop

Hadoop 2.7.7 mirror download link: https://mirrors.tuna.tsinghua.edu.cn/apache/hadoop/common/hadoop-2.7.7/hadoop-2.7.7.tar.gz

2. Copy the downloaded archive to the /Users/finup/opt directory and extract it, as sketched below.
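A minimal sketch of this step, assuming the archive was saved to ~/Downloads (adjust the paths to your own setup):

# copy the Hadoop archive to the target directory and extract it there
mkdir -p /Users/finup/opt
cp ~/Downloads/hadoop-2.7.7.tar.gz /Users/finup/opt/
cd /Users/finup/opt
tar -zxvf hadoop-2.7.7.tar.gz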

3. Configure Hadoop

(1) Enter the /Users/finup/opt/hadoop-2.7.7/etc/hadoop/ directory and modify the hadoop-env.sh configuration file.

First, look up the JDK installation path for JAVA_HOME by entering the following command:

/usr/libexec/java_home

Result: /Library/Java/JavaVirtualMachines/jdk1.8.0_191.jdk/Contents/Home

Then modify the hadoop-env.sh configuration file:

export JAVA_HOME="/Library/Java/JavaVirtualMachines/jdk1.8.0_191.jdk/Contents/Home"
export HADOOP_OPTS="$HADOOP_OPTS -Djava.net.preferIPv4Stack=true -Djava.security.krb5.realm= -Djava.security.krb5.kdc="

(2) Configure the HDFS address and port

Enter the /Users/finup/opt/hadoop-2.7.7/etc/hadoop/ directory and modify the core-site.xml configuration file:

<configuration>
	<property>
             <name>hadoop.tmp.dir</name>   
             <value>file:/Users/finup/opt/hadoop-2.7.7/tmp</value>
             <description>Abase for other temporary directories.</description>
        </property>
        <property>
             <name>fs.defaultFS</name>
             <value>hdfs://localhost:9000</value>
        </property>
</configuration>

hadoop.tmp.dir specifies the directory where Hadoop places the temporary data its processes produce.

fs.defaultFS sets Hadoop's default file system, here "hdfs://localhost:9000". localhost is the NameNode host and 9000 is the port number; the host and port used by the HDFS daemons are determined by this NameNode property.
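As a hedged sanity check, once the Hadoop environment variables from step (7) below are set (or when running from the bin directory), the configured value can be read back with getconf:

hdfs getconf -confKey fs.defaultFS
# expected output: hdfs://localhost:9000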

(3) Configure the default number of HDFS block replicas

Enter the /Users/finup/opt/hadoop-2.7.7/etc/hadoop/ directory and modify the hdfs-site.xml configuration file:

<configuration>
        <property>
             <name>dfs.replication</name>   
             <value>1</value>
        </property>
        <property>
             <name>dfs.namenode.name.dir</name>
             <value>file:/Users/finup/opt/hadoop-2.7.7/tmp/dfs/name</value>
        </property>
        <property>
             <name>dfs.datanode.data.dir</name>
             <value>file:/Users/finup/opt/hadoop-2.7.7/tmp/dfs/data</value>
        </property>
</configuration>

dfs.replication specifies how many replicas of each file block HDFS keeps. The value is set to "1" here so that HDFS does not use its default of 3 replicas; otherwise, when running on a single DataNode, HDFS cannot replicate blocks to three DataNodes and keeps issuing under-replicated block warnings.
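As an optional check you can run later, once HDFS has been formatted and started, fsck reports the replication of each block; the path "/" here is just an example:

hdfs fsck / -files -blocks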

(4) Configure the MapReduce JobTracker address and port

Enter the /Users/finup/opt/hadoop-2.7.7/etc/hadoop/ directory and modify the mapred-site.xml.template configuration file:
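Note: in Hadoop 2.x the template is commonly copied to mapred-site.xml first and the copy is edited instead; a minimal sketch of that variant:

cd /Users/finup/opt/hadoop-2.7.7/etc/hadoop
cp mapred-site.xml.template mapred-site.xml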

<configuration>
     <property>
        <name>mapred.job.tracker</name>
        <value>localhost:9001</value>
      </property>
</configuration>

(5) Modify the yarn-site.xml configuration file

Enter the /Users/finup/opt/hadoop-2.7.7/etc/hadoop/ directory and modify the yarn-site.xml configuration file:

<configuration>
           <property>
             <name>yarn.nodemanager.aux-services</name>
             <value>mapreduce_shuffle</value>
            </property>
</configuration>

(6) Initialize the file system

Enter the bin directory under the Hadoop installation path and initialize with the ./hadoop namenode -format command (see the commands below). If initialization succeeds, the output includes a message reporting that the storage directory has been successfully formatted.
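The commands for this step, using the installation path from this tutorial:

cd /Users/finup/opt/hadoop-2.7.7/bin
./hadoop namenode -format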


(7) Configure Hadoop environment variables

       The purpose is to be able to start and stop the Hadoop-related services from any directory, without having to go into /Users/finup/opt/hadoop-2.7.7/sbin to run the start or stop commands. Edit with the command vim ~/.zshrac and add the following lines (note: .zshrac is a file you create yourself, so don't worry if you cannot find it beforehand):

export HADOOP_HOME=/Users/finup/opt/hadoop-2.7.7
export PATH=$PATH:$HADOOP_HOME/sbin:$HADOOP_HOME/bin

Then run the source ~/.zshrac command to make the changes take effect. With that, the Hadoop configuration is complete.
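An optional quick check that the variables are in effect (the paths printed are whatever you configured above):

echo $HADOOP_HOME
which start-dfs.sh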

4. Start Hadoop

(1) Start/stop the Hadoop services

Enter the sbin directory and start the services with the ./start-dfs.sh command, then check the result with jps. On a successful start, jps lists the NameNode, DataNode, and SecondaryNameNode processes.


       Enter http://localhost:50070 in the browser to open the HDFS web UI, where you can view NameNode and DataNode information and browse the files in HDFS online.
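A few basic HDFS commands to try once the daemons are up; the file and directory names here are only illustrative:

hdfs dfs -mkdir -p /user/finup
hdfs dfs -put somefile.txt /user/finup/
hdfs dfs -ls /user/finup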


Use the ./stop-dfs.sh command to stop the Hadoop services.

(2) Start/stop the YARN services

Use the ./start-yarn.sh command to start the YARN services, so that YARN takes charge of resource management and task scheduling. After a successful start, check the processes again with the jps command.


        Compared with starting only the Hadoop (HDFS) services, there are now two additional processes, NodeManager and ResourceManager. Then open http://localhost:8088 in a browser to view task operations through the web interface.


       Use the ./stop-yarn.sh command to stop the YARN services.

(3) Quick startup and shutdown

        In the sbin directory, the ./start-all.sh and ./stop-all.sh commands start and stop the Hadoop and YARN services together, which is much easier than starting and stopping them one by one.
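A sketch of the quick path; the exact process list printed by jps may vary slightly by version:

cd /Users/finup/opt/hadoop-2.7.7/sbin
./start-all.sh
jps    # expect roughly: NameNode, DataNode, SecondaryNameNode, ResourceManager, NodeManager, Jps
./stop-all.sh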

III. Install MySQL

You can refer to this link: https://jingyan.baidu.com/article/fa4125ac0e3c2928ac709204.html

IV. Install Hive

First make sure that Hadoop and MySQL are already installed.

1. Create a hive user in the MySQL database

mysql> create user 'hive' identified by 'hive';

2. Grant all MySQL privileges to the hive user

mysql> grant all on *.* to 'hive'@'localhost' identified by 'hive';

3. Flush privileges so that steps 1 and 2 take effect

mysql> flush privileges;

4. Run a SQL query to check whether the hive user exists

mysql> select host,user,authentication_string from mysql.user;
+-----------+---------------+-------------------------------------------+
| host      | user          | authentication_string                     |
+-----------+---------------+-------------------------------------------+
| localhost | root          | *D391E96D137871ED52CDB352D867D3549815A718 |
| localhost | mysql.session | *THISISNOTAVALIDPASSWORDTHATCANBEUSEDHERE |
| localhost | mysql.sys     | *THISISNOTAVALIDPASSWORDTHATCANBEUSEDHERE |
| %         | hive          | *4DF1D66463C18D44E3B001A8FB1BBFBEA13E27FC |
| localhost | hive          | *4DF1D66463C18D44E3B001A8FB1BBFBEA13E27FC |
+-----------+---------------+-------------------------------------------+

5. Log in to MySQL as the hive user

wudejin:~ oldsix$ mysql -u hive -p
Enter password: hive
mysql> 

6. Create the hive database

mysql> create database hive;

7. Check whether it was created successfully

mysql> show databases;
+--------------------+
| Database           |
+--------------------+
| information_schema |
| hive               |
| mysql              |
| performance_schema |
| sys                |
| test               |
+--------------------+
6 rows in set (0.00 sec)

At this point the preparatory work is complete. Next, we move on to installing Hive itself.

8. Download the Hive package and extract it

Download address: https://mirrors.tuna.tsinghua.edu.cn/apache/hive/

After the download completes, extract it from the command line:

tar -zxvf apache-hive-3.1.1-bin.tar.gz

After extraction, rename the extracted folder:

mv apache-hive-3.1.1-bin hive3.1.1

9. Modify the Hive configuration

Enter the conf directory under the hive3.1.1 directory and modify the hive-site.xml configuration file.

There is no hive-site.xml file in the conf directory by default, so we first make a copy of the template:

cp hive-default.xml.template hive-site.xml

Modify the hive-site.xml file:

-- Modify the database connection driver name (the existing default entry for this property in the config file should be removed first)
<property>
    <name>javax.jdo.option.ConnectionDriverName</name>
    <value>com.mysql.jdbc.Driver</value>
    <description>Driver class name for a JDBC metastore</description>
</property>

-- Modify the database connection URL (the existing default entry for this property in the config file should be removed first)
<property>
    <name>javax.jdo.option.ConnectionURL</name>
    <value>jdbc:mysql://localhost:3306/hive?characterEncoding=UTF-8</value>
    <description>
      JDBC connect string for a JDBC metastore.
      To use SSL to encrypt/authenticate the connection, provide database-specific SSL flag in the connection URL.
      For example, jdbc:postgresql://myhost/db?ssl=true for postgres database.
    </description>
</property>

-- Modify the database connection user name (the existing default entry for this property in the config file should be removed first)
<property>
    <name>javax.jdo.option.ConnectionUserName</name>
    <value>hive</value>
    <description>Username to use against metastore database</description>
</property>

-- Modify the database connection password (the existing default entry for this property in the config file should be removed first)
<property>
    <name>javax.jdo.option.ConnectionPassword</name>
    <value>hive</value>
    <description>password to use against metastore database</description>
</property>

-- Modify the Hive data directories (three places; the existing default entries for these properties in the config file should be removed first)
<property>
    <name>hive.querylog.location</name>
    <value>/Users/finup/opt/hive3.1.1/iotmp</value>
    <description>Location of Hive run time structured log file</description>
</property>
<property>
  <name>hive.exec.local.scratchdir</name>
  <value>/Users/finup/opt/hive3.1.1/iotmp</value>
  <description>Local scratch space for Hive jobs</description>
</property>
<property>
    <name>hive.downloaded.resources.dir</name>
    <value>/Users/finup/opt/hive3.1.1/iotmp</value>
    <description>Temporary local directory for added resources in the remote file system.</description>
</property>

-- Optionally show column headers in query output (the existing default entry for this property in the config file should be removed first)
<property>
    <name>hive.cli.print.header</name>
    <value>true</value>
    <description>Whether to print the names of the columns in query output.</description>
  </property>

10. Configure Hive environment variables

cd ~
sudo vi .bash_profile

Set HIVE_HOME and add it to PATH:

export HIVE_HOME=/Users/finup/opt/hive3.1.1
export PATH=$PATH:$HIVE_HOME/bin

Save and exit, then make the environment variables take effect:

source .bash_profile
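An optional quick check (the paths and version printed depend on your installation):

echo $HIVE_HOME
hive --version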

11. Put the JDBC driver for your database into Hive's lib directory

Download mysql-connector-java-8.0.16.jar and place it in Hive's lib directory.

12. Initialize the metastore database: schematool -dbType mysql -initSchema
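With $HIVE_HOME/bin on the PATH this can be run from any directory; the -info option can then serve as a rough sanity check of the metastore schema:

schematool -dbType mysql -initSchema
schematool -dbType mysql -info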

13. Enter the Hadoop installation directory and start Hadoop

./sbin/start-all.sh

14. Start Hive

Enter Hive's bin directory and run the command: ./hive
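A minimal smoke test once Hive is working; hive -e runs a single statement non-interactively, and the statement here is just an example:

hive -e "show databases;"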

15. Exit Hive

Run exit; at the Hive prompt:

hive (zcfw_sda)> exit;

Origin www.cnblogs.com/dcx-1993/p/11122396.html