Installing and Deploying an Apache HAWQ Cluster on CentOS 7.3

I. Preparation

1. Prepare three physical machines: master (192.168.251.8), dataserver1 (192.168.251.9), and dataserver2 (192.168.251.10).
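
The host names above are used later in hawq-site.xml and in the hawq ssh-exkeys command, so every node has to be able to resolve them. A minimal sketch, assuming name resolution is done through /etc/hosts rather than DNS (the source does not say which); hawq_hosts_entries is a hypothetical helper name:

```shell
#!/usr/bin/env bash
# Print /etc/hosts entries for the three nodes listed above.
hawq_hosts_entries() {
  cat <<'EOF'
192.168.251.8   master
192.168.251.9   dataserver1
192.168.251.10  dataserver2
EOF
}

# Review the output first, then append it as root on every node, e.g.:
#   hawq_hosts_entries >> /etc/hosts
hawq_hosts_entries
```

Printing the lines instead of writing them directly keeps the sketch safe to run more than once.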

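2. The latest version at the time of writing is 2.4.0.

Official site and downloads: http://hawq.apache.org/

Apache's official documentation for building from source and installing: https://cwiki.apache.org/confluence/display/HAWQ/Build+and+Install

3. This guide installs the prebuilt package published on the official site. Download it from http://apache.org/dyn/closer.cgi/hawq/2.4.0.0/apache-hawq-rpm-2.4.0.0.tar.gz and copy it to each server.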

4. Disable the firewall

Stop the firewall: systemctl stop firewalld
Keep the firewall from starting at boot: systemctl disable firewalld
Check the firewall status: systemctl status firewalld

5. HAWQ runs on top of Hadoop, so make sure a Hadoop cluster is already installed before installing HAWQ.

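II. Dependencies and Initial Configuration

     Make sure network connectivity is working before you begin.

1. Install the dependencies by running the following commands in order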

wget https://dl.fedoraproject.org/pub/epel/epel-release-latest-7.noarch.rpm
# For CentOS 7 the link is https://dl.fedoraproject.org/pub/epel/7/x86_64/e/epel-release-7-9.noarch.rpm
rpm -ivh epel-release-latest-7.noarch.rpm
yum makecache
# On Red Hat 7, make sure the rhel-7-server-extras-rpms and rhel-7-server-optional-rpms channels are enabled in /etc/yum.repos.d/redhat.repo
# Otherwise yum will report that some packages (e.g. gperf) cannot be found
yum install -y man passwd sudo tar which git mlocate links make bzip2 net-tools \
  autoconf automake libtool m4 gcc gcc-c++ gdb bison flex gperf maven indent \
  libuuid-devel krb5-devel libgsasl-devel expat-devel libxml2-devel \
  perl-ExtUtils-Embed pam-devel python-devel libcurl-devel snappy-devel \
  thrift-devel libyaml-devel libevent-devel bzip2-devel openssl-devel \
  openldap-devel protobuf-devel readline-devel net-snmp-devel apr-devel \
  libesmtp-devel python-pip json-c-devel \
  java-1.7.0-openjdk-devel lcov cmake3 \
  openssh-clients openssh-server perl-JSON perl-Env
 
# need tomcat6 if enable-rps
# download from http://archive.apache.org/dist/tomcat/tomcat-6/v6.0.44/
 
ln -s /usr/bin/cmake3 /usr/bin/cmake
pip --retries=50 --timeout=300 install pycrypto
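
After the commands above finish, it can help to confirm that the main tools actually landed on the PATH before moving on. A small sketch; missing_cmds is a hypothetical helper, and the list of tools checked is only a sample chosen for illustration, not a complete check of every package:

```shell
#!/usr/bin/env bash
# Print every command from the argument list that is not on the PATH.
missing_cmds() {
  for c in "$@"; do
    command -v "$c" >/dev/null 2>&1 || echo "$c"
  done
  return 0
}

# A few of the tools the yum and pip steps above are expected to provide.
missing=$(missing_cmds gcc make cmake mvn git pip)
if [ -n "$missing" ]; then
  echo "still missing:" $missing
else
  echo "all checked tools found"
fi
```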

2. Adjust the kernel parameters: open /etc/sysctl.conf (vim /etc/sysctl.conf) and add the following settings

     kernel.shmmax = 1000000000
     kernel.shmmni = 4096
     kernel.shmall = 4000000000
     kernel.sem = 250 512000 100 2048
     kernel.sysrq = 1
     kernel.core_uses_pid = 1
     kernel.msgmnb = 65536
     kernel.msgmax = 65536
     kernel.msgmni = 2048
     net.ipv4.tcp_syncookies = 0
     net.ipv4.conf.default.accept_source_route = 0
     net.ipv4.tcp_tw_recycle = 1
     net.ipv4.tcp_max_syn_backlog = 200000
     net.ipv4.conf.all.arp_filter = 1
     net.ipv4.ip_local_port_range = 1281 65535
     net.core.netdev_max_backlog = 200000
     vm.overcommit_memory = 2
     fs.nr_open = 3000000
     kernel.threads-max = 798720
     kernel.pid_max = 798720
     # Larger network buffers
     net.core.rmem_max = 2097152
     net.core.wmem_max = 2097152

Run the following command to apply the updated /etc/sysctl.conf to the running system:

sysctl -p
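
To confirm that the kernel accepted the values, the file can be compared with what sysctl reports. A minimal sketch, assuming each setting sits on its own "key = value" line as in the listing above; sysctl_pairs and check_sysctl are hypothetical helper names:

```shell
#!/usr/bin/env bash
# Print "key value" pairs from a sysctl.conf-style file,
# skipping comments and blank lines and squeezing whitespace.
sysctl_pairs() {
  sed -e 's/#.*//' -e '/^[[:space:]]*$/d' "$1" |
    awk -F= '{
      key = $1; val = $2
      gsub(/^[ \t]+|[ \t]+$/, "", key)
      gsub(/^[ \t]+|[ \t]+$/, "", val)
      gsub(/[ \t]+/, " ", val)
      print key, val
    }'
}

# Compare each wanted value with what the running kernel reports.
check_sysctl() {
  sysctl_pairs "$1" | while read -r key want; do
    have=$(sysctl -n "$key" 2>/dev/null | tr -s '[:space:]' ' ' | sed 's/ $//')
    [ "$have" = "$want" ] || echo "MISMATCH $key: want '$want', have '$have'"
  done
}

# Usage (on each node): check_sysctl /etc/sysctl.conf
```

Whitespace is normalized on both sides because multi-value settings such as kernel.sem are printed by sysctl with tabs between the numbers.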

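3. Edit /etc/security/limits.conf with a text editor

# Add the following definitions in exactly the order listed
# (make sure fs.nr_open = 3000000 has been applied before editing limits.conf, otherwise you may be unable to ssh into your instance)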
 * soft nofile 2900000
 * hard nofile 2900000
 * soft nproc 131072
 * hard nproc 131072
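
The warning above comes from the fact that the kernel rejects a nofile limit larger than fs.nr_open, and a limit that cannot be applied may cause new login sessions to fail. A small pre-check, sketched under the assumption that /proc/sys/fs/nr_open is readable; nofile_is_safe is a hypothetical helper name:

```shell
#!/usr/bin/env bash
# Succeed when the nofile limit ($1) does not exceed fs.nr_open ($2).
nofile_is_safe() {
  [ "$1" -le "$2" ]
}

if [ -r /proc/sys/fs/nr_open ]; then
  nr_open=$(cat /proc/sys/fs/nr_open)
  if nofile_is_safe 2900000 "$nr_open"; then
    echo "ok: nofile 2900000 <= fs.nr_open $nr_open"
  else
    echo "stop: fs.nr_open is $nr_open; run sysctl -p before editing limits.conf"
  fi
fi
```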

4. Create the gpadmin user (HAWQ cannot be run as root)

useradd -m gpadmin -G root -s /bin/bash

passwd gpadmin

5. Grant administrator privileges

Run visudo and, in the file it opens, find the line root ALL=(ALL) ALL

Add the line gpadmin  ALL=(ALL) ALL beneath it, then save and exit

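6. Configure passwordless SSH login for the gpadmin user

1) On each machine, log in as gpadmin and, from that user's home directory, run ssh-keygen -t rsa to generate the machine's key pair;

2) Merge the public keys of all machines in the cluster into a single authorized_keys file;

3) Copy the authorized_keys file into the ~/.ssh/ directory on every node in the cluster;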

4) Set the permissions on authorized_keys so that only its owner can read and write it:

chmod 600 authorized_keys
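
The four steps above can also be approximated with ssh-copy-id, which appends the local public key to the remote authorized_keys file. This is a swapped-in alternative to merging the files by hand, not the procedure the steps describe, and it has to be run as gpadmin on every node so that each machine can reach the others. The run wrapper, the RUN variable, and setup_keys are hypothetical names; by default the sketch only prints the commands:

```shell
#!/usr/bin/env bash
HOSTS="master dataserver1 dataserver2"

# Print the command by default; set RUN=1 to actually execute it.
run() {
  if [ "${RUN:-0}" = "1" ]; then "$@"; else echo "+ $*"; fi
}

setup_keys() {
  # Create a key pair only if none exists yet.
  [ -f "$HOME/.ssh/id_rsa" ] || run ssh-keygen -t rsa -N "" -f "$HOME/.ssh/id_rsa"
  # Push the public key to every node, including this one.
  for h in $HOSTS; do
    run ssh-copy-id "gpadmin@$h"
  done
}

setup_keys
```

Each ssh-copy-id call asks for the gpadmin password once; afterwards ssh gpadmin@dataserver1 should log in without a prompt.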

III. Installing HAWQ

1. Extract the HAWQ archive into the target directory /usr/local

tar -zxvf apache-hawq-rpm-2.4.0.0.tar.gz  -C /usr/local/

2. Change into the extracted directory and install HAWQ with rpm

cd /usr/local/hawq_rpm_packages/
rpm -ivh apache-hawq-2.4.0.0-el7.x86_64.rpm

3. Change the owner and group of the installation directory to gpadmin

chown -hR gpadmin /usr/local/apache-hawq/
chgrp -hR gpadmin /usr/local/apache-hawq/

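4. Under /usr/local/apache-hawq, create the hawq-data-directory folder and, inside it, the masterdd and segmentdd folders (run this as gpadmin, or repeat the chown and chgrp from step 3 afterwards, so that gpadmin owns them)

mkdir -p /usr/local/apache-hawq/hawq-data-directory/masterdd
mkdir -p /usr/local/apache-hawq/hawq-data-directory/segmentdd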

5. Edit the hawq-site.xml file under /usr/local/apache-hawq/etc (the main settings are shown below; leave everything else at its default)

<configuration>
        <property>
                <name>hawq_master_address_host</name>
                <value>master</value> 
                <description>The host name of hawq master.</description>
        </property>

        <property>
                <name>hawq_master_address_port</name>
                <value>5432</value>
                <description>The port of hawq master.</description>
        </property>

        <property>
                <name>hawq_standby_address_host</name>
                <value>none</value>
                <description>The host name of hawq standby master.</description>
        </property>

        <property>
                <name>hawq_segment_address_port</name>
                <value>40000</value>
                <description>The port of hawq segment.</description>
        </property>

        <property>
                <name>hawq_dfs_url</name>
                <value>master:9000/hawq_default</value> <!-- host and port must match the ones used by HDFS -->
                <description>URL for accessing HDFS.</description>
        </property>

        <property>
                <name>hawq_master_directory</name>
                <value>/usr/local/apache-hawq/hawq-data-directory/masterdd</value>
                <description>The directory of hawq master.</description>
        </property>

        <property>
                <name>hawq_segment_directory</name>
                <value>/usr/local/apache-hawq/hawq-data-directory/segmentdd</value>
                <description>The directory of hawq segment.</description>
        </property>
        <property>
                <name>hawq_master_temp_directory</name>
                <value>/usr/local/apache-hawq/tmp</value>
                <description>The temporary directory reserved for hawq master.</description>
        </property>

        <property>
                <name>hawq_segment_temp_directory</name>
                <value>/usr/local/apache-hawq/tmp</value>
                <description>The temporary directory reserved for hawq segment.</description>
        </property>

       

        <property>
                <name>hawq_rm_yarn_address</name>
                <value>master:8032</value>
                <description>The address of YARN resource manager server.</description>
        </property>

        <property>
                <name>hawq_rm_yarn_scheduler_address</name>
                <value>master:8030</value>
                <description>The address of YARN scheduler server.</description>
        </property>

        <property>
                <name>hawq_rm_yarn_queue_name</name>
                <value>default</value>
                <description>The YARN queue name to register hawq resource manager.</description>
        </property>

     
        <property>
                <name>hawq_rps_address_port</name>
                <value>8432</value>
                <description>The port number of Ranger Plugin Service. HAWQ RPS address is
                     http://$rps_host(hawq_master_address_host or hawq_standby_address_host):$hawq_rps_address_port/rps
                     For example, http://localhost:8432/rps
        </description>
        </property>

        <property>
                <name>default_hash_table_bucket_number</name>
                <value>12</value>
        </property>

</configuration>

Note: install the master and the standby on the Hadoop NameNode and SecondaryNameNode hosts, and the segments (segmentdd) on the servers that run DataNodes.
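
Before initializing the cluster it can be worth reading the directory settings back out of hawq-site.xml and checking that each one is an absolute path. A minimal sketch, assuming each name and value element sits on its own line as in the listing above; site_value and check_site_dirs are hypothetical helpers, not HAWQ tools:

```shell
#!/usr/bin/env bash
# Print the <value> that follows <name>NAME</name> in a Hadoop-style XML file.
site_value() {
  awk -v name="$2" '
    index($0, "<name>" name "</name>") { found = 1; next }
    found && /<value>/ {
      sub(/.*<value>/, ""); sub(/<\/value>.*/, "")
      print; exit
    }' "$1"
}

# Report directory properties whose value is not an absolute path.
check_site_dirs() {
  for p in hawq_master_directory hawq_segment_directory \
           hawq_master_temp_directory hawq_segment_temp_directory; do
    v=$(site_value "$1" "$p")
    case "$v" in
      /*) ;;
      *)  echo "not absolute: $p = '$v'" ;;
    esac
  done
}

# Usage: check_site_dirs /usr/local/apache-hawq/etc/hawq-site.xml
```

The line-based parsing is deliberately simple; a file that puts name and value on the same line would need a real XML parser.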

5. Configure passwordless login for the gpadmin user

cd /usr/local/apache-hawq   # change into the HAWQ directory
source greenplum_path.sh
cd bin
./hawq ssh-exkeys -h master -h dataserver1 -h dataserver2

6. Switch to the hadoop user, create the directory HAWQ needs in HDFS, and change its owner

su hadoop

hadoop dfs -mkdir /hawq_default

hadoop dfs -chown gpadmin:gpadmin /hawq_default

7. Initialize HAWQ

cd /usr/local/apache-hawq/bin

./hawq init cluster

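After initialization completes, HAWQ is already running by default.

8. Starting and stopping HAWQ

    Make sure the Hadoop services are running before you start HAWQ.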

./hawq start cluster

./hawq stop cluster
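
After a start, a quick smoke test can confirm that the master answers queries. A sketch, assuming greenplum_path.sh has been sourced so that hawq and psql are on the PATH, and that the default postgres database is reachable on the master port configured above; smoke_test is a hypothetical helper name:

```shell
#!/usr/bin/env bash
# Report cluster state and run a trivial query against the master.
smoke_test() {
  if ! command -v hawq >/dev/null 2>&1; then
    echo "hawq not on PATH; source /usr/local/apache-hawq/greenplum_path.sh first"
    return 1
  fi
  hawq state || return 1                    # summary of master and segment status
  psql -d postgres -c 'SELECT version();'   # fails if the master is not accepting connections
}

# Usage, as gpadmin on the master: smoke_test
```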

9. Add the following line to the pg_hba.conf file

host all gpadmin 192.168.251.1/24 trust

This allows remote access (for example, with the Navicat tool).
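
The rule can be added idempotently so that rerunning the setup does not duplicate it. A minimal sketch; add_hba_rule is a hypothetical helper, the pg_hba.conf location is assumed to be the master data directory configured in hawq-site.xml, and the -u reload option of hawq stop is an assumption to verify against your version (stopping and starting the cluster also picks up the change):

```shell
#!/usr/bin/env bash
# Append a rule to a pg_hba.conf-style file unless the exact line is already present.
add_hba_rule() {
  grep -qxF -- "$2" "$1" 2>/dev/null || printf '%s\n' "$2" >> "$1"
}

# Usage on the master, as gpadmin:
#   add_hba_rule /usr/local/apache-hawq/hawq-data-directory/masterdd/pg_hba.conf \
#     "host all gpadmin 192.168.251.1/24 trust"
#   hawq stop cluster -u    # assumed reload option; a full restart also works
```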

Official documentation: http://hawq.apache.org/docs/userguide/2.3.0.0-incubating/tutorial/overview.html


Reposted from blog.csdn.net/xuexi_39/article/details/83145950