Three-node distributed setup
1. Host network configuration
2. Create the hadoop user and set up passwordless SSH trust (all three nodes)
3. Install the JDK (root user, all three nodes)
4. Edit the configuration files
5. Distribute to the other nodes
Three-node Hadoop distributed cluster setup
Hadoop installation guide
CentOS system configuration
Disable the firewall (CentOS 7)
systemctl stop firewalld.service
systemctl disable firewalld.service
Disable SELinux
setenforce 0
vim /etc/selinux/config
SELINUX=disabled
Configure the hostname (CentOS 7)
Temporarily change the hostname
hostname master
Permanently change the hostname
vim /etc/hostname
master
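The later steps address the other machines as slave1 and slave2 by name, so all three hostnames must resolve on every node. A minimal /etc/hosts sketch; only master's 192.168.31.200 appears in this guide, the slave IPs below are placeholders:

```shell
# Append on all three nodes; the slave IPs are assumptions, adjust to your network.
cat >> /etc/hosts <<'EOF'
192.168.31.200 master
192.168.31.201 slave1
192.168.31.202 slave2
EOF
```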
Edit the network interface config (/etc/sysconfig/network-scripts/ifcfg-ens33)
TYPE="Ethernet"
BOOTPROTO="static"
NAME="ens33"
DEVICE="ens33"
IPADDR="192.168.31.200"
ONBOOT="yes"
Restart the network
systemctl restart network.service
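A quick sanity check that the static address is active (interface name and IP as configured above):

```shell
ip addr show ens33        # should list inet 192.168.31.200
ping -c 1 192.168.31.200  # confirm the address answers
```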
Show the absolute path in the shell prompt
export PS1="[\u@\h \w]$"
YUM configuration (all three nodes)
cd /etc/yum.repos.d/
rm -rf *    # removes all existing repo definitions; back them up first if needed
Create a new repo file (e.g. vim local.repo) pointing at the mounted media:
[base]
name=Base
enabled=1
baseurl=file:///media
gpgcheck=0
gpgkey=file:///etc/pki/rpm-gpg/RPM-GPG-KEY-CentOS-7
Mount the yum source (the installation ISO/DVD)
mount /dev/sr0 /media
# Refresh the yum cache
yum makecache
yum repolist
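The mount does not survive a reboot; if the local ISO repo should persist, an /etc/fstab entry can be added (a sketch, device name taken from the mount command above):

```
/dev/sr0  /media  iso9660  defaults,ro  0 0
```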
Java environment configuration (all three nodes, root)
Hadoop currently requires JDK 8 or later, and it must be the official Oracle JDK; OpenJDK cannot be used here.
tar -xvf jdk-8u144-linux-x64.tar
cp -rf jdk1.8.0_144 /usr/local/jdk   # copy the extracted directory, not the tarball
yum -y remove java                   # remove any preinstalled OpenJDK
vim /etc/profile
export JAVA_HOME=/usr/local/jdk
export PATH=$PATH:$JAVA_HOME/bin
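The intent of the profile lines above can be checked in a plain shell, independent of whether the JDK is actually installed:

```shell
# Sketch: simulate the /etc/profile additions and verify the JDK bin dir lands on PATH.
export JAVA_HOME=/usr/local/jdk
export PATH="$PATH:$JAVA_HOME/bin"
case ":$PATH:" in
  *":/usr/local/jdk/bin:"*) echo "JAVA_HOME/bin is on PATH" ;;
  *) echo "missing" ;;
esac
```

After `source /etc/profile`, `java -version` should report the Oracle JDK build.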
scp -r /usr/local/jdk slave1:/usr/local/jdk
Copies files and directories (-r) to the remote host over ssh
Create the hadoop user and directories (all three nodes, root)
Create the hadoop user
useradd hadoop
passwd hadoop
Create the hadoop install directory
mkdir /home/hadoop/install
chown hadoop:hadoop /home/hadoop/install   # created as root, so hand it over to hadoop
Configure SSH trust (hadoop user)
su - hadoop
Generate a key pair on all three nodes
ssh-keygen
Enter the .ssh directory on all three nodes
cd /home/hadoop/.ssh
# On slave2, copy its public key to the master:
scp id_rsa.pub 192.168.2.99:~/.ssh/id_rsa.pub2
# On slave1, copy its public key to the master:
scp id_rsa.pub 192.168.2.99:~/.ssh/id_rsa.pub1
On the master node, merge the three keys:
cat id_rsa.pub1 id_rsa.pub id_rsa.pub2 >authorized_keys
Distribute authorized_keys to the other nodes
Set permissions on authorized_keys first
chmod 600 authorized_keys
scp authorized_keys slave1:~/.ssh/
scp authorized_keys slave2:~/.ssh/
Test that passwordless trust works
ssh slave1 date
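To check trust in every direction at once, a loop like the following can be run from each node; with working trust every ssh prints the date, and BatchMode makes a broken key fail immediately instead of falling back to a password prompt:

```shell
for h in master slave1 slave2; do
  ssh -o BatchMode=yes "$h" date
done
```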
Edit the Hadoop configuration files (hadoop user)
~/install/hadoop-2.7.4/etc/hadoop/
core-site.xml
hdfs-site.xml
yarn-site.xml
mapred-site.xml
slaves
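The document lists core-site.xml but does not show its contents. A minimal sketch consistent with the rest of this setup; fs.defaultFS on port 9000 matches the "HDFS client" port listed at the end, and hadoop.tmp.dir matches the /data/hadoop/tmp directory created below, but both values are assumptions:

```xml
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://master:9000</value>
  </property>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/data/hadoop/tmp</value>
  </property>
</configuration>
```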
vim hdfs-site.xml
<configuration>
  <property>
    <name>dfs.replication</name>
    <value>2</value>
  </property>
  <property>
    <name>dfs.namenode.edits.dir</name>
    <value>/data/hadoop/namenode/name</value>
  </property>
  <property>
    <name>dfs.datanode.data.dir</name>
    <value>/data/hadoop/datanode/data</value>
  </property>
  <property>
    <name>dfs.namenode.checkpoint.dir</name>
    <value>/data/hadoop/namenode/namesecondary</value>
  </property>
</configuration>

vim mapred-site.xml

<configuration>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
  <property>
    <name>mapreduce.jobhistory.webapp.address</name>
    <value>master:19888</value>
  </property>
</configuration>

vim yarn-site.xml

<configuration>
  <property>
    <name>yarn.resourcemanager.hostname</name>
    <value>master</value>
  </property>
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
  <property>
    <name>yarn.resourcemanager.webapp.address</name>
    <value>master:8088</value>
  </property>
  <property>
    <name>yarn.nodemanager.resource.memory-mb</name>
    <value>3000</value>
  </property>
  <property>
    <name>yarn.nodemanager.resource.cpu-vcores</name>
    <value>4</value>
  </property>
  <property>
    <name>yarn.scheduler.maximum-allocation-mb</name>
    <value>3000</value>
  </property>
  <property>
    <name>yarn.scheduler.minimum-allocation-mb</name>
    <value>500</value>
  </property>
  <property>
    <name>yarn.scheduler.maximum-allocation-vcores</name>
    <value>4</value>
  </property>
</configuration>

The slaves file lists the worker hosts: start-dfs.sh and start-yarn.sh read it to start the datanode and nodemanager daemons on each listed node.
vim slaves
master
slave1
slave2
Distribute to the other nodes (hadoop user)
On slave1 and slave2, make sure the install directory exists first:
mkdir -p ~/install
scp -r ~/install/hadoop-2.7.4/ slave1:~/install/hadoop-2.7.4/
scp -r ~/install/hadoop-2.7.4/ slave2:~/install/hadoop-2.7.4/
Create the data directories (all three nodes, root user)
These directories usually sit on separately mounted storage.
mkdir -p /data/hadoop/tmp
mkdir -p /data/hadoop/namenode/name
mkdir -p /data/hadoop/datanode/data
mkdir -p /data/hadoop/namenode/namesecondary
chown hadoop:hadoop -R /data/hadoop
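The four mkdir/chown commands above must run on all three nodes. If root ssh between the nodes is available (an assumption; this guide only sets up trust for the hadoop user), one loop from master covers them all; otherwise run the block on each node by hand:

```shell
for h in master slave1 slave2; do
  ssh "$h" 'mkdir -p /data/hadoop/{tmp,namenode/name,datanode/data,namenode/namesecondary} \
            && chown -R hadoop:hadoop /data/hadoop'
done
```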
Edit the Hadoop environment variables
vim ~/.bash_profile
Append:
export HADOOP_HOME=/home/hadoop/install/hadoop-2.7.4
export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin
source ~/.bash_profile
Distribute the configured environment (~/.bash_profile) to the other nodes as well
Format and start HDFS (hadoop user)
hdfs namenode -format
On success, the log shows:
17/07/30 15:05:20 INFO common.Storage: Storage directory /data/hadoop/tmp/dfs/name has been successfully formatted.
17/07/30 15:05:20 INFO common.Storage: Storage directory /data/hadoop/namenode/name has been successfully formatted.
All start/stop scripts live under /home/hadoop/install/hadoop-2.7.4/sbin
Start HDFS (namenode and datanode):
start-dfs.sh
Alternatively, start individual daemons on a given node:
hadoop-daemon.sh start namenode
hadoop-daemon.sh start datanode
hadoop-daemon.sh start secondarynamenode
Start the YARN framework (resourcemanager and nodemanager):
start-yarn.sh
Alternatively, on a given node:
yarn-daemon.sh start resourcemanager
yarn-daemon.sh start nodemanager
Start the history server (JobHistoryServer)
mr-jobhistory-daemon.sh start historyserver
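After the daemons are up, jps on each node shows which roles it is running; on this layout master carries the master daemons, and all three nodes run DataNode and NodeManager, since master is also listed in the slaves file:

```shell
jps
# Expected on master: NameNode, SecondaryNameNode, ResourceManager,
#                     JobHistoryServer, DataNode, NodeManager
# Expected on slave1/slave2: DataNode, NodeManager
```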
Hadoop web UIs
HDFS web UI (after startup):
http://192.168.1.30:50070/
YARN web UI:
http://192.168.1.30:8088
History server web UI:
http://192.168.1.31:19888/jobhistory
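Reachability of the three UIs can also be checked from the command line (URLs as listed above; adjust to your nodes). An HTTP 200 means the service is up and serving:

```shell
for url in http://192.168.1.30:50070/ http://192.168.1.30:8088 http://192.168.1.31:19888/jobhistory; do
  curl -s -o /dev/null -w "%{http_code}  $url\n" "$url"
done
```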
Hadoop log directory
/home/hadoop/install/hadoop-2.7.4/logs
Hadoop service ports
50070 //namenode http port
50075 //datanode http port
50090 //secondarynamenode http port
8020 //namenode rpc port
50010 //datanode rpc port
9000 //HDFS client
8032 //resourcemanager client