Setting up Hadoop 2.6.0 HDFS HA and YARN HA

Final result:
[hadoop@h41 ~]$ jps
12723 ResourceManager
12995 Jps
12513 NameNode
12605 DFSZKFailoverController

[hadoop@h42 ~]$ jps
12137 ResourceManager
12233 Jps
12009 DFSZKFailoverController
11930 NameNode

[hadoop@h43 ~]$ jps
12196 DataNode
12322 NodeManager
12435 Jps
11965 QuorumPeerMain
12050 JournalNode

[hadoop@h44 ~]$ jps
11848 QuorumPeerMain
11939 JournalNode
12309 Jps
12156 NodeManager
12032 DataNode

[hadoop@h45 ~]$ jps
12357 Jps
11989 JournalNode
11904 QuorumPeerMain
12204 NodeManager
12080 DataNode

Role assignment:

h41   NameNode   DFSZKFailoverController   ResourceManager
h42   NameNode   DFSZKFailoverController   ResourceManager
h43   NodeManager   JournalNode   QuorumPeerMain   DataNode
h44   NodeManager   JournalNode   QuorumPeerMain   DataNode
h45   NodeManager   JournalNode   QuorumPeerMain   DataNode


Note: In Hadoop 2.x, HDFS HA is normally built from two NameNodes, one in the active state and one in standby. The active NameNode serves client requests; the standby NameNode does not, and only synchronizes the active NameNode's state so that it can take over quickly if the active one fails.
Hadoop 2.0 officially offers two HDFS HA solutions: NFS and QJM (proposed by Cloudera, similar in principle to ZooKeeper). I use QJM here. The active and standby NameNodes synchronize their metadata through a group of JournalNodes; an edit is considered committed once it has been written to a majority of the JournalNodes. An odd number of JournalNodes is usually configured; with 3 JournalNodes a write needs 2 acknowledgements, so the cluster can tolerate the loss of 1 JournalNode.

Part One: Prepare the environment
Disable the firewall and SELinux (on all VMs):
service iptables stop
chkconfig iptables off   (disable start on boot)

setenforce 0
vi /etc/selinux/config
SELINUX=disabled
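
A quick optional check that the change took effect (after setenforce 0, getenforce should report Permissive; it only shows Disabled after a reboot):
getenforce
service iptables status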

Configure the hostname and hosts file (on all VMs):
vi /etc/sysconfig/network
Change HOSTNAME to h41 through h45, one per machine.
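
For example, on h41 the file would look roughly like this (hypothetical content for this host; the other VMs use h42 through h45 accordingly):
NETWORKING=yes
HOSTNAME=h41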

vi /etc/hosts   (the previous step did not seem strictly necessary, but this one is required; on every VM you can delete the original contents and add the following)
192.168.8.41    h41
192.168.8.42    h42
192.168.8.43    h43
192.168.8.44    h44
192.168.8.45    h45

Synchronize the time on all machines:
ntpdate 202.120.2.101   (this did not work for me; ntpdate apparently has to be installed with yum first and the machine needs internet access; see https://my.oschina.net/myaniu/blog/182959, http://www.cnblogs.com/liuyou/archive/2012/07/29/2614330.html and http://blog.csdn.net/lixianlin/article/details/7045321)
Here I used the crudest method instead: on every VM, as root, run date -s "2017-05-05 12:00:00" and then reboot the VM.
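
If the VMs do have internet access, the NTP route is roughly the following (a sketch I did not verify in this environment; run as root on every VM, and any reachable NTP server will do):
yum install -y ntpdate
ntpdate pool.ntp.org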

Create the hadoop user and group (on all VMs):
groupadd hadoop
useradd -g hadoop hadoop
passwd hadoop
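
To confirm the account was created (optional check; it should show the hadoop group):
id hadoop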

Switch to the hadoop user (on all VMs):
su - hadoop

Set up key-based passwordless SSH login [do this on every VM]
h41:
ssh-keygen -t rsa -P ''
cat $HOME/.ssh/id_rsa.pub >> $HOME/.ssh/authorized_keys
chmod 700 ~/.ssh/
chmod 600 ~/.ssh/authorized_keys
ssh-copy-id -i $HOME/.ssh/id_rsa.pub hadoop@h42
ssh-copy-id -i $HOME/.ssh/id_rsa.pub hadoop@h43
ssh-copy-id -i $HOME/.ssh/id_rsa.pub hadoop@h44
ssh-copy-id -i $HOME/.ssh/id_rsa.pub hadoop@h45

h42:
ssh-keygen -t rsa -P ''
cat $HOME/.ssh/id_rsa.pub >> $HOME/.ssh/authorized_keys
chmod 700 ~/.ssh/
chmod 600 ~/.ssh/authorized_keys
ssh-copy-id -i $HOME/.ssh/id_rsa.pub hadoop@h41
ssh-copy-id -i $HOME/.ssh/id_rsa.pub hadoop@h43
ssh-copy-id -i $HOME/.ssh/id_rsa.pub hadoop@h44
ssh-copy-id -i $HOME/.ssh/id_rsa.pub hadoop@h45
......... (repeat the steps above on h43, h44 and h45)
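
If you would rather not repeat the block by hand on every host, a loop like this does the same thing (a sketch; it assumes ssh-copy-id is available and will prompt for the hadoop password of each host, including the local one):
for host in h41 h42 h43 h44 h45; do
  ssh-copy-id -i $HOME/.ssh/id_rsa.pub hadoop@$host
done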

Verify: ssh 'hadoop@h42'   (each of h42 through h45 should likewise be able to log in to the other VMs without a password)

Create the storage directories:
mkdir -pv /home/hadoop/storage/hadoop/tmp
mkdir -pv /home/hadoop/storage/hadoop/name
mkdir -pv /home/hadoop/storage/hadoop/data
mkdir -pv /home/hadoop/storage/hadoop/journal
mkdir -pv /home/hadoop/storage/yarn/local
mkdir -pv /home/hadoop/storage/yarn/logs
mkdir -pv /home/hadoop/storage/hbase
mkdir -pv /home/hadoop/storage/zookeeper/data
mkdir -pv /home/hadoop/storage/zookeeper/logs

scp -r /home/hadoop/storage h42:/home/hadoop/
scp -r /home/hadoop/storage h43:/home/hadoop/
scp -r /home/hadoop/storage h44:/home/hadoop/
scp -r /home/hadoop/storage h45:/home/hadoop/

Install JDK 1.7 and Hadoop and set the environment variables. You can set them globally (edit /etc/profile) or per user (edit ~/.bashrc); here I set them for the current user (I also added the HBase and Hive variables even though they are not needed yet).

Switch to the root user on h41.
Install the JDK:
[root@h41 usr]# tar -zxvf jdk-7u25-linux-i586.tar.gz
[root@h41 usr]# scp -r /usr/jdk1.7.0_25/ h42:/usr/   (these steps ask for the root password of each target machine)
[root@h41 usr]# scp -r /usr/jdk1.7.0_25/ h43:/usr/
[root@h41 usr]# scp -r /usr/jdk1.7.0_25/ h44:/usr/
[root@h41 usr]# scp -r /usr/jdk1.7.0_25/ h45:/usr/

Switch back to the hadoop user on h41:
vi ~/.bashrc

export JAVA_HOME=/usr/jdk1.7.0_25
export JRE_HOME=${JAVA_HOME}/jre
export CLASSPATH=.:${JAVA_HOME}/lib:${JRE_HOME}/lib
export PATH=${JAVA_HOME}/bin:$PATH
##java
export HADOOP_HOME=/home/hadoop/hadoop
export HIVE_HOME=/home/hadoop/hive
export HBASE_HOME=/home/hadoop/hbase
##hadoop hbase hive
export HADOOP_MAPRED_HOME=${HADOOP_HOME}
export HADOOP_COMMON_HOME=${HADOOP_HOME}
export HADOOP_HDFS_HOME=${HADOOP_HOME}
export YARN_HOME=${HADOOP_HOME}
export HADOOP_YARN_HOME=${HADOOP_HOME}
export HADOOP_CONF_DIR=${HADOOP_HOME}/etc/hadoop
export HDFS_CONF_DIR=${HADOOP_HOME}/etc/hadoop
export YARN_CONF_DIR=${HADOOP_HOME}/etc/hadoop
export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin:$HBASE_HOME/bin:$HIVE_HOME/bin

scp ~/.bashrc h42:~/.bashrc
scp ~/.bashrc h43:~/.bashrc
scp ~/.bashrc h44:~/.bashrc
scp ~/.bashrc h45:~/.bashrc

Make the environment variables take effect, and do this on every VM; otherwise the jps command will not work, which ultimately keeps ZooKeeper from starting successfully.
[hadoop@h41 ~]$ source ~/.bashrc
[hadoop@h42 ~]$ source ~/.bashrc
[hadoop@h43 ~]$ source ~/.bashrc
[hadoop@h44 ~]$ source ~/.bashrc
[hadoop@h45 ~]$ source ~/.bashrc
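
A quick optional check that the variables are in effect (the paths shown should match the ones configured above):
which java
echo $HADOOP_HOME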

Part Two: Deploy Hadoop 2.6.0 NameNode HA and ResourceManager HA
Unpack and rename:
tar -zxvf hadoop-2.6.0.tar.gz -C /home/hadoop
cd /home/hadoop
mv hadoop-2.6.0 hadoop

Configure the Hadoop environment variables [already done in the environment preparation, skipped]

Verify that Hadoop is installed correctly:
hadoop version

Edit the Hadoop configuration files:
vi /home/hadoop/hadoop/etc/hadoop/core-site.xml

Add:

<!-- Set the HDFS nameservice to gagcluster; this is the NameNode URI (hdfs://host:port/) -->
(Note: the article at http://www.it610.com/article/3334284.htm that I followed writes this as <value>hdfs://gagcluster:9000</value>, which is wrong. The port must be dropped, i.e. <value>hdfs://gagcluster</value>; otherwise, once the cluster is up, running hadoop fs -mkdir /input fails with:
mkdir: Port 9000 specified in URI hdfs://gagcluster:9000 but host 'gagcluster' is a logical (HA) namenode and does not use port information.)
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://gagcluster</value>
  </property>
 
  <property>
    <name>io.file.buffer.size</name>
    <value>131072</value>
  </property>
 
<!-- Hadoop temporary directory -->
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/home/hadoop/storage/hadoop/tmp</value>
    <description>Abase for other temporary directories.</description>
  </property>
 
<!-- Allow proxy access from any host -->
  <property>
    <name>hadoop.proxyuser.hduser.hosts</name>
    <value>*</value>
  </property>
 
<!-- Allow proxy access for all groups -->
  <property>
    <name>hadoop.proxyuser.hduser.groups</name>
    <value>*</value>
  </property>
 
<!-- ZooKeeper quorum addresses -->
  <property>
    <name>ha.zookeeper.quorum</name>
    <value>h43:2181,h44:2181,h45:2181</value>
  </property>
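
Later, once the cluster is running (see the startup steps below), the pitfall described above can be checked like this (a sketch; it simply confirms that the logical nameservice works without a port):
hadoop fs -mkdir /input
hadoop fs -ls hdfs://gagcluster/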


vi /home/hadoop/hadoop/etc/hadoop/hdfs-site.xml
Add:

<!-- Exclude (blacklist) file, used to decommission Hadoop nodes -->
<property>
  <name>dfs.hosts.exclude</name>
  <value>/home/hadoop/hadoop/etc/hadoop/exclude</value>
</property>
 
<!-- HDFS block size: 64 MB -->
  <property> 
    <name>dfs.block.size</name> 
    <value>67108864</value>
  </property>
 
<!-- HDFS nameservice, gagcluster; must match core-site.xml -->
  <property>
    <name>dfs.nameservices</name>
    <value>gagcluster</value>
  </property>
 
<!-- gagcluster has two NameNodes: nn1 and nn2 -->
  <property>
    <name>dfs.ha.namenodes.gagcluster</name>
    <value>nn1,nn2</value>
  </property>
 
<!-- RPC address of nn1 -->
  <property>
    <name>dfs.namenode.rpc-address.gagcluster.nn1</name>
    <value>h41:9000</value>
  </property>
 
<!-- HTTP address of nn1 -->
  <property>
    <name>dfs.namenode.http-address.gagcluster.nn1</name>
    <value>h41:50070</value>
  </property>
 
<!-- RPC address of nn2 -->
  <property>
    <name>dfs.namenode.rpc-address.gagcluster.nn2</name>
    <value>h42:9000</value>
  </property>
 
<!-- HTTP address of nn2 -->
  <property>
    <name>dfs.namenode.http-address.gagcluster.nn2</name>
    <value>h42:50070</value>
  </property>
 
<!-- Where the NameNode shared edits are stored on the JournalNodes -->
  <property>
    <name>dfs.namenode.shared.edits.dir</name>
    <value>qjournal://h43:8485;h44:8485;h45:8485/gagcluster</value>
  </property>
 
<!-- Failover proxy provider used by clients for automatic failover -->
  <property>
    <name>dfs.client.failover.proxy.provider.gagcluster</name>
    <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
  </property>
 
<!-- Fencing method -->
  <property>
    <name>dfs.ha.fencing.methods</name>
    <value>sshfence</value>
  </property>
 
<!-- sshfence needs passwordless SSH; path to the private key -->
  <property>
    <name>dfs.ha.fencing.ssh.private-key-files</name>
    <value>/home/hadoop/.ssh/id_rsa</value>
  </property>
 
<!-- Local directory where each JournalNode stores its edits -->
  <property>
    <name>dfs.journalnode.edits.dir</name>
    <value>/home/hadoop/storage/hadoop/journal</value>
  </property>
 
<!-- Enable automatic failover -->
  <property>
    <name>dfs.ha.automatic-failover.enabled</name>
    <value>true</value>
  </property>
 
<!-- NameNode namespace storage directory -->
  <property>  
    <name>dfs.namenode.name.dir</name>  
    <value>/home/hadoop/storage/hadoop/name</value> 
  </property>
 
 <!-- DataNode data storage directory -->
  <property>  
    <name>dfs.datanode.data.dir</name>  
    <value>file:/home/hadoop/storage/hadoop/data</value> 
  </property>
 
<!-- Number of data replicas -->
  <property>  
    <name>dfs.replication</name>  
    <value>3</value>
  </property>
 
<!-- Allow access to HDFS over WebHDFS -->
  <property> 
    <name>dfs.webhdfs.enabled</name> 
    <value>true</value>
  </property>
 
<!-- JournalNode bind addresses (needed for data recovery) -->
  <property> 
    <name>dfs.journalnode.http-address</name> 
    <value>0.0.0.0:8480</value> 
  </property>
 
  <property> 
    <name>dfs.journalnode.rpc-address</name> 
    <value>0.0.0.0:8485</value> 
  </property>
 
  <property>
    <name>ha.zookeeper.quorum</name>
    <value>h43:2181,h44:2181,h45:2181</value>
  </property>


vi /home/hadoop/hadoop/etc/hadoop/mapred-site.xml
Add:

<configuration>
<!-- Run MapReduce on YARN -->
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
 
<!-- MapReduce JobHistory Server address; default port 10020 -->
  <property>
    <name>mapreduce.jobhistory.address</name>
    <value>0.0.0.0:10020</value>
  </property>
 
<!-- MapReduce JobHistory Server web UI address; default port 19888 -->
  <property>
    <name>mapreduce.jobhistory.webapp.address</name>
    <value>0.0.0.0:19888</value>
  </property>
</configuration>


vi /home/hadoop/hadoop/etc/hadoop/yarn-site.xml
Add:

<!-- Enable log aggregation -->
  <property>
     <name>yarn.log-aggregation-enable</name>
     <value>true</value>
  </property>
 
<!-- How long aggregated logs are kept on HDFS, in seconds (3 days) -->
  <property>
     <name>yarn.log-aggregation.retain-seconds</name>
     <value>259200</value>
  </property>
 
<!-- Retry interval for reconnecting to the RM after it becomes unreachable -->
  <property>
     <name>yarn.resourcemanager.connect.retry-interval.ms</name>
     <value>2000</value>
  </property>
 
<!-- Enable ResourceManager HA (default: false) -->
  <property>
     <name>yarn.resourcemanager.ha.enabled</name>
     <value>true</value>
  </property>
 
<!-- ResourceManager IDs -->
  <property>
    <name>yarn.resourcemanager.ha.rm-ids</name>
    <value>rm1,rm2</value>
  </property>
 
  <property>
    <name>ha.zookeeper.quorum</name>
    <value>h43:2181,h44:2181,h45:2181</value>
  </property>
 
<!-- Enable automatic failover -->
  <property>
     <name>yarn.resourcemanager.ha.automatic-failover.enabled</name>
     <value>true</value>
  </property>
 
  <property>
    <name>yarn.resourcemanager.hostname.rm1</name>
    <value>h41</value>
  </property>
                    
  <property>
     <name>yarn.resourcemanager.hostname.rm2</name>
     <value>h42</value>
  </property>
 
<!-- Set this to rm1 on namenode1 and rm2 on namenode2. Note: people usually copy the finished config to the other machines, but this value must be changed on the other YARN machine -->
  <property>
    <name>yarn.resourcemanager.ha.id</name>
    <value>rm1</value>
  <description>If we want to launch more than one RM in single node, we need this configuration</description>
  </property>
 
<!-- Enable RM state recovery -->
  <property>
    <name>yarn.resourcemanager.recovery.enabled</name>
    <value>true</value>
  </property>
 
<!-- ZooKeeper connection address for the RM state store -->
  <property>
    <name>yarn.resourcemanager.zk-state-store.address</name>
    <value>h43:2181,h44:2181,h45:2181</value>
  </property>
 
  <property>
    <name>yarn.resourcemanager.store.class</name>
    <value>org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore</value>
  </property>
 
  <property>
    <name>yarn.resourcemanager.zk-address</name>
    <value>h43:2181,h44:2181,h45:2181</value>
  </property>
 
  <property>
    <name>yarn.resourcemanager.cluster-id</name>
    <value>gagcluster-yarn</value>
  </property>
 
<!-- How long the AM waits before reconnecting to the scheduler -->
  <property>
    <name>yarn.app.mapreduce.am.scheduler.connection.wait.interval-ms</name>
    <value>5000</value>
  </property>
 
<!-- rm1 addresses -->
  <property>
    <name>yarn.resourcemanager.address.rm1</name>
    <value>h41:8132</value>
  </property>
 
  <property>
    <name>yarn.resourcemanager.scheduler.address.rm1</name>
    <value>h41:8130</value>
  </property>
 
  <property>
    <name>yarn.resourcemanager.webapp.address.rm1</name>
    <value>h41:8188</value>
  </property>
 
  <property>
    <name>yarn.resourcemanager.resource-tracker.address.rm1</name>
    <value>h41:8131</value>
  </property>
 
  <property>
    <name>yarn.resourcemanager.admin.address.rm1</name>
    <value>h41:8033</value>
  </property>
 
  <property>
    <name>yarn.resourcemanager.ha.admin.address.rm1</name>
    <value>h41:23142</value>
  </property>
 
<!-- rm2 addresses -->
  <property>
    <name>yarn.resourcemanager.address.rm2</name>
    <value>h42:8132</value>
  </property>
 
  <property>
    <name>yarn.resourcemanager.scheduler.address.rm2</name>
    <value>h42:8130</value>
  </property>
 
  <property>
    <name>yarn.resourcemanager.webapp.address.rm2</name>
    <value>h42:8188</value>
  </property>
 
  <property>
    <name>yarn.resourcemanager.resource-tracker.address.rm2</name>
    <value>h42:8131</value>
  </property>
 
  <property>
    <name>yarn.resourcemanager.admin.address.rm2</name>
    <value>h42:8033</value>
  </property>
 
  <property>
    <name>yarn.resourcemanager.ha.admin.address.rm2</name>
    <value>h42:23142</value>
  </property>
 
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
 
  <property>
    <name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name>
    <value>org.apache.hadoop.mapred.ShuffleHandler</value>
  </property>
 
  <property>
    <name>yarn.nodemanager.local-dirs</name>
    <value>/home/hadoop/storage/yarn/local</value>
  </property>
 
  <property>
    <name>yarn.nodemanager.log-dirs</name>
    <value>/home/hadoop/storage/yarn/logs</value>
  </property>
 
  <property>
    <name>mapreduce.shuffle.port</name>
    <value>23080</value>
  </property>
 
<!-- Failover proxy class used by clients -->
  <property>
    <name>yarn.client.failover-proxy-provider</name>
    <value>org.apache.hadoop.yarn.client.ConfiguredRMFailoverProxyProvider</value>
  </property>
 
  <property>
      <name>yarn.resourcemanager.ha.automatic-failover.zk-base-path</name>
      <value>/yarn-leader-election</value>
      <description>Optional setting. The default value is /yarn-leader-election</description>
  </property>

Configure the DataNode list:
vi /home/hadoop/hadoop/etc/hadoop/slaves

h43
h44
h45

Create the exclude file, used later to decommission Hadoop nodes:
touch /home/hadoop/hadoop/etc/hadoop/exclude

Copy the Hadoop directory to h42 through h45:
scp -r /home/hadoop/hadoop h42:/home/hadoop/
scp -r /home/hadoop/hadoop h43:/home/hadoop/
scp -r /home/hadoop/hadoop h44:/home/hadoop/
scp -r /home/hadoop/hadoop h45:/home/hadoop/

Modify yarn-site.xml on nn2 (h42):
Change this one property to:

<property>
    <name>yarn.resourcemanager.ha.id</name>
    <value>rm2</value>
  <description>If we want to launch more than one RM in single node, we need this configuration</description>
  </property>
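
If you would rather not edit the file by hand, something like the following on h42 should flip only this value (a sketch; it assumes the only <value>rm1</value> in yarn-site.xml is the one belonging to yarn.resourcemanager.ha.id, which holds for the configuration above):
sed -i 's|<value>rm1</value>|<value>rm2</value>|' /home/hadoop/hadoop/etc/hadoop/yarn-site.xml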

Part Three: Deploy a three-node ZooKeeper 3.4.5 cluster
Install ZooKeeper on three servers under the hadoop user (an odd number of ZooKeeper nodes is best; any three of my five machines would do, and I chose h43, h44 and h45):
h43 192.168.8.43
h44 192.168.8.44
h45 192.168.8.45

Unpack and rename (on h43):
tar xf zookeeper-3.4.5.tar.gz -C /home/hadoop/
mv /home/hadoop/zookeeper-3.4.5/ /home/hadoop/zookeeper
cd /home/hadoop/zookeeper

Edit the configuration file:
vi /home/hadoop/zookeeper/conf/zoo.cfg

tickTime=2000
initLimit=5
syncLimit=2
dataDir=/home/hadoop/storage/zookeeper/data
dataLogDir=/home/hadoop/storage/zookeeper/logs
clientPort=2181
server.1=h43:2888:3888
server.2=h44:2888:3888
server.3=h45:2888:3888

Copy it to h44 and h45:
scp -r /home/hadoop/zookeeper h44:/home/hadoop
scp -r /home/hadoop/zookeeper h45:/home/hadoop

Create the ZooKeeper data and log directories [already done in the environment preparation, skipped]

Set the myid value on h43, h44 and h45 respectively (the number must match the host's server.N entry in zoo.cfg):
echo 1 > /home/hadoop/storage/zookeeper/data/myid   (on h43)
echo 2 > /home/hadoop/storage/zookeeper/data/myid   (on h44)
echo 3 > /home/hadoop/storage/zookeeper/data/myid   (on h45)
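
A quick optional check on each of the three hosts (the printed number should match its server.N line in zoo.cfg):
cat /home/hadoop/storage/zookeeper/data/myid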

 

###########################################################################################
First-time startup of the Hadoop cluster
###########################################################################################
1. If the ZooKeeper cluster is not running yet, start ZooKeeper first:
/home/hadoop/zookeeper/bin/zkServer.sh start    (remember to start it on every ZooKeeper machine)
/home/hadoop/zookeeper/bin/zkServer.sh status   (one leader, n-1 followers)
Running jps should now show the QuorumPeerMain process.

2. Then, on the primary NameNode (h41), run the following command to create the HA namespace in ZooKeeper:
/home/hadoop/hadoop/bin/hdfs zkfc -formatZK

3. On h43, h44 and h45, start the JournalNode daemons with:
/home/hadoop/hadoop/sbin/hadoop-daemon.sh start journalnode

4. On the primary NameNode, format the NameNode and JournalNode directories with hadoop namenode -format:
/home/hadoop/hadoop/bin/hadoop namenode -format
To verify that everything so far succeeded, run the ZooKeeper client on one of the ZooKeeper nodes:
/home/hadoop/zookeeper/bin/zkCli.sh
[zk: localhost:2181(CONNECTED) 0] ls /
[hadoop-ha, zookeeper]
[zk: localhost:2181(CONNECTED) 1] ls /hadoop-ha 
[gagcluster]
[zk: localhost:2181(CONNECTED) 2] quit

5. Start the NameNode process on the primary NameNode node:
/home/hadoop/hadoop/sbin/hadoop-daemon.sh start namenode

6. On the standby NameNode node, run the first command below to format its directory and copy the metadata over from the primary NameNode (this command does not re-format the JournalNode directories), then start the standby NameNode process with the second command:
/home/hadoop/hadoop/bin/hdfs namenode -bootstrapStandby   [or simply scp -r /home/hadoop/storage/hadoop/name h42:/home/hadoop/storage/hadoop]
/home/hadoop/hadoop/sbin/hadoop-daemon.sh start namenode

7. Run the following on both NameNode nodes to start the ZKFC daemons:
/home/hadoop/hadoop/sbin/hadoop-daemon.sh start zkfc

8. Start the DataNodes.
Method 1:
Run the following on each DataNode node to start its local DataNode daemon:
/home/hadoop/hadoop/sbin/hadoop-daemon.sh start datanode
Method 2:
When there are many DataNodes, run the plural script once on the primary NameNode (nn1) to start every DataNode listed in slaves at the same time (when I ran it on h43, the DataNodes on h43, h44 and h45 all came up):
/home/hadoop/hadoop/sbin/hadoop-daemons.sh start datanode

9. Start YARN (run on namenode1 and namenode2):
/home/hadoop/hadoop/sbin/start-yarn.sh

Note:
When you run this command on namenode2 it prints messages such as "NodeManager already running"; you can ignore them. The point is to bring up the ResourceManager on namenode2 so that it acts as a backup for the one on namenode1. I did not find a dedicated way to start only the ResourceManager at the time.
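That said, the stock per-daemon script should be able to start just the ResourceManager on the standby node (a sketch I did not use in this walkthrough):
/home/hadoop/hadoop/sbin/yarn-daemon.sh start resourcemanager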
Once everything is up, open http://192.168.8.41:50070 and http://192.168.8.42:50070 in a browser; the two NameNodes show as Standby and Active respectively.
On namenode1, run ${HADOOP_HOME}/bin/yarn rmadmin -getServiceState rm1 (and rm2) to check the ResourceManager states; rm1 and rm2 should be active and standby respectively. You can also check through the browser at http://192.168.8.41:8188.

 

Verify YARN:
Next I ran a small MapReduce job
(for details see my other post on fixing MapReduce jobs that fail to run on a freshly installed Hadoop 2).
It ran successfully from both h41 and h42.
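The job command itself is not shown above; a typical run would look something like this (a sketch using the bundled WordCount example, assuming the input has already been uploaded to /input):
hadoop jar $HADOOP_HOME/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.6.0.jar wordcount /input /output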
[hadoop@h41 ~]$ hadoop fs -cat /output/part-00000
hadoop  1
hello   3
hive    1
world   1
[hadoop@h42 ~]$ ${HADOOP_HOME}/bin/yarn rmadmin -getServiceState rm1
active
[hadoop@h42 ~]$ ${HADOOP_HOME}/bin/yarn rmadmin -getServiceState rm2
standby
[hadoop@h41 ~]$ jps
12723 ResourceManager
14752 Jps
12513 NameNode
12605 DFSZKFailoverController
[hadoop@h41 ~]$ kill -9 12723
[hadoop@h42 ~]$ ${HADOOP_HOME}/bin/yarn rmadmin -getServiceState rm1

[hadoop@h42 ~]$ ${HADOOP_HOME}/bin/yarn rmadmin -getServiceState rm2
active
Manually restart the ResourceManager that was killed (e.g. on h41 with /home/hadoop/hadoop/sbin/yarn-daemon.sh start resourcemanager), then check the states again:
[hadoop@h42 ~]$ ${HADOOP_HOME}/bin/yarn rmadmin -getServiceState rm1
standby

Verify HDFS HA:
Next, kill -9 the active NameNode.
Open http://192.168.8.41:50070 in a browser:
the NameNode on h41 has now become active.
Run the command again: hadoop fs -cat /output/part-00000
The file written earlier is still there!
Manually restart the NameNode that was killed:
/home/hadoop/hadoop/sbin/hadoop-daemon.sh start namenode
Open http://192.168.8.42:50070 in a browser:
NameNode 'h42' (standby)
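
The same check can also be done from the command line instead of the browser (a sketch; nn1 and nn2 are the NameNode IDs configured in hdfs-site.xml):
hdfs haadmin -getServiceState nn1
hdfs haadmin -getServiceState nn2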

Reposted from www.cnblogs.com/jieran/p/9314136.html