Hadoop Cluster Setup, Part 3: Starting the Cluster

The cluster was set up successfully in the previous parts; now we try to start it. The first time the system is brought up, it has to be initialized.

Starting ZooKeeper

1. The ZooKeeper control command: ./zkServer.sh start|stop|status

[hadoop@hadoop001 ~]$ zkServer.sh start  (the script directory is already on the PATH, so there is no need to run it from ZooKeeper's bin directory)
JMX enabled by default
Using config: /home/hadoop/app/zookeeper-3.4.6/bin/../conf/zoo.cfg
[hadoop@hadoop001 ~]$ zkServer.sh status  (check its state)
Mode: leader  (one machine reports leader and the other two report follower, which means the ensemble is configured and running correctly)
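This health check can be done programmatically instead of eyeballing the three status outputs. A minimal sketch: in real use the `Mode:` words would be collected over ssh from the three machines (that collection step is an assumption and shown only as a comment); here the function just inspects the mode strings it is given.

```shell
# Sketch: verify that the ensemble elected exactly one leader.
# The modes would normally be gathered with something like
#   ssh $host zkServer.sh status 2>/dev/null | awk -F': ' '/^Mode/{print $2}'
# (hostnames and ssh access are assumptions); here they are arguments.
zk_quorum_ok() {
  leaders=0; followers=0
  for mode in "$@"; do
    case "$mode" in
      leader)   leaders=$((leaders + 1)) ;;
      follower) followers=$((followers + 1)) ;;
    esac
  done
  # Healthy: exactly one leader, every other node a follower.
  [ "$leaders" -eq 1 ] && [ "$followers" -eq $(($# - 1)) ]
}

zk_quorum_ok leader follower follower && echo "quorum ok"   # prints "quorum ok"
```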

Starting Hadoop (HDFS + YARN)

a. Before formatting, start the JournalNode process on each of the journalnode machines

[hadoop@hadoop003 hadoop]$ cd /home/hadoop/app/hadoop-2.6.0-cdh5.7.0
[hadoop@hadoop001 ~]$ sbin/hadoop-daemon.sh start journalnode  (start it on all three machines)
[hadoop@hadoop001 ~]$ jps
17857 JournalNode  (the JournalNode process)
17759 QuorumPeerMain  (the ZooKeeper process)
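Since formatting must not proceed until the JournalNodes are up, the jps check above is worth scripting. A sketch under the assumption that the input is jps-style "PID Name" lines; the function parses text from stdin rather than invoking jps itself, so the sample output below is illustrative.

```shell
# Sketch: check jps-style output for a required daemon. In real use the
# input would come from `jps` locally, or `ssh $host jps` for the other
# machines (an assumption about how the cluster is reached).
has_daemon() {  # $1 = daemon name; stdin = "PID Name" lines from jps
  awk -v want="$1" '$2 == want { found = 1 } END { exit !found }'
}

jps_sample="17857 JournalNode
17759 QuorumPeerMain"

printf '%s\n' "$jps_sample" | has_daemon JournalNode \
  && echo "JournalNode is up"   # prints "JournalNode is up"
```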

b. Format the NameNode (format only the NameNode on the first machine; the NameNode on the second machine must not be formatted as well)

[hadoop@hadoop001 hadoop-2.6.0-cdh5.7.0]$ hadoop namenode -format  (format the NameNode on hadoop001)
INFO common.Storage: Storage directory /home/hadoop/app/hadoop-2.6.0-cdh5.7.0/data/dfs/name has been successfully formatted.

c. The NameNodes on hadoop001 and hadoop002 hold identical metadata, but both must not be formatted, so synchronize the metadata from hadoop001 to hadoop002. The directories that matter are dfs.namenode.name.dir and dfs.namenode.edits.dir; also make sure the shared edits directory (dfs.namenode.edits.dir) contains all of the NameNode's metadata.

[hadoop@hadoop001 hadoop-2.6.0-cdh5.7.0]$ scp -r data hadoop002:/home/hadoop/app/hadoop-2.6.0-cdh5.7.0/  (copy the /home/hadoop/app/hadoop-2.6.0-cdh5.7.0/data directory from hadoop001 straight to hadoop002, overwriting the jn directory under /home/hadoop/app/hadoop-2.6.0-cdh5.7.0/data/dfs/ on hadoop002. Since the JournalNode configuration is the same on all three machines this does no harm; hadoop002 ends up with both the name and jn directories from hadoop001, and name holds the NameNode metadata)

d. Initialize the ZKFC
This essentially writes the initial HA metadata into ZooKeeper; you should see "session connected" in the output.

[hadoop@hadoop001 hadoop-2.6.0-cdh5.7.0]$ hdfs zkfc -formatZK  (only needs to be run on this machine; note the line that appears on the console)
18/11/08 10:37:34 INFO ha.ActiveStandbyElector: Successfully created /hadoop-ha/ruozeclusterg5 in ZK.

e. Start the HDFS distributed file system
Start it from the first machine; the script will start the daemons on the second and third machines by itself. Pay attention to the order in which things come up.

[hadoop@hadoop001 hadoop-2.6.0-cdh5.7.0]$ start-dfs.sh  (start HDFS from this machine; the sbin prefix is unnecessary because sbin was added to the personal environment variables earlier)
Starting namenodes on [hadoop001 hadoop002]
hadoop002: starting namenode, logging to /home/hadoop/app/hadoop-2.6.0-cdh5.7.0/logs/hadoop-hadoop-namenode-hadoop002.out
hadoop001: starting namenode, logging to /home/hadoop/app/hadoop-2.6.0-cdh5.7.0/logs/hadoop-hadoop-namenode-hadoop001.out
hadoop001: starting datanode, logging to /home/hadoop/app/hadoop-2.6.0-cdh5.7.0/logs/hadoop-hadoop-datanode-hadoop001.out
hadoop002: starting datanode, logging to /home/hadoop/app/hadoop-2.6.0-cdh5.7.0/logs/hadoop-hadoop-datanode-hadoop002.out
hadoop003: starting datanode, logging to /home/hadoop/app/hadoop-2.6.0-cdh5.7.0/logs/hadoop-hadoop-datanode-hadoop003.out
Starting journal nodes [hadoop001 hadoop002 hadoop003]
hadoop001: journalnode running as process 17857. Stop it first.
hadoop003: journalnode running as process 3208. Stop it first.
hadoop002: journalnode running as process 3198. Stop it first.
Starting ZK Failover Controllers on NN hosts [hadoop001 hadoop002]
hadoop002: starting zkfc, logging to /home/hadoop/app/hadoop-2.6.0-cdh5.7.0/logs/hadoop-hadoop-zkfc-hadoop002.out
hadoop001: starting zkfc, logging to /home/hadoop/app/hadoop-2.6.0-cdh5.7.0/logs/hadoop-hadoop-zkfc-hadoop001.out

Once the file system has started, use jps to check whether the processes on all three machines came up correctly.

[hadoop@hadoop003 hadoop-2.6.0-cdh5.7.0]$ jps  (check whether the daemons on all three machines started; if one did not, look at its log under /home/hadoop/app/hadoop-2.6.0-cdh5.7.0/logs)
### Note: /home/hadoop/app/hadoop-2.6.0-cdh5.7.0/logs holds the logs for the namenode, datanode, journalnode and zkfc; machines that do not run a zkfc or namenode simply have no log for those two.
During startup the DataNode on the third machine did not come up; its log said: namenode address dfs.namenode.servicerpc-address or dfs.namenode.rpc-address is not configured
Fix: this suggested a problem with the hdfs-site.xml configuration, and indeed the file was missing on the third machine, probably damaged during the rz upload without being noticed at the time. Copying it over again from the first machine with scp solved the problem.
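The per-machine check can be automated by comparing the jps output against the daemons each role is expected to run. A sketch; the expected daemon sets below reflect this cluster's layout (hadoop001/002 run NameNode and ZKFC, all three run DataNode and JournalNode) and are an assumption, not output from the real cluster.

```shell
# Sketch: report which expected daemons are missing from jps-style output.
# Assumed layout for this cluster:
#   hadoop001/002: NameNode DataNode JournalNode DFSZKFailoverController QuorumPeerMain
#   hadoop003:     DataNode JournalNode QuorumPeerMain
missing_daemons() {  # $@ = expected daemon names; stdin = jps output
  jps_out=$(cat)
  for want in "$@"; do
    printf '%s\n' "$jps_out" \
      | awk -v w="$want" '$2 == w { ok = 1 } END { exit !ok }' \
      || printf '%s\n' "$want"   # not found: print the missing daemon
  done
}

# Example: a hadoop003-style machine whose DataNode failed to start.
printf '3208 JournalNode\n3100 QuorumPeerMain\n' \
  | missing_daemons DataNode JournalNode QuorumPeerMain   # prints "DataNode"
```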

Aside: starting individual daemons

NameNode (hadoop001, hadoop002):
hadoop-daemon.sh start namenode

DataNode (hadoop001, hadoop002, hadoop003):
hadoop-daemon.sh start datanode

JournalNode (hadoop001, hadoop002, hadoop003):
hadoop-daemon.sh start journalnode

ZKFC (hadoop001, hadoop002):
hadoop-daemon.sh start zkfc

f. Start YARN
1. Start YARN from hadoop001. (Watch the order in which things come up.)

[hadoop@hadoop001 hadoop-2.6.0-cdh5.7.0]$ start-yarn.sh  (start YARN from this machine)
starting resourcemanager, logging to /home/hadoop/app/hadoop-2.6.0-cdh5.7.0/logs/yarn-hadoop-resourcemanager-hadoop001.out
hadoop002: starting nodemanager, logging to /home/hadoop/app/hadoop-2.6.0-cdh5.7.0/logs/yarn-hadoop-nodemanager-hadoop002.out
hadoop001: starting nodemanager, logging to /home/hadoop/app/hadoop-2.6.0-cdh5.7.0/logs/yarn-hadoop-nodemanager-hadoop001.out
hadoop003: starting nodemanager, logging to /home/hadoop/app/hadoop-2.6.0-cdh5.7.0/logs/yarn-hadoop-nodemanager-hadoop003.out
[hadoop@hadoop001 hadoop-2.6.0-cdh5.7.0]$ jps  (check the processes on all three machines)
All three NodeManagers are up and the ResourceManager on the first machine is up, but the ResourceManager on the second machine is not; it has to be started manually on that machine.

2. Start the standby ResourceManager on hadoop002

[hadoop@hadoop002 hadoop-2.6.0-cdh5.7.0]$ yarn-daemon.sh start resourcemanager
[hadoop@hadoop002 hadoop-2.6.0-cdh5.7.0]$ jps  (check whether the ResourceManager on this machine is now up)

Aside: starting individual YARN daemons, and shutting everything down

ResourceManager (hadoop001, hadoop002):
yarn-daemon.sh start resourcemanager

NodeManager (hadoop001, hadoop002, hadoop003):
yarn-daemon.sh start nodemanager

Shutdown: a. Stop Hadoop (YARN first, then HDFS)
Step 1: [hadoop@hadoop001 hadoop-2.6.0-cdh5.7.0]$ stop-yarn.sh
Step 2: [hadoop@hadoop002 hadoop-2.6.0-cdh5.7.0]$ yarn-daemon.sh stop resourcemanager
Step 3: [hadoop@hadoop001 hadoop-2.6.0-cdh5.7.0]$ stop-dfs.sh
b. Stop ZooKeeper
[hadoop@hadoop001 hadoop-2.6.0-cdh5.7.0]$ zkServer.sh stop  (run on all three machines)

Starting the cluster again:
[hadoop@hadoop001 hadoop-2.6.0-cdh5.7.0]$ zkServer.sh start  (on all three machines)
[hadoop@hadoop001 hadoop-2.6.0-cdh5.7.0]$ start-dfs.sh
[hadoop@hadoop001 hadoop-2.6.0-cdh5.7.0]$ start-yarn.sh
[hadoop@hadoop002 hadoop-2.6.0-cdh5.7.0]$ yarn-daemon.sh start resourcemanager
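Because the start order (ZooKeeper, then HDFS, then YARN, then the standby ResourceManager) is exactly the reverse of the stop order, it can help to keep both sequences in one helper so they are never typed wrong. A sketch only: the hostnames and the assumption that hadoop002 hosts the standby ResourceManager come from this cluster's layout, and the function merely prints the plan rather than executing anything.

```shell
# Sketch: print the cluster start/stop commands in the correct order.
# start: ZooKeeper -> HDFS -> YARN -> standby RM; stop is the reverse.
cluster_plan() {  # $1 = start|stop
  case "$1" in
    start)
      echo "zkServer.sh start                      # on all three machines"
      echo "start-dfs.sh                           # on hadoop001"
      echo "start-yarn.sh                          # on hadoop001"
      echo "yarn-daemon.sh start resourcemanager   # on hadoop002"
      ;;
    stop)
      echo "stop-yarn.sh                           # on hadoop001"
      echo "yarn-daemon.sh stop resourcemanager    # on hadoop002"
      echo "stop-dfs.sh                            # on hadoop001"
      echo "zkServer.sh stop                       # on all three machines"
      ;;
    *) echo "usage: cluster_plan start|stop" >&2; return 1 ;;
  esac
}

cluster_plan start   # prints the four start commands in order
```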

This concludes the overview of starting and stopping the Hadoop cluster.

Viewing the web UIs

hadoop001:
http://192.168.2.65:50070/
hadoop002:
http://192.168.2.199:50070/
resourcemanager(active):
http://192.168.2.65:8088
resourcemanager(standby):
http://192.168.2.199:8088/cluster/cluster
jobhistory:
http://192.168.2.65:19888/jobhistory

[hadoop@hadoop001 hadoop-2.6.0-cdh5.7.0]$ hdfs haadmin -getServiceState nn1  (check whether the state is active or standby)
[hadoop@hadoop001 hadoop-2.6.0-cdh5.7.0]$ hdfs dfsadmin -report  (monitor the cluster's state)
[hadoop@hadoop001 hadoop-2.6.0-cdh5.7.0]$ $HADOOP_HOME/sbin/mr-jobhistory-daemon.sh start historyserver  (start the history server so the history of finished jobs can be viewed)
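For scripting, the two `haadmin -getServiceState` results can be combined to name the active NameNode. A sketch: the function takes the two states as arguments instead of calling hdfs itself, and the mapping of nn1/nn2 to hadoop001/hadoop002 is an assumption based on this setup.

```shell
# Sketch: given the states of nn1 and nn2, as printed by
#   hdfs haadmin -getServiceState nn1   (and nn2),
# report which NameNode is active. Exactly one should be.
active_nn() {  # $1 = state of nn1, $2 = state of nn2
  if [ "$1" = active ] && [ "$2" = standby ]; then
    echo nn1
  elif [ "$1" = standby ] && [ "$2" = active ]; then
    echo nn2
  else
    echo "unexpected HA state: nn1=$1 nn2=$2" >&2
    return 1
  fi
}

active_nn active standby   # prints "nn1"
```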

Reposted from blog.csdn.net/qq_42694416/article/details/84614803