Hadoop Fully Distributed Setup - Cluster Configuration

This article is a summary based on the Shangguigu (尚硅谷) video tutorial, combined with my own understanding.

The HDFS and YARN daemons are laid out across the three nodes as follows:

hadoop102: NameNode, DataNode, NodeManager
hadoop103: ResourceManager, DataNode, NodeManager
hadoop104: SecondaryNameNode, DataNode, NodeManager, JobHistoryServer

All of the configuration files live under the /opt/module/hadoop-2.7.7/etc/hadoop directory.

[root@hadoop102 hadoop]# pwd
/opt/module/hadoop-2.7.7/etc/hadoop
[root@hadoop102 hadoop]# ll
total 156
-rw-r--r-- 1 zhenghui zhenghui  4436 Jul 18  2018 capacity-scheduler.xml
-rw-r--r-- 1 zhenghui zhenghui  1335 Jul 18  2018 configuration.xsl
-rw-r--r-- 1 zhenghui zhenghui   318 Jul 18  2018 container-executor.cfg
-rw-r--r-- 1 zhenghui zhenghui  1096 Jan 29 07:18 core-site.xml
-rw-r--r-- 1 zhenghui zhenghui  3670 Jul 18  2018 hadoop-env.cmd
-rw-r--r-- 1 zhenghui zhenghui  4236 Jan 27 17:16 hadoop-env.sh
-rw-r--r-- 1 zhenghui zhenghui  2598 Jul 18  2018 hadoop-metrics2.properties
-rw-r--r-- 1 zhenghui zhenghui  2490 Jul 18  2018 hadoop-metrics.properties
-rw-r--r-- 1 zhenghui zhenghui  9683 Jul 18  2018 hadoop-policy.xml
-rw-r--r-- 1 zhenghui zhenghui  1113 Jan 29 08:23 hdfs-site.xml
-rw-r--r-- 1 zhenghui zhenghui  1449 Jul 18  2018 httpfs-env.sh
-rw-r--r-- 1 zhenghui zhenghui  1657 Jul 18  2018 httpfs-log4j.properties
-rw-r--r-- 1 zhenghui zhenghui    21 Jul 18  2018 httpfs-signature.secret
-rw-r--r-- 1 zhenghui zhenghui   620 Jul 18  2018 httpfs-site.xml
-rw-r--r-- 1 zhenghui zhenghui  3518 Jul 18  2018 kms-acls.xml
-rw-r--r-- 1 zhenghui zhenghui  1527 Jul 18  2018 kms-env.sh
-rw-r--r-- 1 zhenghui zhenghui  1631 Jul 18  2018 kms-log4j.properties
-rw-r--r-- 1 zhenghui zhenghui  5540 Jul 18  2018 kms-site.xml
-rw-r--r-- 1 zhenghui zhenghui 11801 Jul 18  2018 log4j.properties
-rw-r--r-- 1 zhenghui zhenghui   951 Jul 18  2018 mapred-env.cmd
-rw-r--r-- 1 zhenghui zhenghui  1380 Jan 28 06:44 mapred-env.sh
-rw-r--r-- 1 zhenghui zhenghui  4113 Jul 18  2018 mapred-queues.xml.template
-rw-r--r-- 1 zhenghui zhenghui  1187 Jan 29 21:40 mapred-site.xml
-rw-r--r-- 1 zhenghui zhenghui   758 Jul 18  2018 mapred-site.xml.template
-rw-r--r-- 1 zhenghui zhenghui    30 Jan 29 20:55 slaves
-rw-r--r-- 1 zhenghui zhenghui  2316 Jul 18  2018 ssl-client.xml.example
-rw-r--r-- 1 zhenghui zhenghui  2697 Jul 18  2018 ssl-server.xml.example
-rw-r--r-- 1 zhenghui zhenghui  2250 Jul 18  2018 yarn-env.cmd
-rw-r--r-- 1 zhenghui zhenghui  4564 Jan 28 06:38 yarn-env.sh
-rw-r--r-- 1 zhenghui zhenghui  1740 Jan 29 21:44 yarn-site.xml
[root@hadoop102 hadoop]# 

Configure each of the following files:
hadoop-env.sh
mapred-env.sh
yarn-env.sh

In all three of them, set the JAVA environment variable:

export JAVA_HOME=/opt/module/jdk1.8.0_221
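
The tutorial edits each file by hand; if you would rather not, here is a minimal sketch that appends the same export to all three scripts (since the scripts run top to bottom, an appended assignment overrides any earlier JAVA_HOME line; the JDK path is the one used above):

cd /opt/module/hadoop-2.7.7/etc/hadoop
for f in hadoop-env.sh mapred-env.sh yarn-env.sh; do
    echo 'export JAVA_HOME=/opt/module/jdk1.8.0_221' >> "$f"
done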

Configure hdfs-site.xml

<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

<configuration>


    <!-- Number of HDFS replicas; 3 is the default and matches our 3 DataNodes -->
    <property>
        <name>dfs.replication</name>
        <value>3</value>
    </property>

    <!-- Host and port for the SecondaryNameNode -->
    <property>
        <name>dfs.namenode.secondary.http-address</name>
        <value>hadoop104:50090</value>
    </property>

</configuration>
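
To sanity-check these values without opening the file again, hdfs getconf reads the effective configuration:

hdfs getconf -confKey dfs.replication
hdfs getconf -confKey dfs.namenode.secondary.http-address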

Configure mapred-site.xml

<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

<configuration>


    <!-- Run MapReduce on YARN -->
    <property>
        <name>mapreduce.framework.name</name>
        <value>yarn</value>
    </property>

    <!-- JobHistory server RPC address -->
    <property>
        <name>mapreduce.jobhistory.address</name>
        <value>hadoop104:10020</value>
    </property>

    <!-- JobHistory server web UI address -->
    <property>
        <name>mapreduce.jobhistory.webapp.address</name>
        <value>hadoop104:19888</value>
    </property>
</configuration>
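
Note that a stock 2.7.7 installation ships only mapred-site.xml.template (visible in the directory listing above); if mapred-site.xml does not exist yet, create it from the template first:

cd /opt/module/hadoop-2.7.7/etc/hadoop
cp mapred-site.xml.template mapred-site.xml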

Configure yarn-site.xml

<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

<configuration>

    <!-- Auxiliary service the reducers use to fetch map output (shuffle) -->
    <property>
        <name>yarn.nodemanager.aux-services</name>
        <value>mapreduce_shuffle</value>
    </property>

    <!-- Host of the YARN ResourceManager -->
    <property>
        <name>yarn.resourcemanager.hostname</name>
        <value>hadoop103</value>
    </property>
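
    <!-- Explicit ResourceManager RPC, scheduler, resource-tracker, admin,
         and web UI addresses; these are the default ports, pinned to hadoop103 -->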

    <property>
        <name>yarn.resourcemanager.address</name>
        <value>hadoop103:8032</value>
    </property>

    <property>
        <name>yarn.resourcemanager.scheduler.address</name>
        <value>hadoop103:8030</value>
    </property>

    <property>
        <name>yarn.resourcemanager.resource-tracker.address</name>
        <value>hadoop103:8031</value>
    </property>

    <property>
        <name>yarn.resourcemanager.admin.address</name>
        <value>hadoop103:8033</value>
    </property>

    <property>
        <name>yarn.resourcemanager.webapp.address</name>
        <value>hadoop103:8088</value>
    </property>


    <!-- Enable log aggregation -->
    <property>
        <name>yarn.log-aggregation-enable</name>
        <value>true</value>
    </property>

    <!-- Retain aggregated logs for 7 days (604800 seconds) -->
    <property>
        <name>yarn.log-aggregation.retain-seconds</name>
        <value>604800</value>
    </property>
</configuration>
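
Once YARN is running (see the startup section below), a quick sanity check that all three NodeManagers registered with the ResourceManager on hadoop103 is:

yarn node -list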

Configure slaves

Add the worker hostnames, one per line (no trailing spaces and no blank lines, or the daemons may fail to start on those hosts):

hadoop102
hadoop103
hadoop104

After making these changes, remember to sync the configuration files to the other machines.

copypath is a small distribution script; if you are not sure how to write one, see:
https://blog.csdn.net/qq_17623363/article/details/104108316
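
For reference, here is a minimal sketch of such a script. It assumes rsync is installed, password-based root SSH, and the hostnames hadoop102-hadoop104; the real copypath script from the link above may differ:

#!/bin/bash
# copypath (sketch): rsync a file or directory to the same absolute path on every node.
# Usage: copypath <path>
if [ $# -lt 1 ]; then echo "usage: $0 <path>"; exit 1; fi

fname=$(basename "$1")                  # e.g. "etc"
pdir=$(cd -P "$(dirname "$1")" && pwd)  # absolute parent dir, e.g. /opt/module/hadoop-2.7.7
echo "fname=$fname"
echo "pdir=$pdir"

for host in hadoop102 hadoop103 hadoop104; do
    echo "-------------$host-------------"
    rsync -av "$pdir/$fname" "root@$host:$pdir"
done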

[root@hadoop102 hadoop]# cd ..
[root@hadoop102 etc]# cd ..
[root@hadoop102 hadoop-2.7.7]# copypath etc
count=1
fname=etc
pdir=/opt/module/hadoop-2.7.7
-------------hadoop102-------------
root@hadoop102's password: 
sending incremental file list

sent 920 bytes  received 18 bytes  208.44 bytes/sec
total size is 80,771  speedup is 86.11
-------------hadoop103-------------
root@hadoop103's password: 
sending incremental file list
etc/hadoop/
etc/hadoop/hdfs-site.xml
etc/hadoop/yarn-site.xml

sent 3,216 bytes  received 93 bytes  441.20 bytes/sec
total size is 80,771  speedup is 24.41
-------------hadoop104-------------
root@hadoop104's password: 
sending incremental file list
etc/hadoop/
etc/hadoop/hdfs-site.xml
etc/hadoop/yarn-site.xml

sent 3,216 bytes  received 93 bytes  945.43 bytes/sec
total size is 80,771  speedup is 24.41
[root@hadoop102 hadoop-2.7.7]# 

Start the cluster
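
If this is the very first startup, format the NameNode on hadoop102 before starting HDFS (a standard Hadoop step, not shown in the transcripts below; run it only once, because re-formatting wipes the existing HDFS metadata):

[root@hadoop102 hadoop-2.7.7]# hdfs namenode -format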

On hadoop102: start-dfs.sh

On hadoop103: start-yarn.sh (start-yarn.sh must be run on the node that hosts the ResourceManager)

On hadoop104: hadoop-daemon.sh start secondarynamenode
On hadoop104: mr-jobhistory-daemon.sh start historyserver    // history server

If you are not sure how to configure the history server, see:
https://blog.csdn.net/qq_17623363/article/details/104112878

Use jps to check that every process came up:

[root@hadoop102 ~]# 
[root@hadoop102 ~]# jps
16851 NameNode
17240 NodeManager
20187 Jps
16991 DataNode
[root@hadoop102 ~]# 

[root@hadoop103 ~]# 
[root@hadoop103 ~]# jps
5875 NodeManager
5764 ResourceManager
7015 Jps
5645 DataNode
[root@hadoop103 ~]# 


[zhenghui@hadoop104 ~]$ 
[zhenghui@hadoop104 ~]$ jps
5681 DataNode
5866 NodeManager
5788 SecondaryNameNode
7357 Jps
6014 JobHistoryServer
[zhenghui@hadoop104 ~]$ 

Test results

1. Visit http://hadoop102:50070/ to check the NameNode web UI.


2. Visit http://hadoop103:8088/ to check the YARN ResourceManager web UI.


3. Test uploading a file to HDFS

[zhenghui@hadoop102 hadoop-2.7.7]$ touch newt1.txt
[zhenghui@hadoop102 hadoop-2.7.7]$ echo "This is newt1.txt" >> newt1.txt
[zhenghui@hadoop102 hadoop-2.7.7]$ hadoop fs -put newt1.txt /
[zhenghui@hadoop102 hadoop-2.7.7]$ 
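
You can also confirm the upload from the shell; a listing of the HDFS root should now include newt1.txt:

[zhenghui@hadoop102 hadoop-2.7.7]$ hadoop fs -ls /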

The upload succeeded; newt1.txt shows up under / in the HDFS file browser.

4. Test that the wordcount example works

[zhenghui@hadoop102 hadoop-2.7.7]$ 
[zhenghui@hadoop102 hadoop-2.7.7]$ hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.7.jar wordcount /newt1.txt /newt1_output.txt
20/01/30 13:35:23 INFO client.RMProxy: Connecting to ResourceManager at hadoop103/192.168.1.103:8032
20/01/30 13:35:23 INFO input.FileInputFormat: Total input paths to process : 1
20/01/30 13:35:24 INFO mapreduce.JobSubmitter: number of splits:1
20/01/30 13:35:24 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1580352839303_0002
20/01/30 13:35:24 INFO impl.YarnClientImpl: Submitted application application_1580352839303_0002
20/01/30 13:35:24 INFO mapreduce.Job: The url to track the job: http://hadoop103:8088/proxy/application_1580352839303_0002/
20/01/30 13:35:24 INFO mapreduce.Job: Running job: job_1580352839303_0002
20/01/30 13:35:30 INFO mapreduce.Job: Job job_1580352839303_0002 running in uber mode : false
20/01/30 13:35:30 INFO mapreduce.Job:  map 0% reduce 0%
20/01/30 13:35:36 INFO mapreduce.Job:  map 100% reduce 0%
20/01/30 13:35:46 INFO mapreduce.Job:  map 100% reduce 100%
20/01/30 13:35:46 INFO mapreduce.Job: Job job_1580352839303_0002 completed successfully
20/01/30 13:35:47 INFO mapreduce.Job: Counters: 49
        File System Counters
                FILE: Number of bytes read=42
                FILE: Number of bytes written=245329
                FILE: Number of read operations=0
                FILE: Number of large read operations=0
                FILE: Number of write operations=0
                HDFS: Number of bytes read=114
                HDFS: Number of bytes written=24
                HDFS: Number of read operations=6
                HDFS: Number of large read operations=0
                HDFS: Number of write operations=2
        Job Counters 
                Launched map tasks=1
                Launched reduce tasks=1
                Data-local map tasks=1
                Total time spent by all maps in occupied slots (ms)=3606
                Total time spent by all reduces in occupied slots (ms)=6806
                Total time spent by all map tasks (ms)=3606
                Total time spent by all reduce tasks (ms)=6806
                Total vcore-milliseconds taken by all map tasks=3606
                Total vcore-milliseconds taken by all reduce tasks=6806
                Total megabyte-milliseconds taken by all map tasks=3692544
                Total megabyte-milliseconds taken by all reduce tasks=6969344
        Map-Reduce Framework
                Map input records=1
                Map output records=3
                Map output bytes=30
                Map output materialized bytes=42
                Input split bytes=96
                Combine input records=3
                Combine output records=3
                Reduce input groups=3
                Reduce shuffle bytes=42
                Reduce input records=3
                Reduce output records=3
                Spilled Records=6
                Shuffled Maps =1
                Failed Shuffles=0
                Merged Map outputs=1
                GC time elapsed (ms)=133
                CPU time spent (ms)=1570
                Physical memory (bytes) snapshot=411852800
                Virtual memory (bytes) snapshot=4210016256
                Total committed heap usage (bytes)=269484032
        Shuffle Errors
                BAD_ID=0
                CONNECTION=0
                IO_ERROR=0
                WRONG_LENGTH=0
                WRONG_MAP=0
                WRONG_REDUCE=0
        File Input Format Counters 
                Bytes Read=18
        File Output Format Counters 
                Bytes Written=24
[zhenghui@hadoop102 hadoop-2.7.7]$
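
Instead of downloading the result through the web UI, you can also read it directly from the shell; the reducer writes its output to part-r-00000 inside the output directory:

[zhenghui@hadoop102 hadoop-2.7.7]$ hadoop fs -cat /newt1_output.txt/part-r-00000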

The job ran successfully. In the HDFS web browser you can open /newt1_output.txt, download the result file, and confirm the word counts are correct.
