Implementing WordCount with Hadoop 3.x MapReduce

Copyright notice: this is the author's original post and may not be reproduced without permission. https://blog.csdn.net/u011066470/article/details/86561859

The full project source can be found at:

1. Mapper code: WcMapper
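The post does not reproduce the WcMapper source, so the following is a minimal sketch of a standard WordCount mapper; the whitespace tokenization is an assumption.

```java
import java.io.IOException;

import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

// Minimal WordCount mapper sketch: emits (word, 1) for every whitespace-separated token.
public class WcMapper extends Mapper<LongWritable, Text, Text, IntWritable> {

    private static final IntWritable ONE = new IntWritable(1);
    private final Text word = new Text();

    @Override
    protected void map(LongWritable key, Text value, Context context)
            throws IOException, InterruptedException {
        for (String token : value.toString().split("\\s+")) {
            if (!token.isEmpty()) {
                word.set(token);
                context.write(word, ONE);
            }
        }
    }
}
```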

2. Reducer code: WcReducer
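The WcReducer source is likewise not shown; a minimal sketch that sums the counts per word:

```java
import java.io.IOException;

import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Reducer;

// Minimal WordCount reducer sketch: sums the 1s emitted by the mapper for each word.
public class WcReducer extends Reducer<Text, IntWritable, Text, IntWritable> {

    private final IntWritable result = new IntWritable();

    @Override
    protected void reduce(Text key, Iterable<IntWritable> values, Context context)
            throws IOException, InterruptedException {
        int sum = 0;
        for (IntWritable value : values) {
            sum += value.get();
        }
        result.set(sum);
        context.write(key, result);
    }
}
```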

3. Job driver (job submission code):
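The driver class is also not reproduced; the sketch below wires the mapper and reducer together. The class name WcJob is an assumption, and the paths are hard-coded here because step 8 runs the jar with no command-line arguments; the input path matches the /words location used in step 7, while the output path is an assumption.

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

// Minimal driver sketch: configures and submits the WordCount job.
public class WcJob {

    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        Job job = Job.getInstance(conf, "wordcount");

        job.setJarByClass(WcJob.class);
        job.setMapperClass(WcMapper.class);
        job.setReducerClass(WcReducer.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);

        // Input is the words.txt uploaded to /words in step 7.
        FileInputFormat.addInputPath(job, new Path("/words"));
        // The output directory must not exist before the job runs; this path is an assumption.
        FileOutputFormat.setOutputPath(job, new Path("/wordcount-output"));

        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
```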

4. Configure the packaging in the pom file
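The actual pom configuration is not shown in the post. One possible setup, which would let step 8 run `hadoop jar hadoop-demo.jar` without naming a main class, is to record the driver class in the jar manifest via the maven-jar-plugin; the `mainClass` value below is an assumption.

```xml
<build>
  <plugins>
    <plugin>
      <groupId>org.apache.maven.plugins</groupId>
      <artifactId>maven-jar-plugin</artifactId>
      <configuration>
        <archive>
          <manifest>
            <!-- Fully qualified name of the driver class; adjust to your package. -->
            <mainClass>WcJob</mainClass>
          </manifest>
        </archive>
      </configuration>
    </plugin>
  </plugins>
</build>
```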

5. Building the project produces a jar under the project directory; rename it to hadoop-demo.jar.
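For example (the artifact name produced by the build is not given in the post, so the placeholder below stands for whatever your pom generates):

mvn clean package

mv target/<your-artifact>.jar hadoop-demo.jar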

6. Upload the jar to the Linux machine.
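For example, with scp (the host and directory below are taken from the shell prompt shown in step 8):

scp hadoop-demo.jar root@naidong:/jurf_temp_data/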

7. Upload the input file words.txt to HDFS:

hadoop fs -put /jurf_temp_data/words.txt /words
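You can confirm the upload with:

hadoop fs -ls /words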

8. Run the jar:

[root@naidong jurf_temp_data]# hadoop jar hadoop-demo.jar

2019-01-04 11:59:50,916 INFO client.RMProxy: Connecting to ResourceManager at /0.0.0.0:8032

2019-01-04 11:59:57,490 WARN mapreduce.JobResourceUploader: Hadoop command-line option parsing not performed. Implement the Tool interface and execute your application with ToolRunner to remedy this.

2019-01-04 11:59:57,578 INFO mapreduce.JobResourceUploader: Disabling Erasure Coding for path: /tmp/hadoop-yarn/staging/root/.staging/job_1546572216242_0003

2019-01-04 12:00:00,103 INFO input.FileInputFormat: Total input files to process : 1

2019-01-04 12:00:01,179 INFO mapreduce.JobSubmitter: number of splits:1

2019-01-04 12:00:01,692 INFO Configuration.deprecation: yarn.resourcemanager.system-metrics-publisher.enabled is deprecated. Instead, use yarn.system-metrics-publisher.enabled

2019-01-04 12:00:03,131 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1546572216242_0003

2019-01-04 12:00:03,135 INFO mapreduce.JobSubmitter: Executing with tokens: []

2019-01-04 12:00:04,502 INFO conf.Configuration: resource-types.xml not found

2019-01-04 12:00:04,503 INFO resource.ResourceUtils: Unable to find 'resource-types.xml'.

2019-01-04 12:00:05,256 INFO impl.YarnClientImpl: Submitted application application_1546572216242_0003

2019-01-04 12:00:05,504 INFO mapreduce.Job: The url to track the job: http://naidong:8088/proxy/application_1546572216242_0003/

2019-01-04 12:00:05,544 INFO mapreduce.Job: Running job: job_1546572216242_0003

2019-01-04 12:00:33,928 INFO mapreduce.Job: Job job_1546572216242_0003 running in uber mode : false

2019-01-04 12:00:33,952 INFO mapreduce.Job:  map 0% reduce 0%

2019-01-04 12:00:58,116 INFO mapreduce.Job:  map 100% reduce 0%

2019-01-04 12:01:30,074 INFO mapreduce.Job:  map 100% reduce 100%

2019-01-04 12:01:32,169 INFO mapreduce.Job: Job job_1546572216242_0003 completed successfully

2019-01-04 12:01:32,782 INFO mapreduce.Job: Counters: 53

File System Counters

FILE: Number of bytes read=195

FILE: Number of bytes written=426255

FILE: Number of read operations=0

FILE: Number of large read operations=0

FILE: Number of write operations=0

HDFS: Number of bytes read=170

HDFS: Number of bytes written=76

HDFS: Number of read operations=8

HDFS: Number of large read operations=0

HDFS: Number of write operations=2

Job Counters

Launched map tasks=1

Launched reduce tasks=1

Data-local map tasks=1

Total time spent by all maps in occupied slots (ms)=41852

Total time spent by all reduces in occupied slots (ms)=87912

Total time spent by all map tasks (ms)=20926

Total time spent by all reduce tasks (ms)=29304

Total vcore-milliseconds taken by all map tasks=20926

Total vcore-milliseconds taken by all reduce tasks=29304

Total megabyte-milliseconds taken by all map tasks=42856448

Total megabyte-milliseconds taken by all reduce tasks=90021888

Map-Reduce Framework

Map input records=6

Map output records=12

Map output bytes=165

Map output materialized bytes=195

Input split bytes=97

Combine input records=0

Combine output records=0

Reduce input groups=10

Reduce shuffle bytes=195

Reduce input records=12

Reduce output records=10

Spilled Records=24

Shuffled Maps =1

Failed Shuffles=0

Merged Map outputs=1

GC time elapsed (ms)=461

CPU time spent (ms)=5640

Physical memory (bytes) snapshot=371195904

Virtual memory (bytes) snapshot=7103193088

Total committed heap usage (bytes)=165810176

Peak Map Physical memory (bytes)=229015552

Peak Map Virtual memory (bytes)=2739154944

Peak Reduce Physical memory (bytes)=142180352

Peak Reduce Virtual memory (bytes)=4364038144

Shuffle Errors

BAD_ID=0

CONNECTION=0

IO_ERROR=0

WRONG_LENGTH=0

WRONG_MAP=0

WRONG_REDUCE=0

File Input Format Counters

Bytes Read=73

File Output Format Counters

Bytes Written=76

2019-01-04 12:01:42,813 WARN util.ShutdownHookManager: ShutdownHook '' timeout, java.util.concurrent.TimeoutException

java.util.concurrent.TimeoutException

at java.util.concurrent.FutureTask.get(FutureTask.java:205)

at org.apache.hadoop.util.ShutdownHookManager$1.run(ShutdownHookManager.java:68)

[root@naidong jurf_temp_data]#

9. View the results:

The output directory normally contains two kinds of files:
1) _SUCCESS: an empty marker file indicating that the MapReduce job completed successfully.
2) part-r-00000: the default output file, which holds the word counts.
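Assuming the output path used in the driver sketch in step 3 (/wordcount-output, not confirmed by the post), the results can be listed and printed with:

hadoop fs -ls /wordcount-output

hadoop fs -cat /wordcount-output/part-r-00000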

