MapReduce (2): A First Look at the MapReduce Distributed Computing Framework (Local Run)

1. Requirement: given a text file, count the total number of times each name appears and output the result.

Data preparation:

ttt.txt

zhangshan,lisi,wangwu,zhaoliu,
zhangshan,zhangshan,zhangshan,
zhangshan,wangwu,wangwu,
wangwu,zhaoliu,zhaoliu,
zhaoliu,zhangshan
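As a sanity check on what the job should produce, the sample data can be counted with plain Java and no Hadoop at all. This is only a sketch: the class name `LocalNameCount` and its `count` helper are hypothetical and are not part of the original post.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper class, used only to check the expected totals locally.
public class LocalNameCount {

    // Sample data copied from ttt.txt.
    static final String[] LINES = {
        "zhangshan,lisi,wangwu,zhaoliu,",
        "zhangshan,zhangshan,zhangshan,",
        "zhangshan,wangwu,wangwu,",
        "wangwu,zhaoliu,zhaoliu,",
        "zhaoliu,zhangshan"
    };

    // Splits every line on commas and sums the occurrences of each name.
    static Map<String, Integer> count(String[] lines) {
        Map<String, Integer> totals = new TreeMap<>(); // sorted by name
        for (String line : lines) {
            for (String token : line.split(",")) {
                String name = token.trim();
                if (name.isEmpty()) {
                    continue; // ignore empty fields caused by trailing commas
                }
                totals.merge(name, 1, Integer::sum);
            }
        }
        return totals;
    }

    public static void main(String[] args) {
        count(LINES).forEach((name, total) -> System.out.println(name + "\t" + total));
    }
}
```

For this input the totals are lisi 1, wangwu 4, zhangshan 6 and zhaoliu 4, so a correct MapReduce job over ttt.txt should report the same numbers.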

Preparing the pom file:

    <repositories>
        <repository>
            <id>cloudera</id>
            <url>https://repository.cloudera.com/artifactory/cloudera-repos/</url>
        </repository>
    </repositories>

    <dependencies>
        <dependency>
            <groupId>org.apache.hadoop</groupId>
            <artifactId>hadoop-client</artifactId>
            <version>2.6.0-mr1-cdh5.14.0</version>
        </dependency>
        <dependency>
            <groupId>org.apache.hadoop</groupId>
            <artifactId>hadoop-common</artifactId>
            <version>2.6.0-cdh5.14.0</version>
        </dependency>
        <dependency>
            <groupId>org.apache.hadoop</groupId>
            <artifactId>hadoop-hdfs</artifactId>
            <version>2.6.0-cdh5.14.0</version>
        </dependency>
        <dependency>
            <groupId>org.apache.hadoop</groupId>
            <artifactId>hadoop-mapreduce-client-core</artifactId>
            <version>2.6.0-cdh5.14.0</version>
        </dependency>
        <dependency>
            <groupId>junit</groupId>
            <artifactId>junit</artifactId>
            <version>4.11</version>
            <scope>test</scope>
        </dependency>
        <dependency>
            <groupId>org.testng</groupId>
            <artifactId>testng</artifactId>
            <version>RELEASE</version>
        </dependency>
    </dependencies>

    <build>
        <plugins>
            <plugin>
                <groupId>org.apache.maven.plugins</groupId>
                <artifactId>maven-compiler-plugin</artifactId>
                <version>3.0</version>
                <configuration>
                    <source>1.8</source>
                    <target>1.8</target>
                    <encoding>UTF-8</encoding>
                </configuration>
            </plugin>
            <plugin>
                <groupId>org.apache.maven.plugins</groupId>
                <artifactId>maven-shade-plugin</artifactId>
                <version>2.4.3</version>
                <executions>
                    <execution>
                        <phase>package</phase>
                        <goals>
                            <goal>shade</goal>
                        </goals>
                        <configuration>
                            <minimizeJar>true</minimizeJar>
                        </configuration>
                    </execution>
                </executions>
            </plugin>
        </plugins>
    </build>

Define a mapper class:
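The mapper code is not shown in this post, so the following is only a minimal sketch of what such a class could look like. The class name `NameCountMapper` is hypothetical. It assumes the default `TextInputFormat`, where the input key is the byte offset of the line and the input value is the line itself, and it emits `(name, 1)` for every comma-separated name.

```java
import java.io.IOException;

import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

// Hypothetical class name; a sketch, not the author's original code.
public class NameCountMapper extends Mapper<LongWritable, Text, Text, IntWritable> {

    // Reused output objects to avoid allocating a new one per record.
    private static final IntWritable ONE = new IntWritable(1);
    private final Text name = new Text();

    @Override
    protected void map(LongWritable offset, Text line, Context context)
            throws IOException, InterruptedException {
        // Each input line looks like "zhangshan,lisi,wangwu,zhaoliu,".
        for (String token : line.toString().split(",")) {
            String trimmed = token.trim();
            if (trimmed.isEmpty()) {
                continue; // skip empty fields produced by trailing commas
            }
            name.set(trimmed);
            context.write(name, ONE); // emit (name, 1)
        }
    }
}
```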

Define a reducer class:
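The reducer code is likewise missing from the post. A minimal sketch, assuming the map output is `(Text name, IntWritable 1)` pairs, could sum the values that arrive for each name. The class name `NameCountReducer` is hypothetical.

```java
import java.io.IOException;

import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Reducer;

// Hypothetical class name; a sketch, not the author's original code.
public class NameCountReducer extends Reducer<Text, IntWritable, Text, IntWritable> {

    // Reused output object holding the total for the current name.
    private final IntWritable total = new IntWritable();

    @Override
    protected void reduce(Text name, Iterable<IntWritable> counts, Context context)
            throws IOException, InterruptedException {
        // The framework groups all values that share the same key (name).
        int sum = 0;
        for (IntWritable count : counts) {
            sum += count.get();
        }
        total.set(sum);
        context.write(name, total); // emit (name, total occurrences)
    }
}
```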

Define a main class that describes the job and submits it:
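The main class is not shown in the post either. The sketch below is one way to describe and submit such a job. It assumes a mapper and a reducer named `NameCountMapper` and `NameCountReducer` (hypothetical names) exist in the same package with `Text`/`IntWritable` output types, and that the input and output paths are passed as the two program arguments.

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

// Hypothetical class name; a sketch, not the author's original code.
public class NameCountDriver {

    public static void main(String[] args) throws Exception {
        if (args.length != 2) {
            System.err.println("Usage: NameCountDriver <input path> <output path>");
            System.exit(2);
        }

        Configuration conf = new Configuration();
        Job job = Job.getInstance(conf, "name-count");
        job.setJarByClass(NameCountDriver.class);

        // Assumed mapper and reducer classes (hypothetical names).
        job.setMapperClass(NameCountMapper.class);
        job.setReducerClass(NameCountReducer.class);

        // Key/value types emitted by the mapper and by the reducer.
        job.setMapOutputKeyClass(Text.class);
        job.setMapOutputValueClass(IntWritable.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);

        // args[0]: input file or directory; args[1]: output directory (must not exist yet).
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));

        // Block until the job finishes and report success through the exit code.
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
```

When no cluster configuration is on the classpath, `mapreduce.framework.name` defaults to `local` and `fs.defaultFS` defaults to the local file system, so launching this class from an IDE with two local paths should run the job in a single JVM, which matches the local run this post is about.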


Reposted from blog.csdn.net/weixin_44036154/article/details/103053844