[Hadoop 6] The YARN-based Hadoop MapReduce Workflow

1. Order in which processes start and stop in a fully distributed Hadoop cluster

[hadoop@hadoop sbin]$ ./start-all.sh
This script is Deprecated. Instead use start-dfs.sh and start-yarn.sh
Starting namenodes on [hadoop.master]
hadoop.master: starting namenode, logging to /home/hadoop/software/hadoop-2.5.2/logs/hadoop-hadoop-namenode-hadoop.master.out
hadoop.slave1: starting datanode, logging to /home/hadoop/software/hadoop-2.5.2/logs/hadoop-hadoop-datanode-hadoop.slave1.out
hadoop.slave2: starting datanode, logging to /home/hadoop/software/hadoop-2.5.2/logs/hadoop-hadoop-datanode-hadoop.slave2.out
Starting secondary namenodes [hadoop.master]
hadoop.master: starting secondarynamenode, logging to /home/hadoop/software/hadoop-2.5.2/logs/hadoop-hadoop-secondarynamenode-hadoop.master.out
starting yarn daemons
starting resourcemanager, logging to /home/hadoop/software/hadoop-2.5.2/logs/yarn-hadoop-resourcemanager-hadoop.master.out
hadoop.slave2: starting nodemanager, logging to /home/hadoop/software/hadoop-2.5.2/logs/yarn-hadoop-nodemanager-hadoop.slave2.out
hadoop.slave1: starting nodemanager, logging to /home/hadoop/software/hadoop-2.5.2/logs/yarn-hadoop-nodemanager-hadoop.slave1.out
[hadoop@hadoop sbin]$ jps
1974 ResourceManager
2233 Jps
1645 NameNode
1830 SecondaryNameNode
[hadoop@hadoop sbin]$ ./stop-all.sh 
This script is Deprecated. Instead use stop-dfs.sh and stop-yarn.sh
Stopping namenodes on [hadoop.master]
hadoop.master: stopping namenode
hadoop.slave1: stopping datanode
hadoop.slave2: stopping datanode
Stopping secondary namenodes [hadoop.master]
hadoop.master: stopping secondarynamenode
stopping yarn daemons
stopping resourcemanager
hadoop.slave1: stopping nodemanager
hadoop.slave2: stopping nodemanager
no proxyserver to stop

Startup order:

  • Master starts the NameNode
  • Master starts the DataNode on each Slave in turn
  • Master starts the SecondaryNameNode
  • Master starts the YARN daemons
  • Master starts the ResourceManager
  • Master starts the NodeManager on each Slave in turn

After startup,

the Master has the NameNode, SecondaryNameNode, and ResourceManager processes,

and each Slave has a DataNode process and a NodeManager process.

 

Shutdown order:

  • Master stops the NameNode
  • Master stops the DataNode on each Slave in turn
  • Master stops the SecondaryNameNode
  • Master stops the YARN daemons
  • Master stops the ResourceManager
  • Master stops the NodeManager on each Slave in turn

As the output shows, the processes are stopped in exactly the same order in which they were started.

2. System components of YARN-based MapReduce

  • Client
  • ResourceManager
  • NodeManager
  • ApplicationMaster
  • Container
  • Scheduler

3. Responsibilities of each component

3.0 Client

  • Submits jobs to the cluster
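
A minimal sketch of such a client driver using the standard MapReduce Job API; the class name, job name, and paths are illustrative, and the application-specific mapper/reducer classes are left out:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class SubmitJobSketch {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        Job job = Job.getInstance(conf, "demo-job");      // job name is illustrative
        job.setJarByClass(SubmitJobSketch.class);         // this jar is shipped to HDFS for the tasks
        // job.setMapperClass(...); job.setReducerClass(...);   // application-specific classes
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));

        // waitForCompletion() submits the job to the ResourceManager and then polls its
        // progress until it finishes; Job#submit() would return immediately instead.
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}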

3.1 ResourceManager

  • There is only one per cluster
  • Allocates and schedules the cluster's compute resources
  • Handles job-submission requests from clients
  • Starts and monitors ApplicationMasters
  • Monitors NodeManagers
  • (Fault tolerance) It is a single point of failure; HA can be built on ZooKeeper (see the configuration sketch below)
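
A minimal sketch of the handful of properties that ZooKeeper-based RM HA involves, set programmatically on a YarnConfiguration purely for illustration (in a real cluster they live in yarn-site.xml); the cluster id, RM ids, hostnames, and ZooKeeper quorum below are all hypothetical:

import org.apache.hadoop.yarn.conf.YarnConfiguration;

public class RmHaConfigSketch {
    public static void main(String[] args) {
        YarnConfiguration conf = new YarnConfiguration();
        conf.setBoolean("yarn.resourcemanager.ha.enabled", true);
        conf.set("yarn.resourcemanager.cluster-id", "yarn-cluster");               // hypothetical cluster id
        conf.set("yarn.resourcemanager.ha.rm-ids", "rm1,rm2");                     // hypothetical RM ids
        conf.set("yarn.resourcemanager.hostname.rm1", "hadoop.master");            // hostnames are illustrative
        conf.set("yarn.resourcemanager.hostname.rm2", "hadoop.master2");
        conf.set("yarn.resourcemanager.zk-address", "zk1:2181,zk2:2181,zk3:2181"); // ZooKeeper quorum
        System.out.println("Configured RMs: " + conf.get("yarn.resourcemanager.ha.rm-ids"));
    }
}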

3.2 NodeManager

  • There are many per cluster; each manages the resources of a single node and their use (more precisely, it launches, monitors, and manages the Containers on that node and prevents an ApplicationMaster from using more resources than it was granted)
  • Manages resources and tasks on a single node (because the NodeManager launches and monitors Containers, and the ApplicationMaster and the tasks run inside Containers, the NodeManager effectively manages the compute resources they use)
  • Handles commands from the ResourceManager
  • Handles commands from the ApplicationMaster
  • (Fault tolerance) When a NodeManager fails, the RM reports the failed tasks to the corresponding AM, and the AM decides how to handle them

3.3 ApplicationMaster

  • There is one per application; it manages the application's entire life cycle
  • Splits the input data for the distributed computation
  • Requests compute resources from the ResourceManager on behalf of the application (in units of Containers; an application typically requests as many Containers as it has tasks) and assigns the Containers to tasks, which actually run inside them (see the request sketch below)
  • Monitors tasks and handles their failures
  • (Fault tolerance) If it fails, the ResourceManager restarts it; the ApplicationMaster itself must handle fault tolerance for its tasks. While running, it records which tasks have already completed, so after a restart those tasks do not need to be rerun
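
A minimal sketch of that Container request through the AMRMClient library; it only makes sense when run inside an actual ApplicationMaster container, and the host, rack, priority, and resource figures are illustrative:

import org.apache.hadoop.yarn.api.protocolrecords.AllocateResponse;
import org.apache.hadoop.yarn.api.records.Container;
import org.apache.hadoop.yarn.api.records.Priority;
import org.apache.hadoop.yarn.api.records.Resource;
import org.apache.hadoop.yarn.client.api.AMRMClient;
import org.apache.hadoop.yarn.client.api.AMRMClient.ContainerRequest;
import org.apache.hadoop.yarn.conf.YarnConfiguration;

public class AmResourceRequestSketch {
    public static void main(String[] args) throws Exception {
        AMRMClient<ContainerRequest> rmClient = AMRMClient.createAMRMClient();
        rmClient.init(new YarnConfiguration());
        rmClient.start();

        // Register this ApplicationMaster with the ResourceManager.
        rmClient.registerApplicationMaster("", 0, "");

        // Ask for one 1 GB / 1 vcore Container, preferring the node that holds the
        // input split (data locality); node and rack names are made up.
        Resource capability = Resource.newInstance(1024, 1);
        ContainerRequest request = new ContainerRequest(
                capability,
                new String[] {"hadoop.slave1"},   // preferred nodes
                new String[] {"/default-rack"},   // preferred racks
                Priority.newInstance(0));
        rmClient.addContainerRequest(request);

        // Resource requests piggy-back on the AM's heartbeat to the ResourceManager.
        AllocateResponse response = rmClient.allocate(0.0f);
        for (Container allocated : response.getAllocatedContainers()) {
            // A real AM would now launch a map or reduce task in each Container via NMClient.
            System.out.println("Allocated: " + allocated.getId());
        }

        rmClient.stop();
    }
}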

3.4 Container

  • An abstraction of the environment a task runs in
  • The resources the task runs with (node, memory, CPU)
  • The task launch command
  • The task's runtime environment
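
Through the YARN records API, a Container launch specification is literally those pieces. A minimal sketch, in which the memory figure, environment, and launch command are made up for illustration:

import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

import org.apache.hadoop.yarn.api.records.ContainerLaunchContext;
import org.apache.hadoop.yarn.api.records.LocalResource;
import org.apache.hadoop.yarn.api.records.Resource;

public class ContainerSpecSketch {
    public static void main(String[] args) {
        // Resources the task runs with (memory in MB, virtual cores).
        Resource capability = Resource.newInstance(1024, 1);

        // Runtime environment and launch command of the task process (illustrative values).
        Map<String, String> env = new HashMap<String, String>();
        env.put("CLASSPATH", "./*");
        List<String> commands = Arrays.asList(
                "$JAVA_HOME/bin/java -Xmx768m my.example.TaskMain 1>stdout 2>stderr");

        // Files localized from HDFS for the task, e.g. the job jar (left empty here).
        Map<String, LocalResource> localResources = Collections.emptyMap();

        ContainerLaunchContext ctx = ContainerLaunchContext.newInstance(
                localResources, env, commands, null, null, null);

        // An ApplicationMaster passes a context like `ctx` to NMClient#startContainer,
        // while `capability` is what it asked the ResourceManager for.
        System.out.println(ctx + " / " + capability);
    }
}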

4. Overall flow of YARN-based MapReduce

The figure below is the YARN-based MapReduce flow as described in Hadoop: The Definitive Guide, 3rd edition.

[Figure: how a MapReduce job runs on YARN (Hadoop: The Definitive Guide, 3rd edition)]

4.2 Detailed steps

    Before a job is run, the ResourceManager and the NodeManagers are already up, so in the figure above the ResourceManager and NodeManager processes do not need to be started.
  • 1. The client process submits the MapReduce job via runJob (in practice, jobs are usually submitted with waitForCompletion); in YARN, a job is generally called an application
  • 2. The client asks the ResourceManager for an application ID, which uniquely identifies this job
  • 3. The client copies the files the job needs (typically the job's own jar plus the third-party jars it depends on) to HDFS; in other words, YARN-based MR shares the program's jars through HDFS so that the task processes can read them
  • 4. The client then submits the application to the ResourceManager
  • 5a/5b. When the ResourceManager receives the client's submission request, it forwards it to the scheduling component (Scheduler). The Scheduler allocates a Container, the ResourceManager launches the ApplicationMaster process in that Container, and the NodeManager on that node manages the ApplicationMaster process
  • 6. The ApplicationMaster initializes the job (application). Initialization includes creating bookkeeping objects that track how the job is doing, i.e. that receive the progress and completion reports from the tasks. (Any computing framework that wants to run on the YARN resource-scheduling framework has to supply its own ApplicationMaster; for example, Spark and Storm each provide one in order to run on YARN)
  • 7. The ApplicationMaster splits the input data at the location specified in the job code (usually HDFS) to determine the number of map tasks. When deciding which node each map task should run on, Hadoop takes data locality into account: node-local first, then rack-local, and finally off-rack. It also determines the number of reduce tasks, which is specified explicitly in the job code via job.setNumReduceTasks
  • 8. The following points describe how the ApplicationMaster requests resources from the ResourceManager
  • 8.1 Based on the number of map tasks determined by the input splits and the number of reduce tasks, the ApplicationMaster requests compute resources from the ResourceManager (compute resources mainly mean memory and CPU; Hadoop YARN uses the Container as the unit of computation, i.e. resources are allocated in Containers, and each Container holds a certain amount of memory and a number of CPU cores)
  • 8.2 The ApplicationMaster requests resources by piggy-backing them on the heartbeat messages it sends to the ResourceManager. The requests also carry each task's data-locality information, so that when the ResourceManager allocates resources, the Containers assigned to the tasks satisfy data locality as far as possible
  • 8.3 The resource requests also carry the amount of memory needed. By default, each map task and each reduce task is allocated 1 GB of memory; this can be changed with the mapreduce.map.memory.mb and mapreduce.reduce.memory.mb properties (both of these and setNumReduceTasks are shown in the sketch after this list)
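
As noted in steps 7 and 8.3, both the number of reduce tasks and the per-task Container memory are set in the client-side job code. A minimal sketch with illustrative values (1024 MB is the default for both memory properties):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.mapreduce.Job;

public class TaskSizingSketch {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        conf.setInt("mapreduce.map.memory.mb", 2048);      // memory per map-task Container (step 8.3)
        conf.setInt("mapreduce.reduce.memory.mb", 4096);   // memory per reduce-task Container

        Job job = Job.getInstance(conf, "task-sizing-demo");   // job name is illustrative
        job.setNumReduceTasks(4);   // reduce-task count (step 7); map-task count follows from the input splits
    }
}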

[Hadoop: The Definitive Guide, Chapter 6]

The way memory is allocated is different from MapReduce 1, where tasktrackers have a fixed number of “slots,” set at cluster configuration time, and each task runs in a single slot. Slots have a maximum memory allowance, which again is fixed for a cluster, leading to both problems of underutilization when tasks use less memory (because other waiting tasks are not able to take advantage of the unused memory) and problems of job failure when a task can’t complete since it can’t get enough memory to run correctly and therefore can’t complete.

In YARN, resources are more fine-grained, so both of these problems can be avoided. In particular, applications may request a memory capability that is anywhere between the minimum allocation and a maximum allocation, and that must be a multiple of the minimum allocation. Default memory allocations are scheduler-specific, and for the capacity scheduler, the default minimum is 1024 MB (set by yarn.scheduler.capacity.minimum-allocation-mb) and the default maximum is 10240 MB (set by yarn.scheduler.capacity.maximum-allocation-mb). Thus, tasks can request any memory allocation between 1 and 10 GB (inclusive), in multiples of 1 GB (the scheduler will round up to the nearest multiple if needed), by setting mapreduce.map.memory.mb and mapreduce.reduce.memory.mb appropriately.
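
The rounding described above amounts to a couple of lines of arithmetic. A minimal sketch assuming the capacity scheduler defaults quoted in the passage (1024 MB minimum, 10240 MB maximum):

public class AllocationRounding {
    // Round a memory request up to the nearest multiple of the scheduler's minimum
    // allocation, then clamp it to the allowed range.
    static int normalize(int requestedMb, int minAllocMb, int maxAllocMb) {
        int rounded = ((requestedMb + minAllocMb - 1) / minAllocMb) * minAllocMb;
        return Math.min(Math.max(rounded, minAllocMb), maxAllocMb);
    }

    public static void main(String[] args) {
        System.out.println(normalize(1500, 1024, 10240));    // 2048: rounded up to the next 1 GB multiple
        System.out.println(normalize(12000, 1024, 10240));   // 10240: capped at the maximum allocation
    }
}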

  • 9a. Once the ApplicationMaster has obtained compute resources (allocated by the ResourceManager's Scheduler), it asks the NodeManagers to start Containers; these Containers are used to run the map and reduce tasks
  • 9b. Once a Container is started, it runs a map or reduce task. The task runs in a process called YarnChild; unlike MapReduce 1, on YARN every task runs in a freshly started JVM process
  • 10. Each task reads its share of the data from HDFS and carries out its computation (see the Mapper sketch below)
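
A minimal Mapper sketch of what a map task's YarnChild ends up executing: the framework reads the task's input split from HDFS and calls map() once per record (the word-count logic is just an example):

import java.io.IOException;
import java.util.StringTokenizer;

import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

public class SplitWordMapper extends Mapper<LongWritable, Text, Text, IntWritable> {
    private static final IntWritable ONE = new IntWritable(1);
    private final Text word = new Text();

    @Override
    protected void map(LongWritable offset, Text line, Context context)
            throws IOException, InterruptedException {
        StringTokenizer tokens = new StringTokenizer(line.toString());
        while (tokens.hasMoreTokens()) {
            word.set(tokens.nextToken());
            context.write(word, ONE);   // intermediate output later fetched by the reduce tasks
        }
    }
}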

5. Status updates (notification of task completion)

Reposted from bit1129.iteye.com/blog/2186238