Notes on the Yarn Source Code - Side Story - Yarn's State Machine

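The previous side story covered Yarn's dispatch mechanism. That large-scale dispatching works between classes: put simply, when a class meets something it cannot handle itself, it hands the matter to another class.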

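Many classes in Yarn hold a dispatcher, and most of them hold the global one. Whatever they cannot handle themselves, they simply throw to the dispatcher of the RM or the NM.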

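The global dispatcher acts as a coordinator, a thread that strings all the classes together. Many classes also handle events internally, and this article discusses that internal mechanism: the state machine.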

Disclaimer: some of the figures in this article come from Mr. Dong's blog and book, so here is a plug for his blog:

Dong's Blog (董的博客)

Now let's look at the state machine in earnest, starting with two types:

1:StateMachine

public interface StateMachine<STATE extends Enum<STATE>, EVENTTYPE extends Enum<EVENTTYPE>, EVENT> {
	public STATE getCurrentState();
	public STATE doTransition(EVENTTYPE eventType, EVENT event)
			throws InvalidStateTransitonException;
}

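This is the basic definition of a state machine: one interface with two methods. getCurrentState returns the machine's current state, and doTransition performs a state transition. How they are called is best analyzed in a concrete class.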
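To make the two-method contract concrete, here is a minimal sketch that is not Yarn code: a hand-written machine for a hypothetical door. `ToyMachine`, `Door`, `DoorState`, and `DoorEventType` are names invented for illustration, and the toy throws `IllegalStateException` where Yarn declares `InvalidStateTransitonException`.

```java
final class DoorMachineSketch {
    enum DoorState { OPEN, CLOSED }
    enum DoorEventType { PUSH, PULL }

    /** Hypothetical stand-in for Yarn's StateMachine interface. */
    interface ToyMachine<S extends Enum<S>, T extends Enum<T>, E> {
        S getCurrentState();
        S doTransition(T eventType, E event);
    }

    /** Hand-coded machine: PULL opens a closed door, PUSH closes an open one. */
    static final class Door implements ToyMachine<DoorState, DoorEventType, String> {
        private DoorState current = DoorState.CLOSED;

        @Override
        public DoorState getCurrentState() {
            return current;
        }

        @Override
        public synchronized DoorState doTransition(DoorEventType type, String event) {
            if (current == DoorState.CLOSED && type == DoorEventType.PULL) {
                current = DoorState.OPEN;
            } else if (current == DoorState.OPEN && type == DoorEventType.PUSH) {
                current = DoorState.CLOSED;
            } else {
                // No legal transition for this (state, event type) pair.
                throw new IllegalStateException("Invalid event " + type + " at " + current);
            }
            return current;
        }
    }

    public static void main(String[] args) {
        Door door = new Door();
        System.out.println(door.getCurrentState());
        System.out.println(door.doTransition(DoorEventType.PULL, "visitor arrives"));
    }
}
```

The hard-coded if/else chain is exactly what the factory discussed next replaces with a lookup table.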

2:StateMachineFactory

/**
 * @param <OPERAND>
 *            The object type on which this state machine operates.
 * @param <STATE>
 *            The state of the entity.
 * @param <EVENTTYPE>
 *            The external eventType to be handled.
 * @param <EVENT>
 *            The event object.
 *
 */
@Public
@Evolving
final public class StateMachineFactory<OPERAND, STATE extends Enum<STATE>, EVENTTYPE extends Enum<EVENTTYPE>, EVENT> 

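The Javadoc is straightforward: OPERAND is the entity the state machine operates on, STATE is the type of that entity's states, EVENTTYPE is the type of the events to be handled, and EVENT is the event object.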

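As the name suggests, the state machine factory is the class that produces state machines. We start with its fields; its methods are covered when we analyze the state machine of a concrete class: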

private final TransitionsListNode transitionsListNode; // linked list that records every registered transition
private Map<STATE, Map<EVENTTYPE, Transition<OPERAND, STATE, EVENTTYPE, EVENT>>> stateMachineTable; // explained in detail when we reach it
private STATE defaultInitialState; // initial state
private final boolean optimized; // explained when we reach it

Next, let's pick a state machine and analyze it.

3:The state machine in RMAppAttemptImpl

	private final StateMachine<RMAppAttemptState, RMAppAttemptEventType, RMAppAttemptEvent> stateMachine;

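The declaration of the state machine needs little comment.

Start from the StateMachineFactory. The code that builds this factory is very long, so look at its opening part first: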

private static final StateMachineFactory<RMAppAttemptImpl, RMAppAttemptState, 
	RMAppAttemptEventType, RMAppAttemptEvent> stateMachineFactory = 
	new StateMachineFactory<RMAppAttemptImpl, RMAppAttemptState, RMAppAttemptEventType, RMAppAttemptEvent>(
			RMAppAttemptState.NEW)

Read against the definition of StateMachineFactory, the type arguments of this factory mean the following:

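  1. RMAppAttemptImpl: the entity class this factory targets. Every operation triggered by a transition acts on this class.
  2. RMAppAttemptState: the type of the states that RMAppAttemptImpl can be in.
  3. RMAppAttemptEventType: the type of event that can trigger a state change. For this class it is RMAppAttemptEventType, and events of this type cause RMAppAttemptImpl to change state.
  4. RMAppAttemptEvent: the event object that triggers the transition.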

The factory's public constructor:

	public StateMachineFactory(STATE defaultInitialState) {
		this.transitionsListNode = null;
		this.defaultInitialState = defaultInitialState;
		this.optimized = false;
		this.stateMachineTable = null;
	}

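Comparing the two, the factory created here has RMAppAttemptState.NEW as its default initial state, and every later transition builds on that starting point. In other words, STATE is bound to RMAppAttemptState and defaultInitialState is RMAppAttemptState.NEW.

Next: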

addTransition(RMAppAttemptState.NEW, RMAppAttemptState.SUBMITTED,
					RMAppAttemptEventType.START, new AttemptStartedTransition())

This calls a method of StateMachineFactory:

/**
	 * @return a NEW StateMachineFactory just like {@code this} with the current
	 *         transition added as a new legal transition
	 *
	 *         Note that the returned StateMachineFactory is a distinct object.
	 *
	 *         This method is part of the API.
	 *
	 * @param preState
	 *            pre-transition state
	 * @param postState
	 *            post-transition state
	 * @param eventType
	 *            stimulus for the transition
	 * @param hook
	 *            transition hook
	 */
	public StateMachineFactory<OPERAND, STATE, EVENTTYPE, EVENT> addTransition(
			STATE preState, STATE postState, EVENTTYPE eventType,
			SingleArcTransition<OPERAND, EVENT> hook) {
		return new StateMachineFactory<OPERAND, STATE, EVENTTYPE, EVENT>(
				this,
				new ApplicableSingleOrMultipleTransition<OPERAND, STATE, EVENTTYPE, EVENT>(
						preState, eventType, new SingleInternalArc(postState,
								hook)));
	}

StateMachineFactory has several overloaded addTransition methods, and we will go through them one at a time. This one constructs a new factory and a new ApplicableSingleOrMultipleTransition, so let's take it apart.

First, the private constructor:

private StateMachineFactory(
			StateMachineFactory<OPERAND, STATE, EVENTTYPE, EVENT> that,
			ApplicableTransition<OPERAND, STATE, EVENTTYPE, EVENT> t) {
		this.defaultInitialState = that.defaultInitialState;
		this.transitionsListNode = new TransitionsListNode(t,
				that.transitionsListNode);
		this.optimized = false;
		this.stateMachineTable = null;
	}

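For the factory inside RMAppAttemptImpl, defaultInitialState is RMAppAttemptState.NEW, and the call to addTransition passes:

preState : RMAppAttemptState.NEW
postState : RMAppAttemptState.SUBMITTED
eventType : RMAppAttemptEventType.START

Keep these arguments in mind and look at the second argument that addTransition passes to the private constructor. Judging by its name it is a single-arc or multiple-arc transition. Its constructor: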

ApplicableSingleOrMultipleTransition(STATE preState,
				EVENTTYPE eventType,
				Transition<OPERAND, STATE, EVENTTYPE, EVENT> transition) {
			this.preState = preState;
			this.eventType = eventType;
			this.transition = transition;
		}

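The constructor is simple and explicit: three fields. preState is the state before the transition, eventType is the event type that triggers it, and transition is the arc that carries out the change.

The third argument, SingleInternalArc, deserves a closer look: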

private class SingleInternalArc implements
			Transition<OPERAND, STATE, EVENTTYPE, EVENT> {
		private STATE postState;
		private SingleArcTransition<OPERAND, EVENT> hook; // transition hook
		SingleInternalArc(STATE postState,
				SingleArcTransition<OPERAND, EVENT> hook) {
			this.postState = postState;
			this.hook = hook;
		}
	}

This constructor is plain field assignment, but it involves a hook, the transition hook:

/**
 * Hook for Transition. This lead to state machine to move to the post state as
 * registered in the state machine.
 */
@Public
@Evolving
public interface SingleArcTransition<OPERAND, EVENT> {
	/**
	 * Transition hook.
	 * 
	 * @param operand
	 *            the entity attached to the FSM, whose internal state may
	 *            change.
	 * @param event
	 *            causal event
	 */
	public void transition(OPERAND operand, EVENT event);
}

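The Javadoc is brief, but it tells us what the hook is for: when an event fires, the hook runs against the operand, and the operand's state machine then moves to the post state registered for that transition.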
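The split of responsibilities can be sketched as follows, under simplifying assumptions: `ToyHook`, `Arc`, `Attempt`, and `StartEvent` are hypothetical names, not Yarn classes. The hook only performs side effects on the operand, while the post state is stored next to it in the arc.

```java
final class HookSketch {
    enum State { NEW, SUBMITTED }

    /** Hypothetical operand: the object whose fields the hook may change. */
    static final class Attempt {
        long startTime;
    }

    /** Hypothetical event object carrying data for the hook. */
    static final class StartEvent {
        final long timestamp;
        StartEvent(long timestamp) { this.timestamp = timestamp; }
    }

    /** Simplified counterpart of SingleArcTransition: side effects only, no return value. */
    interface ToyHook<O, E> {
        void transition(O operand, E event);
    }

    /** Simplified counterpart of SingleInternalArc: a fixed post state plus an optional hook. */
    static final class Arc<O, E> {
        private final State postState;
        private final ToyHook<O, E> hook;

        Arc(State postState, ToyHook<O, E> hook) {
            this.postState = postState;
            this.hook = hook;
        }

        State fire(O operand, E event) {
            if (hook != null) {
                hook.transition(operand, event); // run the side effects first
            }
            return postState;                    // the arc, not the hook, names the next state
        }
    }

    public static void main(String[] args) {
        Arc<Attempt, StartEvent> arc = new Arc<Attempt, StartEvent>(
                State.SUBMITTED, (a, e) -> a.startTime = e.timestamp);
        Attempt attempt = new Attempt();
        System.out.println(arc.fire(attempt, new StartEvent(42L)) + " " + attempt.startTime);
    }
}
```

Because the hook returns nothing, a single-arc transition always ends in the same post state regardless of what the hook does.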

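From here we can work the type arguments back up the call chain.

For the SingleArcTransition<OPERAND, EVENT> used here, the actual OPERAND is RMAppAttemptImpl and the actual event object type is RMAppAttemptEvent, as the hook shows: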

private static final class AttemptStartedTransition extends BaseTransition {
		@Override
		public void transition(RMAppAttemptImpl appAttempt,
				RMAppAttemptEvent event) {
			boolean transferStateFromPreviousAttempt = false;
			if (event instanceof RMAppStartAttemptEvent) {
				transferStateFromPreviousAttempt = ((RMAppStartAttemptEvent) event)
						.getTransferStateFromPreviousAttempt();
			}
			appAttempt.startTime = System.currentTimeMillis();
			// Register with the ApplicationMasterService
			appAttempt.masterService
					.registerAppAttempt(appAttempt.applicationAttemptId);
			if (UserGroupInformation.isSecurityEnabled()) {
				appAttempt.clientTokenMasterKey = appAttempt.rmContext
						.getClientToAMTokenSecretManager().createMasterKey(
								appAttempt.applicationAttemptId);
			}
			// Add the applicationAttempt to the scheduler and inform the
			// scheduler
			// whether to transfer the state from previous attempt.
			appAttempt.eventHandler.handle(new AppAttemptAddedSchedulerEvent(
					appAttempt.applicationAttemptId,
					transferStateFromPreviousAttempt));
		}
	}

Starting from AttemptStartedTransition keeps things clear. We will not analyze its logic yet; note only that it extends BaseTransition:

private static class BaseTransition implements
			SingleArcTransition<RMAppAttemptImpl, RMAppAttemptEvent> {
		@Override
		public void transition(RMAppAttemptImpl appAttempt,
				RMAppAttemptEvent event) {
		}
	}

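BaseTransition in turn implements SingleArcTransition, and that is where we find the OPERAND and the event object type bound to the hook.

So for the ApplicableSingleOrMultipleTransition passed in above, the four type parameters are bound as follows:

OPERAND : RMAppAttemptImpl
STATE : RMAppAttemptState (preState is RMAppAttemptState.NEW)
EVENTTYPE : RMAppAttemptEventType (eventType is RMAppAttemptEventType.START)
EVENT : RMAppAttemptEvent

Good. Now go up one more level: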

new StateMachineFactory<OPERAND, STATE, EVENTTYPE, EVENT>(
				this,
				new ApplicableSingleOrMultipleTransition<OPERAND, STATE, EVENTTYPE, EVENT>(
						preState, eventType, new SingleInternalArc(postState,
								hook)))
// The private constructor invoked by the expression above:
private StateMachineFactory(
			StateMachineFactory<OPERAND, STATE, EVENTTYPE, EVENT> that,
			ApplicableTransition<OPERAND, STATE, EVENTTYPE, EVENT> t) {
		this.defaultInitialState = that.defaultInitialState;
		this.transitionsListNode = new TransitionsListNode(t,
				that.transitionsListNode);
		this.optimized = false;
		this.stateMachineTable = null;
	}

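Here is what the newly created factory looks like:

  1. The default initial state is unchanged.
  2. A new TransitionsListNode is created. The previous node was null, so look at the TransitionsListNode constructor: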
TransitionsListNode(
				ApplicableTransition<OPERAND, STATE, EVENTTYPE, EVENT> transition,
				TransitionsListNode next) {
			this.transition = transition;
			this.next = next;
		}

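In essence this builds a linked list by creating a new head node. For the call we are following, the new TransitionsListNode has transition set to the transition that was passed in and next set to null.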
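The copy-and-prepend behaviour can be sketched in a few lines. This is a simplified illustration, not the Yarn implementation: `Builder` and `Node` are hypothetical names, and transitions are reduced to string labels.

```java
final class TransitionListSketch {
    /** Immutable list node: newest registration at the head, older ones behind it. */
    static final class Node {
        final String transition;
        final Node next;

        Node(String transition, Node next) {
            this.transition = transition;
            this.next = next;
        }
    }

    /** Hypothetical builder mirroring the factory's copy-on-add behaviour. */
    static final class Builder {
        final Node head;

        Builder() {
            this.head = null;
        }

        private Builder(Builder that, String transition) {
            this.head = new Node(transition, that.head); // prepend and share the old tail
        }

        Builder add(String transition) {
            return new Builder(this, transition);        // the receiver is left untouched
        }

        /** Lists the stored transitions starting at the head, i.e. newest first. */
        String dump() {
            StringBuilder sb = new StringBuilder();
            for (Node cursor = head; cursor != null; cursor = cursor.next) {
                if (sb.length() > 0) {
                    sb.append(", ");
                }
                sb.append(cursor.transition);
            }
            return sb.toString();
        }
    }

    public static void main(String[] args) {
        Builder one = new Builder().add("t1");
        Builder two = one.add("t2");
        System.out.println(two.dump()); // t2, t1
        System.out.println(one.dump()); // t1
    }
}
```

Since nodes are never modified after creation, every intermediate builder stays valid and the newer list simply shares the older one as its tail.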

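The returned factory is therefore easy to describe. Nothing else changes; the head of its transitionsListNode is now the newly added transition.

What does this transition define? When the OPERAND (RMAppAttemptImpl) is in the STATE RMAppAttemptState.NEW and an external event arrives whose type (EVENTTYPE) is RMAppAttemptEventType.START, the hook runs and the state machine's state becomes RMAppAttemptState.SUBMITTED.

Note that this is a single-arc transition. In parameter order, the arguments are:

  1. The current state
  2. The state after the transition
  3. The event type that triggers the transition
  4. The operation performed along with the transition

Each call to addTransition returns a new factory that contains one more transition than the original. A transition specifies the preState, the postState, the triggering event type, and the operation performed on the object that owns the state machine.

addTransition has five overloads. Only this one is introduced for now so that we can finish the overall picture of how the state machine is used; the other four are covered later.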

.addTransition(RMAppAttemptState.KILLED, RMAppAttemptState.KILLED,
							RMAppAttemptEventType.CONTAINER_FINISHED, new ContainerFinishedAtFinalStateTransition())
					.installTopology();

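This is the last statement of the factory's construction. The analysis so far shows that each call to addTransition adds one transition, as defined above, to the transitionsListNode held by the factory. Each transition is added by head insertion, so it always becomes the new head of the list.

Finally, look at installTopology: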

public StateMachineFactory<OPERAND, STATE, EVENTTYPE, EVENT> installTopology() {
		return new StateMachineFactory<OPERAND, STATE, EVENTTYPE, EVENT>(this, true);
	}
// The private constructor invoked by installTopology():
private StateMachineFactory(StateMachineFactory<OPERAND, STATE, EVENTTYPE, EVENT> that, boolean optimized) {
		this.defaultInitialState = that.defaultInitialState;
		this.transitionsListNode = that.transitionsListNode;
		this.optimized = optimized;
		if (optimized) {
			makeStateMachineTable();
		} else {
			stateMachineTable = null;
		}
	}

Because optimized is passed as true here, makeStateMachineTable is executed:

private void makeStateMachineTable() {
		Stack<ApplicableTransition<OPERAND, STATE, EVENTTYPE, EVENT>> stack = new Stack<ApplicableTransition<OPERAND, STATE, EVENTTYPE, EVENT>>();

		Map<STATE, Map<EVENTTYPE, Transition<OPERAND, STATE, EVENTTYPE, EVENT>>> prototype = new HashMap<STATE, Map<EVENTTYPE, Transition<OPERAND, STATE, EVENTTYPE, EVENT>>>();

		prototype.put(defaultInitialState, null);

		// I use EnumMap here because it'll be faster and denser. I would
		// expect most of the states to have at least one transition.
		stateMachineTable = new EnumMap<STATE, Map<EVENTTYPE, Transition<OPERAND, STATE, EVENTTYPE, EVENT>>>(prototype);

		for (TransitionsListNode cursor = transitionsListNode; cursor != null; cursor = cursor.next) {
			stack.push(cursor.transition);
		}

		while (!stack.isEmpty()) {
			stack.pop().apply(this);
		}
	}

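Read this method carefully. It walks the list from the head and pushes every transition onto a stack. A stack is last-in, first-out, so popping it replays the transitions in the order they were registered, calling apply on each. Here is apply: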

@Override
		public void apply(StateMachineFactory<OPERAND, STATE, EVENTTYPE, EVENT> subject) {
			Map<EVENTTYPE, Transition<OPERAND, STATE, EVENTTYPE, EVENT>> transitionMap = subject.stateMachineTable
					.get(preState);
			if (transitionMap == null) {
				// I use HashMap here because I would expect most EVENTTYPE's to not
				// apply out of a particular state, so FSM sizes would be
				// quadratic if I use EnumMap's here as I do at the top level.
				transitionMap = new HashMap<EVENTTYPE, Transition<OPERAND, STATE, EVENTTYPE, EVENT>>();
				subject.stateMachineTable.put(preState, transitionMap);
			}
			transitionMap.put(eventType, transition);
		}

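Note that subject is the current factory itself. The list was built by head insertion, so its head is the transition added last, and the stack reverses that order again. The transitions are therefore applied in registration order: the first preState applied is the one registered first, RMAppAttemptState.NEW, and the transition out of RMAppAttemptState.KILLED, registered last, is applied last.

The first lookup returns null, because makeStateMachineTable seeded the table with prototype.put(defaultInitialState, null). apply then creates a new, empty HashMap and stores it in stateMachineTable under the preState. The last statement fills that map: the key is the event type, RMAppAttemptEventType.START, and the value is the corresponding transition.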
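The double reversal is easy to check in isolation. The following sketch assumes nothing about Yarn beyond the two loops shown in makeStateMachineTable; `ArrayDeque` stands in for both the hand-rolled linked list and `java.util.Stack`, and the string labels are placeholders for transitions.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

final class ReplayOrderSketch {
    /** Returns the order in which registered transitions would be applied. */
    static List<String> replayOrder(String... registered) {
        // Head insertion: afterwards the first element is the newest registration.
        Deque<String> headInserted = new ArrayDeque<>();
        for (String t : registered) {
            headInserted.addFirst(t);
        }
        // Walk from the head and push each entry onto a stack.
        Deque<String> stack = new ArrayDeque<>();
        for (String t : headInserted) {
            stack.push(t);
        }
        // Popping returns the last-pushed entry first, which is the oldest registration.
        List<String> applied = new ArrayList<>();
        while (!stack.isEmpty()) {
            applied.add(stack.pop());
        }
        return applied;
    }

    public static void main(String[] args) {
        System.out.println(replayOrder("NEW/START", "KILLED/CONTAINER_FINISHED"));
    }
}
```

Replaying in registration order matters because apply uses a plain put: if the same state and event type were registered twice, the later registration would overwrite the earlier one.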

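As this repeats, the result is a two-level structure:

  1. The key of stateMachineTable is a pre-transition state, that is, any state that has at least one outgoing transition.
  2. The value is a Map whose key is the event type that triggers the transition and whose value is the corresponding transition.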
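A minimal sketch of that two-level table, under stated assumptions: `State` and `EventType` are toy enums, the inner map stores the post state directly instead of a Transition object, and the NEW/KILL entry is invented purely to show two entries under one pre-state.

```java
import java.util.EnumMap;
import java.util.HashMap;
import java.util.Map;

final class TableSketch {
    enum State { NEW, SUBMITTED, KILLED }
    enum EventType { START, KILL, CONTAINER_FINISHED }

    /** Mirrors apply(): create the inner map lazily, then register the entry. */
    static void register(Map<State, Map<EventType, State>> table,
                         State pre, EventType type, State post) {
        Map<EventType, State> inner = table.get(pre);
        if (inner == null) {
            inner = new HashMap<>();   // sparse inner level: few event types per state
            table.put(pre, inner);
        }
        inner.put(type, post);
    }

    /** Builds the outer level as an EnumMap keyed by pre-state. */
    static Map<State, Map<EventType, State>> build() {
        Map<State, Map<EventType, State>> table = new EnumMap<>(State.class);
        register(table, State.NEW, EventType.START, State.SUBMITTED);
        register(table, State.NEW, EventType.KILL, State.KILLED); // invented for illustration
        register(table, State.KILLED, EventType.CONTAINER_FINISHED, State.KILLED);
        return table;
    }

    public static void main(String[] args) {
        Map<State, Map<EventType, State>> table = build();
        System.out.println(table.get(State.NEW).get(EventType.START));
        System.out.println(table.containsKey(State.SUBMITTED));
    }
}
```

A state with no registered outgoing transition, such as SUBMITTED in this toy, simply has no entry in the outer map.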

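This continues until every registered transition has been applied.

So far, though, we have not seen the state machine itself being created. Everything above concerns only the factory. The machine comes next:

4:The state machine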

		this.stateMachine = stateMachineFactory.make(this);

Naturally, we need to see what make does:

public StateMachine<STATE, EVENTTYPE, EVENT> make(OPERAND operand) {
		return new InternalStateMachine(operand, defaultInitialState);
	}
// The inner class instantiated by make():
private class InternalStateMachine implements StateMachine<STATE, EVENTTYPE, EVENT> {
		private final OPERAND operand;
		private STATE currentState;

		InternalStateMachine(OPERAND operand, STATE initialState) {
			this.operand = operand;
			this.currentState = initialState;
			if (!optimized) {
				maybeMakeStateMachineTable();
			}
		}

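We need to check whether optimized is false at this point. The public constructor does set it to false, but the factory in RMAppAttemptImpl ends its chain with installTopology(), which sets optimized to true and builds the table up front, so the branch is skipped there. For a factory that never called installTopology(), the following method runs: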

private synchronized void maybeMakeStateMachineTable() {
		if (stateMachineTable == null) {
			makeStateMachineTable();
		}
	}

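If the table is still null, this simply executes the makeStateMachineTable logic we walked through above. Either way, the returned InternalStateMachine has two fields:

  1. operand : the instance the state machine belongs to. Note that it is an instance, not a class.
  2. currentState : set to the initial state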
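The relationship between one shared factory and many per-object machines can be sketched as follows. This is a simplified illustration with hypothetical names (`Factory`, `Machine`, `Attempt`); the table is collapsed to a single "next state" map so that only the sharing is visible.

```java
import java.util.EnumMap;
import java.util.Map;

final class SharedFactorySketch {
    enum State { NEW, SUBMITTED, RUNNING }

    /** Hypothetical factory: one table shared by every machine it makes. */
    static final class Factory {
        private final Map<State, State> next = new EnumMap<>(State.class);
        private final State initial;

        Factory(State initial) {
            this.initial = initial;
            next.put(State.NEW, State.SUBMITTED);
            next.put(State.SUBMITTED, State.RUNNING);
        }

        Machine make(Object operand) {
            return new Machine(operand, initial);
        }

        /** Inner class: holds only per-instance data and reads the enclosing factory's table. */
        final class Machine {
            private final Object operand;
            private State current;

            Machine(Object operand, State initial) {
                this.operand = operand;
                this.current = initial;
            }

            State getCurrentState() {
                return current;
            }

            State advance() {
                State target = next.get(current);
                if (target != null) {
                    current = target; // no entry means the machine stays where it is
                }
                return current;
            }
        }
    }

    /** Hypothetical operand: the factory is static, the machine is one per object. */
    static final class Attempt {
        private static final Factory FACTORY = new Factory(State.NEW);
        final Factory.Machine machine = FACTORY.make(this);
    }

    public static void main(String[] args) {
        Attempt first = new Attempt();
        Attempt second = new Attempt();
        first.machine.advance();
        System.out.println(first.machine.getCurrentState() + " " + second.machine.getCurrentState());
    }
}
```

Keeping the table in the factory means the cost of building it is paid once per class, while each object carries only a reference to itself and its current state.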

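That looks like the end, but we still need to follow the calling logic to see how the state machine produces the effect we want.

The calling code lives inside the state machine: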

@Override
public synchronized STATE doTransition(EVENTTYPE eventType, EVENT event) throws InvalidStateTransitonException {
	currentState = StateMachineFactory.this.doTransition(operand, currentState, eventType, event);
	return currentState;
}

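As you can see, InternalStateMachine, an inner class of StateMachineFactory, delegates to the private doTransition method of the enclosing StateMachineFactory and stores the returned state as currentState.

Put plainly: the StateMachineFactory owns the transition table, a Map keyed by state whose values are Maps from event type to transition. Creating a state machine only creates an InternalStateMachine that holds the operand and its current state and shares that table. Each transition specifies, for a given starting state, an operation on the operand, and the state machine then replaces its current state with the state returned by that transition.

Let's look at an actual transition: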

/**
	 * Effect a transition due to the effecting stimulus.
	 * 
	 * @param state
	 *            current state
	 * @param eventType
	 *            trigger to initiate the transition
	 * @param cause
	 *            causal eventType context
	 * @return transitioned state
	 */
	private STATE doTransition(OPERAND operand, STATE oldState, EVENTTYPE eventType, EVENT event)
			throws InvalidStateTransitonException {
		// We can assume that stateMachineTable is non-null because we call
		// maybeMakeStateMachineTable() when we build an InnerStateMachine ,
		// and this code only gets called from inside a working InnerStateMachine .
		Map<EVENTTYPE, Transition<OPERAND, STATE, EVENTTYPE, EVENT>> transitionMap = stateMachineTable.get(oldState);
		if (transitionMap != null) {
			Transition<OPERAND, STATE, EVENTTYPE, EVENT> transition = transitionMap.get(eventType);
			if (transition != null) {
				return transition.doTransition(operand, oldState, event, eventType);
			}
		}
		throw new InvalidStateTransitonException(oldState, eventType);
	}

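The comments make the purpose clear: look up the transition for the current state and event type, perform it, and return the resulting state, which the caller assigns as the new current state. If no transition is registered for that combination, an InvalidStateTransitonException is thrown.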
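The lookup-or-reject logic can be reproduced in a small sketch. The names are assumptions for illustration: `InvalidTransition` is a toy replacement for InvalidStateTransitonException, and the table stores post states directly instead of Transition objects.

```java
import java.util.EnumMap;
import java.util.HashMap;
import java.util.Map;

final class LookupSketch {
    enum State { NEW, SUBMITTED }
    enum EventType { START, KILL }

    /** Toy replacement for Yarn's InvalidStateTransitonException. */
    static final class InvalidTransition extends RuntimeException {
        InvalidTransition(State state, EventType type) {
            super("No transition for " + type + " in state " + state);
        }
    }

    static final Map<State, Map<EventType, State>> TABLE = new EnumMap<>(State.class);
    static {
        Map<EventType, State> fromNew = new HashMap<>();
        fromNew.put(EventType.START, State.SUBMITTED);
        TABLE.put(State.NEW, fromNew);
    }

    /** Two lookups: state first, then event type; anything missing is rejected. */
    static State doTransition(State oldState, EventType type) {
        Map<EventType, State> inner = TABLE.get(oldState);
        if (inner != null) {
            State post = inner.get(type);
            if (post != null) {
                return post;
            }
        }
        throw new InvalidTransition(oldState, type);
    }

    public static void main(String[] args) {
        State current = State.NEW;
        current = doTransition(current, EventType.START);
        try {
            current = doTransition(current, EventType.START); // nothing registered for SUBMITTED
        } catch (InvalidTransition e) {
            System.out.println(e.getMessage());
        }
        System.out.println(current); // the failed call never reached the assignment
    }
}
```

Because the exception is thrown before the assignment completes, a rejected event leaves the current state as it was.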

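The state machine has to be understood at the source level. Only then does reading the rest of the source become comfortable. That is my firm conviction, because the Yarn source uses state machines in a great many places.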


Reposted from blog.csdn.net/u013384984/article/details/80327966