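Notes on the Yarn Source Code (Part 4)

This continues directly from Part 3.

The earlier parts were fairly easy going: they covered how the Yarn client submits a new Application to the ResourceManager (RM), which returns a unique ID.

This part covers how the RM side gets the Application's ApplicationMaster (AM) started.

I have tried to keep it accessible, though the material itself is not that easy.

Tracing the call chain, we find that the second submission of the Application is implemented by YARNRunner, as follows: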

@Override
	public JobStatus submitJob(JobID jobId, String jobSubmitDir, Credentials ts)
			throws IOException, InterruptedException {
		addHistoryToken(ts);
		// Construct necessary information to start the MR AM
		ApplicationSubmissionContext appContext = createApplicationSubmissionContext(
				conf, jobSubmitDir, ts);

		// Submit to ResourceManager
		try {
			ApplicationId applicationId = resMgrDelegate
					.submitApplication(appContext);

			ApplicationReport appMaster = resMgrDelegate
					.getApplicationReport(applicationId);
			String diagnostics = (appMaster == null ? "application report is null"
					: appMaster.getDiagnostics());
			if (appMaster == null
					|| appMaster.getYarnApplicationState() == YarnApplicationState.FAILED
					|| appMaster.getYarnApplicationState() == YarnApplicationState.KILLED) {
				throw new IOException("Failed to run job : " + diagnostics);
			}
			return clientCache.getClient(jobId).getJobStatus(jobId);
		} catch (YarnException e) {
			throw new IOException(e);
		}
	}

Next, look at the submitApplication method of resMgrDelegate. The actual implementation lives in YarnClientImpl:

@Override
	public ApplicationId submitApplication(
			ApplicationSubmissionContext appContext) throws YarnException,
			IOException {
		ApplicationId applicationId = appContext.getApplicationId();
		if (applicationId == null) {
			throw new ApplicationIdNotProvidedException(
					"ApplicationId is not provided in ApplicationSubmissionContext");
		}
		SubmitApplicationRequest request = Records
				.newRecord(SubmitApplicationRequest.class);
		request.setApplicationSubmissionContext(appContext);

		// Automatically add the timeline DT into the CLC
		// Only when the security and the timeline service are both enabled
		if (isSecurityEnabled() && timelineServiceEnabled) {
			addTimelineDelegationToken(appContext.getAMContainerSpec());
		}

		// TODO: YARN-1763:Handle RM failovers during the submitApplication
		// call.
		rmClient.submitApplication(request);

		int pollCount = 0;
		long startTime = System.currentTimeMillis();

		while (true) {
			try {
				YarnApplicationState state = getApplicationReport(applicationId)
						.getYarnApplicationState();
				if (!state.equals(YarnApplicationState.NEW)
						&& !state.equals(YarnApplicationState.NEW_SAVING)) {
					LOG.info("Submitted application " + applicationId);
					break;
				}

				long elapsedMillis = System.currentTimeMillis() - startTime;
				if (enforceAsyncAPITimeout()
						&& elapsedMillis >= asyncApiPollTimeoutMillis) {
					throw new YarnException(
							"Timed out while waiting for application "
									+ applicationId
									+ " to be submitted successfully");
				}

				// Notify the client through the log every 10 poll, in case the
				// client
				// is blocked here too long.
				if (++pollCount % 10 == 0) {
					LOG.info("Application submission is not finished, "
							+ "submitted application " + applicationId
							+ " is still in " + state);
				}
				try {
					Thread.sleep(submitPollIntervalMillis);
				} catch (InterruptedException ie) {
					LOG.error("Interrupted while waiting for application "
							+ applicationId + " to be successfully submitted.");
				}
			} catch (ApplicationNotFoundException ex) {
				// FailOver or RM restart happens before RMStateStore saves
				// ApplicationState
				LOG.info("Re-submit application " + applicationId + "with the "
						+ "same ApplicationSubmissionContext");
				rmClient.submitApplication(request);
			}
		}

		return applicationId;
	}
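
The wait loop above boils down to "poll the application state until it has left NEW and NEW_SAVING, or give up after a timeout". Here is a minimal sketch of that idea. It is not the YarnClientImpl code: the class SubmitPollSketch, its AppState enum, and the convention that a negative timeout disables the check are all assumptions made for illustration.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.function.Supplier;

/** Hypothetical sketch of a client-side wait loop; not the real YarnClientImpl. */
class SubmitPollSketch {
    enum AppState { NEW, NEW_SAVING, SUBMITTED }

    /**
     * Polls stateSource until the reported state is past NEW and NEW_SAVING.
     * Returns that state, or throws IllegalStateException on timeout.
     * Assumption of this sketch: a negative timeoutMillis disables the timeout.
     */
    static AppState waitUntilSubmitted(Supplier<AppState> stateSource,
                                       long pollIntervalMillis,
                                       long timeoutMillis) {
        long start = System.currentTimeMillis();
        while (true) {
            AppState state = stateSource.get();
            if (state != AppState.NEW && state != AppState.NEW_SAVING) {
                return state;
            }
            long elapsed = System.currentTimeMillis() - start;
            if (timeoutMillis >= 0 && elapsed >= timeoutMillis) {
                throw new IllegalStateException("Timed out while still in " + state);
            }
            try {
                Thread.sleep(pollIntervalMillis);
            } catch (InterruptedException ie) {
                // Restore the interrupt flag and stop waiting.
                Thread.currentThread().interrupt();
                throw new IllegalStateException("Interrupted while waiting", ie);
            }
        }
    }

    /** Fake RM that reports NEW, then NEW_SAVING, then SUBMITTED. */
    static AppState demo() {
        Iterator<AppState> reports = Arrays.asList(
                AppState.NEW, AppState.NEW_SAVING, AppState.SUBMITTED).iterator();
        return waitUntilSubmitted(reports::next, 1, 1000);
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

Unlike the quoted loop, this sketch stops waiting when the thread is interrupted instead of only logging the interruption. That is a design choice of the sketch, not a description of Hadoop's behaviour.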

Note this line in particular:

		rmClient.submitApplication(request);

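request is the object that carries our application's overall parameters together with its launch script, and this is what we submit to the RM.

Now look at the RM-side implementation (in ClientRMService):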

			// call RMAppManager to submit application directly
			rmAppManager.submitApplication(submissionContext,
					System.currentTimeMillis(), user);

			LOG.info("Application with id " + applicationId.getId()
					+ " submitted by user " + user);
			RMAuditLogger.logSuccess(user, AuditConstants.SUBMIT_APP_REQUEST,
					"ClientRMService", applicationId);
		

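The important code is here: rmAppManager submits the application. Note that this is RM-side logic, running inside the RM's handler for the client's RPC. Nothing is sent to a NodeManager at this point. The request that makes a NodeManager start the corresponding container is issued only later in the flow. Let's step into it: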

@SuppressWarnings("unchecked")
	protected void submitApplication(
			ApplicationSubmissionContext submissionContext, long submitTime,
			String user) throws YarnException {
		ApplicationId applicationId = submissionContext.getApplicationId();

		RMAppImpl application = createAndPopulateNewRMApp(submissionContext,
				submitTime, user, false);
		ApplicationId appId = submissionContext.getApplicationId();

		if (UserGroupInformation.isSecurityEnabled()) {
			try {
				this.rmContext.getDelegationTokenRenewer().addApplicationAsync(
						appId, parseCredentials(submissionContext),
						submissionContext.getCancelTokensWhenComplete(),
						application.getUser());
			} catch (Exception e) {
				LOG.warn("Unable to parse credentials.", e);
				// Sending APP_REJECTED is fine, since we assume that the
				// RMApp is in NEW state and thus we haven't yet informed the
				// scheduler about the existence of the application
				assert application.getState() == RMAppState.NEW;
				this.rmContext
						.getDispatcher()
						.getEventHandler()
						.handle(new RMAppRejectedEvent(applicationId, e
								.getMessage()));
				throw RPCUtil.getRemoteException(e);
			}
		} else {
			// Dispatcher is not yet started at this time, so these START events
			// enqueued should be guaranteed to be first processed when
			// dispatcher
			// gets started.
			this.rmContext
					.getDispatcher()
					.getEventHandler()
					.handle(new RMAppEvent(applicationId, RMAppEventType.START));
		}
	}

Let's read this code carefully, starting with the createAndPopulateNewRMApp method. I won't go through all of it, just this part:

RMAppImpl application = new RMAppImpl(applicationId, rmContext,
				this.conf, submissionContext.getApplicationName(), user,
				submissionContext.getQueue(), submissionContext,
				this.scheduler, this.masterService, submitTime,
				submissionContext.getApplicationType(),
				submissionContext.getApplicationTags(), amReq);

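Here a new RMAppImpl is created. Looking closely at this class, it wraps a state machine. This is a major improvement in Yarn's design: each component acts as its state changes. For background, see the State pattern from design patterns, which works on the same principle.

Here is what the state machine looks like: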

/**
 * State machine topology. This object is semantically immutable. If you have a
 * StateMachineFactory there's no operation in the API that changes its semantic
 * properties.
 *
 * @param <OPERAND>
 *            The object type on which this state machine operates.
 * @param <STATE>
 *            The state of the entity.
 * @param <EVENTTYPE>
 *            The external eventType to be handled.
 * @param <EVENT>
 *            The event object.
 *
 */
@Public
@Evolving
final public class StateMachineFactory<OPERAND, STATE extends Enum<STATE>, EVENTTYPE extends Enum<EVENTTYPE>, EVENT> {

The comments are detailed enough, so I'll move on to the addTransition method that we use:

/**
	 * @return a NEW StateMachineFactory just like {@code this} with the current
	 *         transition added as a new legal transition. This overload has no
	 *         hook object.
	 *
	 *         Note that the returned StateMachineFactory is a distinct object.
	 *
	 *         This method is part of the API.
	 *
	 * @param preState
	 *            pre-transition state
	 * @param postState
	 *            post-transition state
	 * @param eventType
	 *            stimulus for the transition
	 */
	public StateMachineFactory<OPERAND, STATE, EVENTTYPE, EVENT> addTransition(
			STATE preState, STATE postState, EVENTTYPE eventType) {
		return addTransition(preState, postState, eventType, null);
	}

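Each call specifies the current state, the triggering event, and the state after the event fires. The actual handling logic is implemented by other classes.

Back to the construction of RMAppImpl: an rmContext is passed in. As I mentioned in my posts on RM initialization and service startup, this is the RM's steward and wraps many related services. Passing it in here is what ties RMAppImpl to the RM.

One more note: this series is based on Hadoop 2.6.5, whereas the ResourceManager series was based on Hadoop 2.2.0. The code differs in places, but the basic principles are the same.

One of those differences shows up here: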
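
To make the "pre-state, event type, post-state" idea concrete, here is a minimal sketch of a transition table. It is not Hadoop's StateMachineFactory: the class TransitionTableSketch and its enums are hypothetical, and unlike the real addTransition, which returns a new factory object, this sketch mutates itself and returns this to keep the code short.

```java
import java.util.EnumMap;
import java.util.Map;

/** Hypothetical transition table; a simplification, not StateMachineFactory. */
class TransitionTableSketch {
    enum State { NEW, NEW_SAVING, SUBMITTED }
    enum EventType { START, APP_NEW_SAVED }

    // preState -> (eventType -> postState)
    private final Map<State, Map<EventType, State>> table = new EnumMap<>(State.class);

    /** Registers one legal transition and returns this for chaining. */
    TransitionTableSketch addTransition(State pre, State post, EventType type) {
        table.computeIfAbsent(pre, s -> new EnumMap<>(EventType.class)).put(type, post);
        return this;
    }

    /** Looks up the post state, rejecting events that are not legal in the current state. */
    State next(State current, EventType type) {
        Map<EventType, State> arcs = table.get(current);
        State post = (arcs == null) ? null : arcs.get(type);
        if (post == null) {
            throw new IllegalStateException("Invalid event " + type + " at " + current);
        }
        return post;
    }

    /** Builds a tiny table with two illustrative transitions. */
    static TransitionTableSketch appTable() {
        return new TransitionTableSketch()
                .addTransition(State.NEW, State.NEW_SAVING, EventType.START)
                .addTransition(State.NEW_SAVING, State.SUBMITTED, EventType.APP_NEW_SAVED);
    }

    public static void main(String[] args) {
        TransitionTableSketch t = appTable();
        State s = t.next(State.NEW, EventType.START);
        System.out.println(s);
        System.out.println(t.next(s, EventType.APP_NEW_SAVED));
    }
}
```

An event that has no entry for the current state is rejected, which plays the role of InvalidStateTransitonException in the real code.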

		createAndInitActiveServices();

In the 2.2.0 code, RM startup has no such call and the code is more scattered. Here, most of the rmContext initialization is consolidated into this method:

/**
	 * Helper method to create and init {@link #activeServices}. This creates an
	 * instance of {@link RMActiveServices} and initializes it.
	 * 
	 * @throws Exception
	 */
	protected void createAndInitActiveServices() throws Exception {
		activeServices = new RMActiveServices(this);
		activeServices.init(conf);
	}

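As you can see, it creates an RMActiveServices and initializes it. Despite what the name might suggest, this class is not about live ApplicationMasters: it groups the services that the RM runs while it is in the active state (under HA, only the active RM runs them). I won't walk through its initialization code; it is pasted below: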

			activeServiceContext = new RMActiveServiceContext();
			rmContext.setActiveServiceContext(activeServiceContext);
			conf.setBoolean(Dispatcher.DISPATCHER_EXIT_ON_ERROR_KEY, true);
			rmSecretManagerService = createRMSecretManagerService();
			addService(rmSecretManagerService);
			containerAllocationExpirer = new ContainerAllocationExpirer(rmDispatcher);
			addService(containerAllocationExpirer);
			rmContext.setContainerAllocationExpirer(containerAllocationExpirer);
			AMLivelinessMonitor amLivelinessMonitor = createAMLivelinessMonitor();
			addService(amLivelinessMonitor);
			rmContext.setAMLivelinessMonitor(amLivelinessMonitor);
			AMLivelinessMonitor amFinishingMonitor = createAMLivelinessMonitor();
			addService(amFinishingMonitor);
			rmContext.setAMFinishingMonitor(amFinishingMonitor);
			RMNodeLabelsManager nlm = createNodeLabelManager();
			nlm.setRMContext(rmContext);
			addService(nlm);
			rmContext.setNodeLabelManager(nlm);
			boolean isRecoveryEnabled = conf.getBoolean(YarnConfiguration.RECOVERY_ENABLED,
					YarnConfiguration.DEFAULT_RM_RECOVERY_ENABLED);

			RMStateStore rmStore = null;
			if (isRecoveryEnabled) {
				recoveryEnabled = true;
				rmStore = RMStateStoreFactory.getStore(conf);
				boolean isWorkPreservingRecoveryEnabled = conf.getBoolean(
						YarnConfiguration.RM_WORK_PRESERVING_RECOVERY_ENABLED,
						YarnConfiguration.DEFAULT_RM_WORK_PRESERVING_RECOVERY_ENABLED);
				rmContext.setWorkPreservingRecoveryEnabled(isWorkPreservingRecoveryEnabled);
			} else {
				recoveryEnabled = false;
				rmStore = new NullRMStateStore();
			}

			try {
				rmStore.init(conf);
				rmStore.setRMDispatcher(rmDispatcher);
				rmStore.setResourceManager(rm);
			} catch (Exception e) {
				// the Exception from stateStore.init() needs to be handled for
				// HA and we need to give up master status if we got fenced
				LOG.error("Failed to init state store", e);
				throw e;
			}
			rmContext.setStateStore(rmStore);

			if (UserGroupInformation.isSecurityEnabled()) {
				delegationTokenRenewer = createDelegationTokenRenewer();
				rmContext.setDelegationTokenRenewer(delegationTokenRenewer);
			}

			// Register event handler for NodesListManager
			nodesListManager = new NodesListManager(rmContext);
			rmDispatcher.register(NodesListManagerEventType.class, nodesListManager);
			addService(nodesListManager);
			rmContext.setNodesListManager(nodesListManager);

			// Initialize the scheduler
			scheduler = createScheduler();
			scheduler.setRMContext(rmContext);
			addIfService(scheduler);
			rmContext.setScheduler(scheduler);

			schedulerDispatcher = createSchedulerEventDispatcher();
			addIfService(schedulerDispatcher);
			rmDispatcher.register(SchedulerEventType.class, schedulerDispatcher);

			// Register event handler for RmAppEvents
			rmDispatcher.register(RMAppEventType.class, new ApplicationEventDispatcher(rmContext));

			// Register event handler for RmAppAttemptEvents
			rmDispatcher.register(RMAppAttemptEventType.class, new ApplicationAttemptEventDispatcher(rmContext));

			// Register event handler for RmNodes
			rmDispatcher.register(RMNodeEventType.class, new NodeEventDispatcher(rmContext));

			nmLivelinessMonitor = createNMLivelinessMonitor();
			addService(nmLivelinessMonitor);

			resourceTracker = createResourceTrackerService();
			addService(resourceTracker);
			rmContext.setResourceTrackerService(resourceTracker);

			DefaultMetricsSystem.initialize("ResourceManager");
			JvmMetrics.initSingleton("ResourceManager", null);

			// Initialize the Reservation system
			if (conf.getBoolean(YarnConfiguration.RM_RESERVATION_SYSTEM_ENABLE,
					YarnConfiguration.DEFAULT_RM_RESERVATION_SYSTEM_ENABLE)) {
				reservationSystem = createReservationSystem();
				if (reservationSystem != null) {
					reservationSystem.setRMContext(rmContext);
					addIfService(reservationSystem);
					rmContext.setReservationSystem(reservationSystem);
					LOG.info("Initialized Reservation system");
				}
			}

			// creating monitors that handle preemption
			createPolicyMonitors();

			masterService = createApplicationMasterService();
			addService(masterService);
			rmContext.setApplicationMasterService(masterService);

			applicationACLsManager = new ApplicationACLsManager(conf);

			queueACLsManager = createQueueACLsManager(scheduler, conf);

			rmAppManager = createRMAppManager();
			// Register event handler for RMAppManagerEvents
			rmDispatcher.register(RMAppManagerEventType.class, rmAppManager);

			clientRM = createClientRMService();
			addService(clientRM);
			rmContext.setClientRMService(clientRM);

			applicationMasterLauncher = createAMLauncher();
			rmDispatcher.register(AMLauncherEventType.class, applicationMasterLauncher);

			addService(applicationMasterLauncher);
			if (UserGroupInformation.isSecurityEnabled()) {
				addService(delegationTokenRenewer);
				delegationTokenRenewer.setRMContext(rmContext);
			}

			new RMNMInfo(rmContext, scheduler);

			super.serviceInit(conf);
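
The pattern in the code above is "register child services with addService, then let the parent initialise them". Here is a minimal sketch of that composite idea. The types CompositeInitSketch, Service and Composite are hypothetical and far simpler than Hadoop's service classes; the service names in the demo are just labels.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical composite-service sketch; not Hadoop's CompositeService. */
class CompositeInitSketch {
    /** Minimal service contract assumed for this sketch. */
    interface Service {
        void init(List<String> log);
    }

    /** A service that owns children and initialises them in registration order. */
    static class Composite implements Service {
        private final List<Service> children = new ArrayList<>();

        void addService(Service s) {
            children.add(s);
        }

        @Override
        public void init(List<String> log) {
            for (Service s : children) {
                s.init(log);
            }
        }
    }

    /** Registers three labelled children and returns the order they were initialised in. */
    static List<String> demo() {
        List<String> log = new ArrayList<>();
        Composite active = new Composite();
        active.addService(l -> l.add("scheduler"));
        active.addService(l -> l.add("clientRM"));
        active.addService(l -> l.add("amLauncher"));
        active.init(log);
        return log;
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

The point of the sketch is only the ordering: nothing is initialised while services are being registered, and a single init call on the parent walks the children in the order they were added.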
		

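It is long, but you can see that the basic logic has not changed. The code has simply been gathered in one place, which is a refactoring improvement rather than a functional change.

Back to the asynchronous submission above (addApplicationAsync). It leads to the run method of DelegationTokenRenewerRunnable inside DelegationTokenRenewer, which executes asynchronously: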

			if (evt instanceof DelegationTokenRenewerAppSubmitEvent) {
				DelegationTokenRenewerAppSubmitEvent appSubmitEvt = (DelegationTokenRenewerAppSubmitEvent) evt;
				handleDTRenewerAppSubmitEvent(appSubmitEvt);
			} else if (evt.getType().equals(DelegationTokenRenewerEventType.FINISH_APPLICATION)) {
				DelegationTokenRenewer.this.handleAppFinishEvent(evt);
			}
		

Clearly, what we submitted is the app-submit kind of event, so continue into the handleDTRenewerAppSubmitEvent method:

				// Setup tokens for renewal
				DelegationTokenRenewer.this.handleAppSubmitEvent(event);
				rmContext.getDispatcher().getEventHandler()
						.handle(new RMAppEvent(event.getApplicationId(), RMAppEventType.START));
			

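That is the main logic. I'll skip the first call, which mainly sets up the relevant tokens for renewal. The second is clear enough: it dispatches an event. The event is wrapped in a class called RMAppEvent. That class has no comments, but it essentially carries application-related event types, so we can go straight to RMAppEventType: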

public enum RMAppEventType {
	// Source: ClientRMService
	START, RECOVER, KILL, MOVE, // Move app to a new queue

	// Source: Scheduler and RMAppManager
	APP_REJECTED,

	// Source: Scheduler
	APP_ACCEPTED,

	// Source: RMAppAttempt
	ATTEMPT_REGISTERED, ATTEMPT_UNREGISTERED, ATTEMPT_FINISHED, // Will send the final state
	ATTEMPT_FAILED, ATTEMPT_KILLED, NODE_UPDATE,

	// Source: Container and ResourceTracker
	APP_RUNNING_ON_NODE,

	// Source: RMStateStore
	APP_NEW_SAVED, APP_UPDATE_SAVED,
}

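This enum defines the event types (not states). The comments note the source of each group of events, that is, which component emits them. So here an RMAppEventType.START event is handed to the RM-wide dispatcher (rmDispatcher), which puts it on its own queue to await handling by another class.

Who is it dispatched to? That takes us back to the initialization code reached from createAndInitActiveServices in the RM, where we find that this event is handled by ApplicationEventDispatcher. Continuing with the code: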

			rmDispatcher.register(RMAppEventType.class, new ApplicationEventDispatcher(rmContext));

@Override
		public void handle(RMAppEvent event) {
			ApplicationId appID = event.getApplicationId();
			RMApp rmApp = this.rmContext.getRMApps().get(appID);
			if (rmApp != null) {
				try {
					rmApp.handle(event);
				} catch (Throwable t) {
					LOG.error("Error in handling event type " + event.getType() + " for application " + appID, t);
				}
			}
		}
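
The routing rule described above is "the class of the event type decides which handler gets the event". Here is a minimal sketch of that rule. It is not Hadoop's dispatcher: DispatcherSketch and its enums are hypothetical, and the sketch drains its queue synchronously on the calling thread rather than on a separate dispatcher thread.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Queue;
import java.util.function.Consumer;

/** Hypothetical single-threaded dispatcher sketch. */
class DispatcherSketch {
    enum AppEventType { START, KILL }
    enum NodeEventType { STARTED }

    // event-type enum class -> handler for every event of that class
    private final Map<Class<?>, Consumer<Enum<?>>> handlers = new HashMap<>();
    private final Queue<Enum<?>> queue = new ArrayDeque<>();

    void register(Class<? extends Enum<?>> eventTypeClass, Consumer<Enum<?>> handler) {
        handlers.put(eventTypeClass, handler);
    }

    /** Enqueues only; nothing is handled until drain() runs. */
    void enqueue(Enum<?> eventType) {
        queue.add(eventType);
    }

    /** Delivers queued events in FIFO order to the handler registered for their enum class. */
    void drain() {
        Enum<?> type;
        while ((type = queue.poll()) != null) {
            Consumer<Enum<?>> handler = handlers.get(type.getDeclaringClass());
            if (handler == null) {
                throw new IllegalStateException("No handler registered for " + type);
            }
            handler.accept(type);
        }
    }

    /** Registers two handlers, enqueues three events, and returns the delivery trace. */
    static List<String> demo() {
        List<String> trace = new ArrayList<>();
        DispatcherSketch dispatcher = new DispatcherSketch();
        dispatcher.register(AppEventType.class, e -> trace.add("app:" + e));
        dispatcher.register(NodeEventType.class, e -> trace.add("node:" + e));
        dispatcher.enqueue(AppEventType.START);
        dispatcher.enqueue(NodeEventType.STARTED);
        dispatcher.enqueue(AppEventType.KILL);
        dispatcher.drain();
        return trace;
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```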

Finally the event is handed to RMApp, whose implementation class is RMAppImpl. Look at its handle method:

@Override
	public void handle(RMAppEvent event) {

		this.writeLock.lock();

		try {
			ApplicationId appID = event.getApplicationId();
			LOG.debug("Processing event for " + appID + " of type " + event.getType());
			final RMAppState oldState = getState();
			try {
				/* keep the master in sync with the state machine */
				this.stateMachine.doTransition(event.getType(), event);
			} catch (InvalidStateTransitonException e) {
				LOG.error("Can't handle this event at current state", e);
				/* TODO fail the application on the failed transition */
			}

			if (oldState != getState()) {
				LOG.info(appID + " State change from " + oldState + " to " + getState());
			}
		} finally {
			this.writeLock.unlock();
		}
	}

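As you can see, the method does nothing but drive a state-machine transition. To understand it fully one would start from how the state machine is initialized, which I won't go into here.

(The original post showed a state-transition diagram at this point; it is not reproduced.) The diagram was only meant to convey the general idea of how the state machine works internally. The state machine inside RMAppImpl starts in RMAppState.NEW. The RMAppEventType.START event handled here moves it to RMAppState.NEW_SAVING, and the hook for that transition is defined in the RMAppNewlySavingTransition class. It is a single-arc transition, registered as follows: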

addTransition(RMAppState.NEW, RMAppState.NEW_SAVING, RMAppEventType.START,
							new RMAppNewlySavingTransition())

Here is the implementation of that class:

@Override
		public void transition(RMAppImpl app, RMAppEvent event) {

			// If recovery is enabled then store the application information in a
			// non-blocking call so make sure that RM has stored the information
			// needed to restart the AM after RM restart without further client
			// communication
			LOG.info("Storing application with id " + app.applicationId);
			app.rmContext.getStateStore().storeNewApplication(app);
		}

To spell out the logic: the state machine inside RMAppImpl calls doTransition, and the concrete state machine implementation is created like this:

this.stateMachine = stateMachineFactory.make(this);

public StateMachine<STATE, EVENTTYPE, EVENT> make(OPERAND operand) {
		return new InternalStateMachine(operand, defaultInitialState);
	}

InternalStateMachine then calls doTransition:

@Override
		public synchronized STATE doTransition(EVENTTYPE eventType, EVENT event) throws InvalidStateTransitonException {
			currentState = StateMachineFactory.this.doTransition(operand, currentState, eventType, event);
			return currentState;
		}

private STATE doTransition(OPERAND operand, STATE oldState, EVENTTYPE eventType, EVENT event)
			throws InvalidStateTransitonException {
		// We can assume that stateMachineTable is non-null because we call
		// maybeMakeStateMachineTable() when we build an InnerStateMachine ,
		// and this code only gets called from inside a working InnerStateMachine .
		Map<EVENTTYPE, Transition<OPERAND, STATE, EVENTTYPE, EVENT>> transitionMap = stateMachineTable.get(oldState);
		if (transitionMap != null) {
			Transition<OPERAND, STATE, EVENTTYPE, EVENT> transition = transitionMap.get(eventType);
			if (transition != null) {
				return transition.doTransition(operand, oldState, event, eventType);
			}
		}
		throw new InvalidStateTransitonException(oldState, eventType);
	}

private class SingleInternalArc implements Transition<OPERAND, STATE, EVENTTYPE, EVENT> {

		private STATE postState;
		private SingleArcTransition<OPERAND, EVENT> hook; // transition hook

		SingleInternalArc(STATE postState, SingleArcTransition<OPERAND, EVENT> hook) {
			this.postState = postState;
			this.hook = hook;
		}

		@Override
		public STATE doTransition(OPERAND operand, STATE oldState, EVENT event, EVENTTYPE eventType) {
			if (hook != null) {
				hook.transition(operand, event);
			}
			return postState;
		}
	}
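
The last class above shows the shape of a single-arc transition: a fixed post state plus an optional hook that receives the operand and the event. Here is a minimal sketch of that shape. HookArcSketch, App and Arc are hypothetical names, java.util.function.BiConsumer stands in for the hook interface, and a plain String stands in for the event object.

```java
import java.util.function.BiConsumer;

/** Hypothetical sketch of a single-arc transition with a hook. */
class HookArcSketch {
    enum State { NEW, NEW_SAVING }

    /** Hypothetical operand: the object the state machine operates on. */
    static class App {
        final String id;
        boolean storeRequested;

        App(String id) {
            this.id = id;
        }
    }

    /** Fixed post state plus an optional hook that runs before the state changes. */
    static class Arc<OPERAND, EVENT> {
        private final State postState;
        private final BiConsumer<OPERAND, EVENT> hook;

        Arc(State postState, BiConsumer<OPERAND, EVENT> hook) {
            this.postState = postState;
            this.hook = hook;
        }

        State doTransition(OPERAND operand, EVENT event) {
            if (hook != null) {
                hook.accept(operand, event);
            }
            return postState;
        }
    }

    /** Runs one arc whose hook marks the app as "store requested". */
    static State demo(App app) {
        Arc<App, String> startArc =
                new Arc<>(State.NEW_SAVING, (a, event) -> a.storeRequested = true);
        return startArc.doTransition(app, "START");
    }

    public static void main(String[] args) {
        App app = new App("app_1");
        System.out.println(demo(app) + " storeRequested=" + app.storeRequested);
    }
}
```

The arc itself knows nothing about what the hook does. It only guarantees that the hook sees the operand and the event before the post state is returned.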

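Tracing all the way down, we find that what finally runs is the transition method of RMAppNewlySavingTransition, invoked as the hook. That method takes two parameters. Which two objects are actually passed in at call time?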

/*
	 * @return a {@link StateMachine} that starts in the default initial state and
	 * whose {@link Transition} s are applied to {@code operand} .
	 *
	 * This is part of the API.
	 *
	 * @param operand the object upon which the returned {@link StateMachine} will
	 * operate.
	 * 
	 */
	public StateMachine<STATE, EVENTTYPE, EVENT> make(OPERAND operand) {
		return new InternalStateMachine(operand, defaultInitialState);
	}

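Look at this make method. We call it when RMAppImpl is constructed, and the operand passed in is the RMAppImpl itself, so the operations are really operations on that RMAppImpl. Through the state store held by rmContext, the submitted application's state is persisted: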

/**
	 * Non-Blocking API ResourceManager services use this to store the application's
	 * state This does not block the dispatcher threads RMAppStoredEvent will be
	 * sent on completion to notify the RMApp
	 */
	@SuppressWarnings("unchecked")
	public synchronized void storeNewApplication(RMApp app) {
		ApplicationSubmissionContext context = app.getApplicationSubmissionContext();
		assert context instanceof ApplicationSubmissionContextPBImpl;
		ApplicationState appState = new ApplicationState(app.getSubmitTime(), app.getStartTime(), context,
				app.getUser());
		dispatcher.getEventHandler().handle(new RMStateStoreAppEvent(appState));
	}

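That is the store method. Note that the dispatcher used here is RMStateStore's own internal dispatcher, not the RM-wide one, so the event is placed on the store's internal event queue. The handler is registered like this:

		dispatcher.register(RMStateStoreEventType.class, new ForwardingEventHandler());

Note that once initialization and service startup are done, the RMStateStore dispatcher routes events of this type to ForwardingEventHandler: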

private final class ForwardingEventHandler implements EventHandler<RMStateStoreEvent> {

		@Override
		public void handle(RMStateStoreEvent event) {
			handleStoreEvent(event);
		}
	}

// Dispatcher related code
	protected void handleStoreEvent(RMStateStoreEvent event) {
		try {
			this.stateMachine.doTransition(event.getType(), event);
		} catch (InvalidStateTransitonException e) {
			LOG.error("Can't handle this event at current state", e);
		}
	}

The event type passed in here is actually:

public RMStateStoreAppEvent(ApplicationState appState) {
    super(RMStateStoreEventType.STORE_APP);
    this.appState = appState;
  }

Now look at the transition registered for this event type and the class behind it:

addTransition(RMStateStoreState.DEFAULT, RMStateStoreState.DEFAULT,
							RMStateStoreEventType.STORE_APP, new StoreAppTransition())

private static class StoreAppTransition implements SingleArcTransition<RMStateStore, RMStateStoreEvent> {
		@Override
		public void transition(RMStateStore store, RMStateStoreEvent event) {
			if (!(event instanceof RMStateStoreAppEvent)) {
				// should never happen
				LOG.error("Illegal event type: " + event.getClass());
				return;
			}
			ApplicationState appState = ((RMStateStoreAppEvent) event).getAppState();
			ApplicationId appId = appState.getAppId();
			ApplicationStateData appStateData = ApplicationStateData.newInstance(appState);
			LOG.info("Storing info for app: " + appId);
			try {
				store.storeApplicationStateInternal(appId, appStateData);
				store.notifyApplication(new RMAppEvent(appId, RMAppEventType.APP_NEW_SAVED));
			} catch (Exception e) {
				LOG.error("Error storing app: " + appId, e);
				store.notifyStoreOperationFailed(e);
			}
		};
	}

This is still storage logic, so I won't analyze it. Moving on, look at the notifyApplication method:

@SuppressWarnings("unchecked")
	/**
	 * This method is called to notify the application that new application is
	 * stored or updated in state store
	 * 
	 * @param event
	 *            App event containing the app id and event type
	 */
	private void notifyApplication(RMAppEvent event) {
		rmDispatcher.getEventHandler().handle(event);
	}
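
The round trip so far is: a START event moves the app from NEW to NEW_SAVING and asks the store to persist it, the store handles the request on its own queue, and then it reports back with APP_NEW_SAVED, which moves the app to SUBMITTED. Here is a minimal single-threaded sketch of that hand-off. StoreThenNotifySketch is hypothetical, the two queues only stand in for the two dispatchers, and events are reduced to plain strings.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

/** Hypothetical sketch of the hand-off between two event queues. */
class StoreThenNotifySketch {
    enum AppState { NEW, NEW_SAVING, SUBMITTED }

    static List<String> demo() {
        List<String> trace = new ArrayList<>();
        Queue<String> rmQueue = new ArrayDeque<>();     // stands in for the RM-wide dispatcher
        Queue<String> storeQueue = new ArrayDeque<>();  // stands in for the store's own dispatcher
        AppState state = AppState.NEW;

        rmQueue.add("START");
        while (!rmQueue.isEmpty() || !storeQueue.isEmpty()) {
            if (!rmQueue.isEmpty()) {
                String event = rmQueue.poll();
                if (event.equals("START") && state == AppState.NEW) {
                    state = AppState.NEW_SAVING;
                    storeQueue.add("STORE_APP");       // ask the store to persist the app
                } else if (event.equals("APP_NEW_SAVED") && state == AppState.NEW_SAVING) {
                    state = AppState.SUBMITTED;
                }
                trace.add("rm:" + event + "->" + state);
            } else {
                String event = storeQueue.poll();
                // A real store would write the application state here.
                rmQueue.add("APP_NEW_SAVED");          // report back to the RM-side queue
                trace.add("store:" + event);
            }
        }
        return trace;
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```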

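So once the internal processing is done, the event is handed back to the RM-wide dispatcher, rmDispatcher. Here is the transition that handles this kind of event: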

addTransition(RMAppState.NEW_SAVING, RMAppState.SUBMITTED, RMAppEventType.APP_NEW_SAVED,
							new AddApplicationToSchedulerTransition())

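The relationships are a little tangled, but events of this type are in fact handled by RMAppImpl. The corresponding registration and handler are in the RM: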

// Register event handler for RmAppEvents
			rmDispatcher.register(RMAppEventType.class, new ApplicationEventDispatcher(rmContext));

@Override
		public void handle(RMAppEvent event) {
			ApplicationId appID = event.getApplicationId();
			RMApp rmApp = this.rmContext.getRMApps().get(appID);
			if (rmApp != null) {
				try {
					rmApp.handle(event);
				} catch (Throwable t) {
					LOG.error("Error in handling event type " + event.getType() + " for application " + appID, t);
				}
			}
		}
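
The handle method above finds its target by looking the application up by id and silently skips the event when nothing is found. Here is a minimal sketch of that lookup. AppRegistrySketch and its App class are hypothetical, and a String stands in for the ApplicationId.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical sketch of routing events to a per-application object by id. */
class AppRegistrySketch {
    /** Stand-in for the long-lived per-application object. */
    static class App {
        final String id;
        int handled;

        App(String id) {
            this.id = id;
        }

        void handle(String eventType) {
            handled++;
        }
    }

    private final Map<String, App> apps = new ConcurrentHashMap<>();

    void register(App app) {
        apps.put(app.id, app);
    }

    /** Returns true if an app with this id exists and received the event. */
    boolean route(String appId, String eventType) {
        App app = apps.get(appId);
        if (app == null) {
            return false; // unknown id: the event is dropped
        }
        app.handle(eventType);
        return true;
    }

    public static void main(String[] args) {
        AppRegistrySketch registry = new AppRegistrySketch();
        registry.register(new App("application_1"));
        System.out.println(registry.route("application_1", "START"));
        System.out.println(registry.route("application_2", "START"));
    }
}
```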

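One point deserves attention here: why did we have to obtain an ApplicationId at the very beginning? Because it matters a great deal. For the whole lifetime of the program, each ApplicationId has one long-lived RMAppImpl, and almost every later operation revolves around that RMAppImpl, which is found by using the id as the key. The transition that handles APP_NEW_SAVED looks like this: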

private static final class AddApplicationToSchedulerTransition extends RMAppTransition {
		@Override
		public void transition(RMAppImpl app, RMAppEvent event) {
			app.handler.handle(new AppAddedSchedulerEvent(app.applicationId, app.submissionContext.getQueue(), app.user,
					app.submissionContext.getReservationID()));
		}
	}

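This handling code uses the handle method of RMAppImpl's internal handler to submit yet another event. Its purpose is to tell the scheduler to start scheduling so that the ApplicationMaster can be initialized.

How that initialization works, and how the ApplicationMaster is subsequently started, will be covered in the next part.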

Reposted from blog.csdn.net/u013384984/article/details/80257613