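Because an AM is designed to be responsible for exactly one framework, the lifecycle of an AM is effectively the lifecycle of a framework. The flow summarized below follows one framework from start to finish. It does not cover task failures and assumes a perfectly healthy framework.
I. Overview of the AM subServices involved
The main ones are requestManager and statusManager.
1. requestManager
requestManager has a thread that, for the entire lifetime of the AM, calls pullRequest() against ZK (ZooKeeper) every 30 s. What it pulls is always the information for the same FrameworkLauncher, and it consists of two parts: LauncherRequest and AggregatedFrameworkRequest.
Here AggregatedFrameworkRequest = FrameworkRequest + all other feedback requests.
LauncherRequest contains the ClusterConfiguration and other information.
After the pull, it checks a few conditions and updates the task-related state held in requestManager, such as FrameworkDescriptor, TaskRoles, and platParams. The purpose is presumably to keep the AM in sync with what is stored in zkStore.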
    private void pullRequest() throws Exception {
        // Pull LauncherRequest
        LOGGER.logDebug("Pulling LauncherRequest");
        LauncherRequest newLauncherRequest = zkStore.getLauncherRequest();
        LOGGER.logDebug("Pulled LauncherRequest");

        // newLauncherRequest is always not null
        updateLauncherRequest(newLauncherRequest);

        // Pull AggregatedFrameworkRequest
        AggregatedFrameworkRequest aggFrameworkRequest;
        try {
            LOGGER.logDebug("Pulling AggregatedFrameworkRequest");
            aggFrameworkRequest = zkStore.getAggregatedFrameworkRequest(conf.getFrameworkName());
            LOGGER.logDebug("Pulled AggregatedFrameworkRequest");
        } catch (NoNodeException e) {
            existsLocalVersionFrameworkRequest = 0;
            throw new NonTransientException(
                "Failed to getAggregatedFrameworkRequest, FrameworkRequest is already deleted on ZK", e);
        }

        // newFrameworkDescriptor is always not null
        FrameworkDescriptor newFrameworkDescriptor = aggFrameworkRequest.getFrameworkRequest().getFrameworkDescriptor();
        checkFrameworkVersion(newFrameworkDescriptor);
        flattenFrameworkDescriptor(newFrameworkDescriptor);
        updateFrameworkDescriptor(newFrameworkDescriptor);
        updateOverrideApplicationProgressRequest(aggFrameworkRequest.getOverrideApplicationProgressRequest());
        updateMigrateTaskRequests(aggFrameworkRequest.getMigrateTaskRequests());
    }
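The "pull every 30 s for the whole AM lifetime" pattern can be sketched with a plain `ScheduledExecutorService`. This is only an illustration, not how requestManager is actually implemented: the class `PeriodicPullSketch`, its counter, and the choice to log and keep polling after a failed pull are all assumptions made for the example.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch; none of these names come from the AM source.
class PeriodicPullSketch {
    private final ScheduledExecutorService scheduler =
        Executors.newSingleThreadScheduledExecutor();
    final AtomicInteger pullCount = new AtomicInteger();

    // Stand-in for pullRequest(); the real method reads from ZooKeeper.
    void pullRequest() throws Exception {
        pullCount.incrementAndGet();
    }

    // Pull once immediately, then again a fixed delay after each pull finishes.
    void start(long intervalMs) {
        scheduler.scheduleWithFixedDelay(() -> {
            try {
                pullRequest();
            } catch (Exception e) {
                // Assumption: log and keep polling. An uncaught exception would
                // silently cancel all further runs of a scheduled task.
                System.err.println("pullRequest failed: " + e);
            }
        }, 0, intervalMs, TimeUnit.MILLISECONDS);
    }

    void stop() {
        scheduler.shutdownNow();
    }

    // Helper for demos: poll for durationMs, then stop and report the pull count.
    static int runFor(long intervalMs, long durationMs) {
        PeriodicPullSketch sketch = new PeriodicPullSketch();
        sketch.start(intervalMs);
        try {
            Thread.sleep(durationMs);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        } finally {
            sketch.stop();
        }
        return sketch.pullCount.get();
    }

    public static void main(String[] args) {
        // The AM described above would use 30_000 ms; a short interval keeps the demo quick.
        System.out.println("pulls: " + runFor(50, 300));
    }
}
```

A fixed delay (rather than a fixed rate) means a slow pull never overlaps with the next one, which seems a reasonable fit for a loop whose only job is to keep local state in sync.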
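2. statusManager
statusManager manages the running state of the framework. A framework consists of a series of taskRoles, and each taskRole contains one or more tasks. The tasks within a taskRole are identical, so they can be told apart by an index such as 1, 2, 3.
The object that identifies a task: TaskLocator = TaskRoleName + TaskIndex (named TaskStatusLocator in the code)
The object that manages a task: TaskStatus
In statusManager, the following Maps manage the taskRoles and their tasks: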
    // Manage the CURD to ZK Status
    public class StatusManager extends AbstractService { // THREAD SAFE
        private static final DefaultLogger LOGGER = new DefaultLogger(StatusManager.class);

        private final ApplicationMaster am;
        private final Configuration conf;
        private final ZookeeperStore zkStore;

        /**
         * REGION BaseStatus
         */
        // AM only need to maintain TaskRoleStatus and TaskStatuses, and it is the only maintainer.
        // TaskRoleName -> TaskRoleStatus
        private Map<String, TaskRoleStatus> taskRoleStatuses = new HashMap<>();
        // TaskRoleName -> TaskStatuses
        private Map<String, TaskStatuses> taskStatuseses = new HashMap<>();

        /**
         * REGION ExtensionStatus
         * ExtensionStatus should be always CONSISTENT with BaseStatus
         */
        // Used to invert index TaskStatus by ContainerId/TaskState instead of TaskStatusLocator, i.e. TaskRoleName + TaskIndex
        // TaskState -> TaskStatusLocators
        private Map<TaskState, HashSet<TaskStatusLocator>> taskStateLocators = new HashMap<>();
        // Live Associated ContainerId -> TaskStatusLocator
        private Map<String, TaskStatusLocator> liveAssociatedContainerIdLocators = new HashMap<>();
        // Live Associated HostNames
        // TODO: Using MachineName instead of HostName to avoid unstable HostName Resolution
        private HashSet<String> liveAssociatedHostNames = new HashSet<>();

        /**
         * REGION StateVariable
         */
        // Whether Mem Status is changed since previous zkStore update
        // TaskRoleName -> TaskRoleStatusChanged
        private Map<String, Boolean> taskRoleStatusesChanged = new HashMap<>();
        // TaskRoleName -> TaskStatusesChanged
        private Map<String, Boolean> taskStatusesesChanged = new HashMap<>();

        // No need to persistent ContainerRequest since it is only valid within one application attempt.
        // Used to generate an unique Priority for each ContainerRequest in current application attempt.
        // This helps to match ContainerRequest and allocated Container.
        // Besides, it can also avoid the issue YARN-314.
        private Priority nextContainerRequestPriority = Priority.newInstance(0);

        // Used to track current ContainerRequest for Tasks in CONTAINER_REQUESTED state
        // TaskStatusLocator -> ContainerRequest
        private Map<TaskStatusLocator, ContainerRequest> taskContainerRequests = new HashMap<>();
        // Used to invert index TaskStatusLocator by ContainerRequest.Priority
        // Priority -> TaskStatusLocator
        private Map<Priority, TaskStatusLocator> priorityLocators = new HashMap<>();

        // ... (rest of the class omitted in this excerpt)
    }
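To make the "base status plus inverted index" idea above concrete, here is a minimal sketch. It is not the StatusManager code: `LocatorIndexSketch`, `Locator`, and the two-value `State` enum are hypothetical stand-ins (only the two state names are borrowed from the excerpt), and real TaskStatus objects carry far more than a state.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Objects;
import java.util.Set;

// Hypothetical sketch of a locator key and a state -> locators inverted index.
class LocatorIndexSketch {
    enum State { CONTAINER_REQUESTED, CONTAINER_COMPLETED }

    // Value-type key: two locators are equal when role name and index match.
    static final class Locator {
        final String taskRoleName;
        final int taskIndex;

        Locator(String taskRoleName, int taskIndex) {
            this.taskRoleName = taskRoleName;
            this.taskIndex = taskIndex;
        }

        @Override
        public boolean equals(Object o) {
            if (!(o instanceof Locator)) {
                return false;
            }
            Locator other = (Locator) o;
            return taskIndex == other.taskIndex && taskRoleName.equals(other.taskRoleName);
        }

        @Override
        public int hashCode() {
            return Objects.hash(taskRoleName, taskIndex);
        }
    }

    // Base status: Locator -> State
    private final Map<Locator, State> states = new HashMap<>();
    // Extension status (inverted index): State -> Locators
    private final Map<State, Set<Locator>> stateLocators = new HashMap<>();

    // Update both maps together so the index never drifts from the base status.
    void transition(Locator locator, State newState) {
        State oldState = states.put(locator, newState);
        if (oldState != null) {
            stateLocators.get(oldState).remove(locator);
        }
        stateLocators.computeIfAbsent(newState, s -> new HashSet<>()).add(locator);
    }

    Set<Locator> locatorsIn(State state) {
        return stateLocators.getOrDefault(state, new HashSet<>());
    }

    public static void main(String[] args) {
        LocatorIndexSketch sketch = new LocatorIndexSketch();
        sketch.transition(new Locator("worker", 0), State.CONTAINER_REQUESTED);
        sketch.transition(new Locator("worker", 0), State.CONTAINER_COMPLETED);
        System.out.println(sketch.locatorsIn(State.CONTAINER_COMPLETED).size());
    }
}
```

Because the locator implements `equals` and `hashCode`, a freshly constructed locator with the same role name and index finds the existing entry, which is what lets the locator act as a map key.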
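II. Starting the framework
As described above, once RequestManager's pullRequest() has fetched the FrameworkRequest and LauncherRequest from ZK, it calls updateFrameworkDescriptor(). That method in turn calls onTaskNumbersUpdated() on the AM, which submits two jobs to the AM's thread pool: updating statusManager, and addContainerRequest. addContainerRequest() has the AMRMClient ask the RM for containers in which to run the tasks. Once the RM has allocated the containers, it triggers the subsequent onContainerAllocated operation, which is what (2) describes.
Below is the source of updateFrameworkDescriptor.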
    private void updateFrameworkDescriptor(FrameworkDescriptor newFrameworkDescriptor) throws Exception {
        if (YamlUtils.deepEquals(frameworkDescriptor, newFrameworkDescriptor)) {
            return;
        }

        LOGGER.logSplittedLines(Level.INFO,
            "Detected FrameworkDescriptor changes. Updating to new FrameworkDescriptor:\n%s",
            WebCommon.toJson(newFrameworkDescriptor));

        checkUnsupportedOnTheFlyChanges(newFrameworkDescriptor);

        // Replace on the fly FrameworkDescriptor with newFrameworkDescriptor.
        // The operation is Atomic, since it only modifies the reference.
        // So, the on going read for the old FrameworkDescriptor will not get intermediate results
        frameworkDescriptor = newFrameworkDescriptor;

        // Backup old to detect changes
        PlatformSpecificParametersDescriptor oldPlatParams = platParams;
        Map<String, TaskRoleDescriptor> oldTaskRoles = taskRoles;
        Map<String, ServiceDescriptor> oldTaskServices = taskServices;

        // Update ExtensionRequest
        user = frameworkDescriptor.getUser();
        platParams = frameworkDescriptor.getPlatformSpecificParameters();
        taskRoles = frameworkDescriptor.getTaskRoles();
        Map<String, RetryPolicyDescriptor> newTaskRetryPolicies = new HashMap<>();
        Map<String, ServiceDescriptor> newTaskServices = new HashMap<>();
        Map<String, ResourceDescriptor> newTaskResources = new HashMap<>();
        Map<String, TaskRolePlatformSpecificParametersDescriptor> newTaskPlatParams = new HashMap<>();
        for (Map.Entry<String, TaskRoleDescriptor> taskRole : taskRoles.entrySet()) {
            String taskRoleName = taskRole.getKey();
            TaskRoleDescriptor taskRoleDescriptor = taskRole.getValue();
            newTaskRetryPolicies.put(taskRoleName, taskRoleDescriptor.getTaskRetryPolicy());
            newTaskServices.put(taskRoleName, taskRoleDescriptor.getTaskService());
            newTaskResources.put(taskRoleName, taskRoleDescriptor.getTaskService().getResource());
            newTaskPlatParams.put(taskRoleName, taskRoleDescriptor.getPlatformSpecificParameters());
        }
        taskRetryPolicies = newTaskRetryPolicies;
        taskServices = newTaskServices;
        taskResources = newTaskResources;
        taskPlatParams = newTaskPlatParams;
        Map<String, Integer> taskNumbers = getTaskNumbers(taskRoles);
        Map<String, Integer> serviceVersions = getServiceVersions(taskServices);

        // Notify AM to take actions for Request
        if (oldPlatParams == null) {
            // For the first time, send all Request to AM
            am.onServiceVersionsUpdated(serviceVersions);
            am.onTaskNumbersUpdated(taskNumbers);
            {
                // Only start them for the first time
                am.onStartRMResyncHandler();
                // Start TransitionTaskStateQueue at last, in case some Tasks in the queue
                // depend on the Request or previous AM Notify.
                am.onStartTransitionTaskStateQueue();
            }
        } else {
            // For the other times, only send changed Request to AM
            if (!CommonExts.equals(getServiceVersions(oldTaskServices), serviceVersions)) {
                am.onServiceVersionsUpdated(serviceVersions);
            }
            if (!CommonExts.equals(getTaskNumbers(oldTaskRoles), taskNumbers)) {
                am.onTaskNumbersUpdated(taskNumbers);
            }
        }
    }
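The notification logic at the end of updateFrameworkDescriptor follows a common pattern: on the first update send everything and start the one-time handlers, on later updates send only what changed. A stripped-down sketch of that pattern, tracking task numbers only, might look like this. `ChangeNotifySketch` and the notification strings are hypothetical; the real method calls into the AM instead of appending to a list.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of "notify all the first time, only changes afterwards".
class ChangeNotifySketch {
    // Records what would have been sent to the AM, in order.
    final List<String> notifications = new ArrayList<>();

    // null until the first update, mirroring the oldPlatParams == null check.
    private Map<String, Integer> taskNumbers;

    void update(Map<String, Integer> newTaskNumbers) {
        // Back up the old value so changes can be detected.
        Map<String, Integer> oldTaskNumbers = taskNumbers;
        taskNumbers = new HashMap<>(newTaskNumbers);

        if (oldTaskNumbers == null) {
            // First time: send everything, then start the one-time handlers.
            notifications.add("taskNumbersUpdated");
            notifications.add("startRMResyncHandler");
            notifications.add("startTransitionTaskStateQueue");
        } else if (!oldTaskNumbers.equals(taskNumbers)) {
            // Later updates: notify only when the value actually changed.
            notifications.add("taskNumbersUpdated");
        }
    }

    public static void main(String[] args) {
        ChangeNotifySketch sketch = new ChangeNotifySketch();
        Map<String, Integer> request = new HashMap<>();
        request.put("worker", 2);
        sketch.update(request);   // first update
        sketch.update(request);   // unchanged
        request.put("worker", 3);
        sketch.update(request);   // changed
        System.out.println(sketch.notifications);
    }
}
```

Copying the incoming map before storing it keeps a later mutation by the caller from being mistaken for "no change" on the next comparison.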
III. Finishing the framework and the AM
The entry point is the onContainersCompleted() method in RMClientCallbackHandler.
    public void onContainersCompleted(List<ContainerStatus> completedContainers) {
        am.onContainersCompleted(completedContainers);
    }
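So what happens once the RM reports that a batch of containers has completed? A for loop checks them one by one, and this checking job is also submitted to the thread pool.
For each completed container, its containerId, exitStatus, and diagnostics are extracted and passed on to the next step.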
    private void completeContainers(List<ContainerStatus> containerStatuses) throws Exception {
        for (ContainerStatus containerStatus : containerStatuses) {
            completeContainer(
                containerStatus.getContainerId().toString(),
                containerStatus.getExitStatus(),
                containerStatus.getDiagnostics(),
                false);
        }
    }
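The hand-off from the RM callback to the thread pool can be pictured roughly as follows. This sketch assumes a single-threaded executor so that batches are processed in arrival order; whether the AM's pool really has one thread is not something the excerpts show, and `CallbackHandOffSketch` with its string container ids is purely illustrative.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical sketch: the callback only enqueues work, a worker thread does the checking.
class CallbackHandOffSketch {
    // Assumption: one worker thread, so batches are handled in submission order.
    private final ExecutorService queue = Executors.newSingleThreadExecutor();
    final List<String> processed = Collections.synchronizedList(new ArrayList<>());

    // Called on the callback thread: submit the batch and return without blocking.
    Future<?> onContainersCompleted(List<String> containerIds) {
        return queue.submit(() -> {
            // Stand-in for completeContainers(): visit each container in turn.
            for (String containerId : containerIds) {
                processed.add(containerId);
            }
        });
    }

    void stop() {
        queue.shutdown();
    }

    // Helper for demos: submit one batch, wait for it, and return what was processed.
    static List<String> runOnce(List<String> containerIds) {
        CallbackHandOffSketch sketch = new CallbackHandOffSketch();
        try {
            sketch.onContainersCompleted(containerIds).get();
        } catch (Exception e) {
            throw new RuntimeException(e);
        } finally {
            sketch.stop();
        }
        return new ArrayList<>(sketch.processed);
    }

    public static void main(String[] args) {
        System.out.println(runOnce(Arrays.asList("container_1", "container_2")));
    }
}
```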
The next step looks up the TaskStatus of the task that the container was running, marks that task as CONTAINER_COMPLETED, and then calls attemptToRetry() to perform a "post-mortem".
    private void completeContainer(String containerId, int exitCode, String diagnostics, Boolean needToRelease) throws Exception {
        if (needToRelease) {
            tryToReleaseContainer(containerId);
            if (exitCode == ExitStatusKey.CONTAINER_MIGRATE_TASK_REQUESTED.toInt()) {
                requestManager.onMigrateTaskRequestContainerReleased(containerId);
            }
        }

        String logSuffix = String.format(
            "[%s]: completeContainer: ExitCode: %s, ExitDiagnostics: %s, NeedToRelease: %s",
            containerId, exitCode, diagnostics, needToRelease);

        if (!statusManager.isContainerIdLiveAssociated(containerId)) {
            LOGGER.logDebug("[NotLiveAssociated]%s", logSuffix);
            return;
        }

        TaskStatus taskStatus = statusManager.getTaskStatusWithLiveAssociatedContainerId(containerId);
        String taskRoleName = taskStatus.getTaskRoleName();
        TaskStatusLocator taskLocator = new TaskStatusLocator(taskRoleName, taskStatus.getTaskIndex());
        String linePrefix = String.format("%s: ", taskLocator);

        LOGGER.logSplittedLines(Level.INFO,
            "%s%s\n%s",
            taskLocator, logSuffix, generateContainerDiagnostics(taskStatus, linePrefix));

        statusManager.transitionTaskState(taskLocator, TaskState.CONTAINER_COMPLETED,
            new TaskEvent().setContainerExitCode(exitCode).setContainerExitDiagnostics(diagnostics));

        // Post-mortem CONTAINER_COMPLETED Task
        attemptToRetry(taskStatus);
    }
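The post-mortem goes on to check whether the container's exit status is SUCCEEDED. If it is not, there are two retry mechanisms, fancyRetryPolicy and normalRetryPolicy. For now the focus is on the case where the container exited successfully: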
    private void attemptToRetry(TaskStatus taskStatus) throws Exception {
        String taskRoleName = taskStatus.getTaskRoleName();
        TaskStatusLocator taskLocator = new TaskStatusLocator(taskRoleName, taskStatus.getTaskIndex());
        Integer exitCode = taskStatus.getContainerExitCode();
        ExitType exitType = taskStatus.getContainerExitType();
        Integer retriedCount = taskStatus.getTaskRetryPolicyState().getRetriedCount();
        RetryPolicyState newRetryPolicyState = YamlUtils.deepCopy(taskStatus.getTaskRetryPolicyState(), RetryPolicyState.class);
        String logPrefix = String.format("%s: attemptToRetry: ", taskLocator);

        LOGGER.logSplittedLines(Level.INFO,
            logPrefix + "ContainerExitCode: [%s], ContainerExitType: [%s], RetryPolicyState:\n[%s]",
            exitCode, exitType, WebCommon.toJson(newRetryPolicyState));

        // 1. ContainerSucceeded
        if (exitCode == ExitStatusKey.SUCCEEDED.toInt()) {
            LOGGER.logInfo(logPrefix +
                "Will completeTask with TaskSucceeded. Reason: " +
                "ContainerExitCode = %s.", exitCode);

            completeTask(taskStatus);
            return;
        }

        // 2. ContainerFailed
        Boolean generateContainerIpList = requestManager.getPlatParams().getGenerateContainerIpList();
        RetryPolicyDescriptor retryPolicy = requestManager.getTaskRetryPolicies().get(taskRoleName);
        String completeTaskLogPrefix = logPrefix + "Will completeTask with TaskFailed. Reason: ";
        String retryTaskLogPrefix = logPrefix + "Will retryTask with new Container. Reason: ";

        // 2.1. Handle Special Case
        if (generateContainerIpList) {
            LOGGER.logWarning(completeTaskLogPrefix +
                "TaskRetryPolicy is ignored due to GenerateContainerIpList enabled.");

            completeTask(taskStatus);
            return;
        }

        // 2.2. FancyRetryPolicy
        String fancyRetryPolicyLogSuffix = String.format("FancyRetryPolicy: %s Failure Occurred.", exitType);
        if (exitType == ExitType.NON_TRANSIENT) {
            newRetryPolicyState.setNonTransientRetriedCount(newRetryPolicyState.getNonTransientRetriedCount() + 1);
            if (retryPolicy.getFancyRetryPolicy()) {
                LOGGER.logWarning(completeTaskLogPrefix + fancyRetryPolicyLogSuffix);
                completeTask(taskStatus);
                return;
            }
        } else if (exitType == ExitType.TRANSIENT_NORMAL) {
            newRetryPolicyState.setTransientNormalRetriedCount(newRetryPolicyState.getTransientNormalRetriedCount() + 1);
            if (retryPolicy.getFancyRetryPolicy()) {
                LOGGER.logWarning(retryTaskLogPrefix + fancyRetryPolicyLogSuffix);
                retryTask(taskStatus, newRetryPolicyState);
                return;
            }
        } else if (exitType == ExitType.TRANSIENT_CONFLICT) {
            newRetryPolicyState.setTransientConflictRetriedCount(newRetryPolicyState.getTransientConflictRetriedCount() + 1);
            if (retryPolicy.getFancyRetryPolicy()) {
                LOGGER.logWarning(retryTaskLogPrefix + fancyRetryPolicyLogSuffix);
                retryTask(taskStatus, newRetryPolicyState);
                return;
            }
        } else {
            newRetryPolicyState.setUnKnownRetriedCount(newRetryPolicyState.getUnKnownRetriedCount() + 1);
            if (retryPolicy.getFancyRetryPolicy()) {
                // FancyRetryPolicy only handle Transient and NON_TRANSIENT Failure specially,
                // Leave UNKNOWN Failure to NormalRetryPolicy
                LOGGER.logWarning(logPrefix +
                    "Transfer the RetryDecision to NormalRetryPolicy. Reason: " +
                    fancyRetryPolicyLogSuffix);
            }
        }

        // 2.3. NormalRetryPolicy
        if (retryPolicy.getMaxRetryCount() != GlobalConstants.USING_UNLIMITED_VALUE &&
            retriedCount >= retryPolicy.getMaxRetryCount()) {
            LOGGER.logWarning(completeTaskLogPrefix +
                "RetriedCount %s has reached MaxRetryCount %s.",
                retriedCount, retryPolicy.getMaxRetryCount());

            completeTask(taskStatus);
            return;
        } else {
            newRetryPolicyState.setRetriedCount(newRetryPolicyState.getRetriedCount() + 1);

            LOGGER.logWarning(retryTaskLogPrefix +
                "RetriedCount %s has not reached MaxRetryCount %s.",
                retriedCount, retryPolicy.getMaxRetryCount());

            retryTask(taskStatus, newRetryPolicyState);
            return;
        }
    }
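Stripped of logging and counter bookkeeping, the decision order in attemptToRetry can be condensed into a small pure function, which makes the precedence of the branches easier to see. The sketch below is a reading aid, not the AM's code: `RetryDecisionSketch`, `Decision`, and the concrete values chosen for `SUCCEEDED_EXIT_CODE` and `UNLIMITED` are assumptions standing in for `ExitStatusKey.SUCCEEDED.toInt()` and `GlobalConstants.USING_UNLIMITED_VALUE`, whose real values are not shown in the excerpt.

```java
// Hypothetical condensation of the branch order in attemptToRetry.
class RetryDecisionSketch {
    enum ExitType { NON_TRANSIENT, TRANSIENT_NORMAL, TRANSIENT_CONFLICT, UNKNOWN }
    enum Decision { COMPLETE_SUCCEEDED, COMPLETE_FAILED, RETRY }

    // Assumed placeholder values; the real constants live in the AM source.
    static final int SUCCEEDED_EXIT_CODE = 0;
    static final int UNLIMITED = -1;

    static Decision decide(int exitCode, ExitType exitType, boolean generateContainerIpList,
                           boolean fancyRetryPolicy, int maxRetryCount, int retriedCount) {
        // 1. Container succeeded: complete the task as succeeded.
        if (exitCode == SUCCEEDED_EXIT_CODE) {
            return Decision.COMPLETE_SUCCEEDED;
        }
        // 2.1. Special case: the retry policy is ignored entirely.
        if (generateContainerIpList) {
            return Decision.COMPLETE_FAILED;
        }
        // 2.2. FancyRetryPolicy: decide by failure type, except for UNKNOWN.
        if (fancyRetryPolicy) {
            if (exitType == ExitType.NON_TRANSIENT) {
                return Decision.COMPLETE_FAILED;
            }
            if (exitType == ExitType.TRANSIENT_NORMAL || exitType == ExitType.TRANSIENT_CONFLICT) {
                return Decision.RETRY;
            }
        }
        // 2.3. NormalRetryPolicy: retry until the retry budget is used up.
        if (maxRetryCount != UNLIMITED && retriedCount >= maxRetryCount) {
            return Decision.COMPLETE_FAILED;
        }
        return Decision.RETRY;
    }

    public static void main(String[] args) {
        System.out.println(decide(1, ExitType.TRANSIENT_NORMAL, false, true, 0, 5));
        System.out.println(decide(1, ExitType.UNKNOWN, false, true, 2, 2));
    }
}
```

One thing the condensed form makes visible: with fancyRetryPolicy enabled, transient failures are retried without consulting maxRetryCount at all, while UNKNOWN failures fall through to the normal retry budget.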