PAI FrameworkLauncher (3): The Lifecycle of an AM

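Because an AM is designed to manage exactly one framework, the lifecycle of an AM is really the lifecycle of that framework. The flow summarized below follows a framework from start to finish. It does not cover task failures and assumes a perfectly healthy framework.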

I. The AM subservices involved

The main ones are requestManager and statusManager.

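1. requestManager
requestManager runs a thread that calls pullRequest() against ZK every 30s for the entire lifetime of the AM. Every pull fetches the information belonging to the same FrameworkLauncher framework, which is made up of two parts: LauncherRequest and aggregatedFrameworkRequest.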

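Here, aggregatedFrameworkRequest = FrameworkRequest + all other feedback requests

LauncherRequest contains the ClusterConfiguration and other information.

After the pull, requestManager checks a few conditions and updates the run-related content it holds, such as FrameworkDescriptor, TaskRoles, and platParams. The purpose is presumably to keep the AM in sync with what is stored in zkStore.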

  private void pullRequest() throws Exception {
    // Pull LauncherRequest
    LOGGER.logDebug("Pulling LauncherRequest");
    LauncherRequest newLauncherRequest = zkStore.getLauncherRequest();
    LOGGER.logDebug("Pulled LauncherRequest");

    // newLauncherRequest is always not null
    updateLauncherRequest(newLauncherRequest);

    // Pull AggregatedFrameworkRequest
    AggregatedFrameworkRequest aggFrameworkRequest;
    try {
      LOGGER.logDebug("Pulling AggregatedFrameworkRequest");
      aggFrameworkRequest = zkStore.getAggregatedFrameworkRequest(conf.getFrameworkName());
      LOGGER.logDebug("Pulled AggregatedFrameworkRequest");
    } catch (NoNodeException e) {
      existsLocalVersionFrameworkRequest = 0;
      throw new NonTransientException(
          "Failed to getAggregatedFrameworkRequest, FrameworkRequest is already deleted on ZK", e);
    }

    // newFrameworkDescriptor is always not null
    FrameworkDescriptor newFrameworkDescriptor = aggFrameworkRequest.getFrameworkRequest().getFrameworkDescriptor();
    checkFrameworkVersion(newFrameworkDescriptor);
    flattenFrameworkDescriptor(newFrameworkDescriptor);
    updateFrameworkDescriptor(newFrameworkDescriptor);
    updateOverrideApplicationProgressRequest(aggFrameworkRequest.getOverrideApplicationProgressRequest());
    updateMigrateTaskRequests(aggFrameworkRequest.getMigrateTaskRequests());
  }
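
The 30-second polling described above could be driven by a scheduled executor. The following is only a minimal sketch under that assumption, not FrameworkLauncher's actual scheduling code: `PeriodicPuller`, its `pullOnce` callback, and the `intervalMs` parameter are hypothetical names introduced for illustration.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// Hypothetical helper: runs a pull callback repeatedly until it is closed.
class PeriodicPuller implements AutoCloseable {
  private final ScheduledExecutorService scheduler =
      Executors.newSingleThreadScheduledExecutor();

  // pullOnce stands in for pullRequest(); intervalMs would be 30_000 for a 30s period.
  PeriodicPuller(Runnable pullOnce, long intervalMs) {
    scheduler.scheduleWithFixedDelay(() -> {
      try {
        pullOnce.run();
      } catch (RuntimeException e) {
        // An exception escaping the task would cancel all later runs,
        // so this sketch logs it and waits for the next round.
        System.err.println("pull failed, will retry: " + e.getMessage());
      }
    }, 0, intervalMs, TimeUnit.MILLISECONDS);
  }

  @Override
  public void close() {
    scheduler.shutdownNow();
  }
}
```

Note that the real pullRequest() treats a missing ZK node as non-transient and rethrows it, whereas this sketch simply retries on every failure.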


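2. statusManager

statusManager manages the running state of the framework. A framework is made up of a set of taskRoles, and each taskRole contains one or more tasks. The tasks within a taskRole are identical, so they can be told apart by an index such as 1, 2, 3.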

The object that identifies a task: TaskLocator = TaskRoleName + TaskIndex

The object that manages a task: TaskStatus
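
To make the "TaskRoleName + TaskIndex" idea concrete, here is a minimal sketch of a locator used as a map key. `SimpleTaskLocator` is a hypothetical class written for this post, not the real TaskStatusLocator; the point is only that two locators with the same role name and index must compare equal so they can index the same entry.

```java
import java.util.Objects;

// Hypothetical value class: identifies a task by TaskRoleName + TaskIndex.
final class SimpleTaskLocator {
  private final String taskRoleName;
  private final int taskIndex;

  SimpleTaskLocator(String taskRoleName, int taskIndex) {
    this.taskRoleName = Objects.requireNonNull(taskRoleName);
    this.taskIndex = taskIndex;
  }

  String getTaskRoleName() { return taskRoleName; }

  int getTaskIndex() { return taskIndex; }

  // equals and hashCode use both fields, so the locator works as a HashMap key.
  @Override
  public boolean equals(Object o) {
    if (this == o) return true;
    if (!(o instanceof SimpleTaskLocator)) return false;
    SimpleTaskLocator other = (SimpleTaskLocator) o;
    return taskIndex == other.taskIndex && taskRoleName.equals(other.taskRoleName);
  }

  @Override
  public int hashCode() { return Objects.hash(taskRoleName, taskIndex); }

  @Override
  public String toString() { return "[" + taskRoleName + "][" + taskIndex + "]"; }
}
```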

statusManager uses the following Maps to manage taskRoles and their tasks:

// Manage the CURD to ZK Status
public class StatusManager extends AbstractService {  // THREAD SAFE
  private static final DefaultLogger LOGGER = new DefaultLogger(StatusManager.class);

  private final ApplicationMaster am;
  private final Configuration conf;
  private final ZookeeperStore zkStore;

  /**
   * REGION BaseStatus
   */
  // AM only need to maintain TaskRoleStatus and TaskStatuses, and it is the only maintainer.
  // TaskRoleName -> TaskRoleStatus
  private Map<String, TaskRoleStatus> taskRoleStatuses = new HashMap<>();
  // TaskRoleName -> TaskStatuses
  private Map<String, TaskStatuses> taskStatuseses = new HashMap<>();

  /**
   * REGION ExtensionStatus
   * ExtensionStatus should be always CONSISTENT with BaseStatus
   */
  // Used to invert index TaskStatus by ContainerId/TaskState instead of TaskStatusLocator, i.e. TaskRoleName + TaskIndex
  // TaskState -> TaskStatusLocators
  private Map<TaskState, HashSet<TaskStatusLocator>> taskStateLocators = new HashMap<>();
  // Live Associated ContainerId -> TaskStatusLocator
  private Map<String, TaskStatusLocator> liveAssociatedContainerIdLocators = new HashMap<>();
  // Live Associated HostNames
  // TODO: Using MachineName instead of HostName to avoid unstable HostName Resolution
  private HashSet<String> liveAssociatedHostNames = new HashSet<>();

  /**
   * REGION StateVariable
   */
  // Whether Mem Status is changed since previous zkStore update
  // TaskRoleName -> TaskRoleStatusChanged
  private Map<String, Boolean> taskRoleStatusesChanged = new HashMap<>();
  // TaskRoleName -> TaskStatusesChanged
  private Map<String, Boolean> taskStatusesesChanged = new HashMap<>();

  // No need to persistent ContainerRequest since it is only valid within one application attempt.
  // Used to generate an unique Priority for each ContainerRequest in current application attempt.
  // This helps to match ContainerRequest and allocated Container.
  // Besides, it can also avoid the issue YARN-314.
  private Priority nextContainerRequestPriority = Priority.newInstance(0);
  // Used to track current ContainerRequest for Tasks in CONTAINER_REQUESTED state
  // TaskStatusLocator -> ContainerRequest
  private Map<TaskStatusLocator, ContainerRequest> taskContainerRequests = new HashMap<>();
  // Used to invert index TaskStatusLocator by ContainerRequest.Priority
  // Priority -> TaskStatusLocator
  private Map<Priority, TaskStatusLocator> priorityLocators = new HashMap<>();
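
The comments in the listing say that ExtensionStatus must always stay consistent with BaseStatus. One way to picture that requirement is a state transition that updates the base map and its inverted index under the same lock. The sketch below is an illustration only: `DemoTaskState`, `DemoStateIndex`, and the string task keys are hypothetical simplifications, and the real StatusManager tracks far more than this.

```java
import java.util.Collections;
import java.util.EnumMap;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical, reduced set of states; the real TaskState enum has more values.
enum DemoTaskState { WAITING, RUNNING, COMPLETED }

class DemoStateIndex {
  // Base status: task key -> current state
  private final Map<String, DemoTaskState> taskStates = new HashMap<>();
  // Extension status: state -> task keys, an inverted index over taskStates
  private final Map<DemoTaskState, Set<String>> stateLocators =
      new EnumMap<>(DemoTaskState.class);

  DemoStateIndex() {
    for (DemoTaskState state : DemoTaskState.values()) {
      stateLocators.put(state, new HashSet<>());
    }
  }

  // Both maps change inside one synchronized method, so no reader sees them disagree.
  synchronized void transition(String taskKey, DemoTaskState newState) {
    DemoTaskState oldState = taskStates.put(taskKey, newState);
    if (oldState != null) {
      stateLocators.get(oldState).remove(taskKey);
    }
    stateLocators.get(newState).add(taskKey);
  }

  // Returns a read-only copy so callers cannot break the index.
  synchronized Set<String> tasksIn(DemoTaskState state) {
    return Collections.unmodifiableSet(new HashSet<>(stateLocators.get(state)));
  }
}
```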


II. Starting the framework

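As described above, RequestManager's pullRequest() fetches the FrameworkRequest and LauncherRequest from ZK and then calls updateFrameworkDescriptor(). That method in turn calls onTaskNumbersUpdated(), which submits two jobs to the AM's thread pool: updating statusManager and addContainerRequest. addContainerRequest() has the AMRMClient ask the RM for containers in which to run the tasks. Once the RM allocates containers, the later onContainerAllocated step is triggered, which is what (2) describes.

Below is the source of updateFrameworkDescriptor.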

private void updateFrameworkDescriptor(FrameworkDescriptor newFrameworkDescriptor) throws Exception {
    if (YamlUtils.deepEquals(frameworkDescriptor, newFrameworkDescriptor)) {
      return;
    }

    LOGGER.logSplittedLines(Level.INFO,
        "Detected FrameworkDescriptor changes. Updating to new FrameworkDescriptor:\n%s",
        WebCommon.toJson(newFrameworkDescriptor));

    checkUnsupportedOnTheFlyChanges(newFrameworkDescriptor);

    // Replace on the fly FrameworkDescriptor with newFrameworkDescriptor.
    // The operation is Atomic, since it only modifies the reference.
    // So, the on going read for the old FrameworkDescriptor will not get intermediate results
    frameworkDescriptor = newFrameworkDescriptor;

    // Backup old to detect changes
    PlatformSpecificParametersDescriptor oldPlatParams = platParams;
    Map<String, TaskRoleDescriptor> oldTaskRoles = taskRoles;
    Map<String, ServiceDescriptor> oldTaskServices = taskServices;

    // Update ExtensionRequest
    user = frameworkDescriptor.getUser();
    platParams = frameworkDescriptor.getPlatformSpecificParameters();
    taskRoles = frameworkDescriptor.getTaskRoles();
    Map<String, RetryPolicyDescriptor> newTaskRetryPolicies = new HashMap<>();
    Map<String, ServiceDescriptor> newTaskServices = new HashMap<>();
    Map<String, ResourceDescriptor> newTaskResources = new HashMap<>();
    Map<String, TaskRolePlatformSpecificParametersDescriptor> newTaskPlatParams = new HashMap<>();
    for (Map.Entry<String, TaskRoleDescriptor> taskRole : taskRoles.entrySet()) {
      String taskRoleName = taskRole.getKey();
      TaskRoleDescriptor taskRoleDescriptor = taskRole.getValue();
      newTaskRetryPolicies.put(taskRoleName, taskRoleDescriptor.getTaskRetryPolicy());
      newTaskServices.put(taskRoleName, taskRoleDescriptor.getTaskService());
      newTaskResources.put(taskRoleName, taskRoleDescriptor.getTaskService().getResource());
      newTaskPlatParams.put(taskRoleName, taskRoleDescriptor.getPlatformSpecificParameters());
    }
    taskRetryPolicies = newTaskRetryPolicies;
    taskServices = newTaskServices;
    taskResources = newTaskResources;
    taskPlatParams = newTaskPlatParams;
    Map<String, Integer> taskNumbers = getTaskNumbers(taskRoles);
    Map<String, Integer> serviceVersions = getServiceVersions(taskServices);

    // Notify AM to take actions for Request
    if (oldPlatParams == null) {
      // For the first time, send all Request to AM
      am.onServiceVersionsUpdated(serviceVersions);
      am.onTaskNumbersUpdated(taskNumbers);
      {
        // Only start them for the first time
        am.onStartRMResyncHandler();
        // Start TransitionTaskStateQueue at last, in case some Tasks in the queue
        // depend on the Request or previous AM Notify.
        am.onStartTransitionTaskStateQueue();
      }
    } else {
      // For the other times, only send changed Request to AM
      if (!CommonExts.equals(getServiceVersions(oldTaskServices), serviceVersions)) {
        am.onServiceVersionsUpdated(serviceVersions);
      }
      if (!CommonExts.equals(getTaskNumbers(oldTaskRoles), taskNumbers)) {
        am.onTaskNumbersUpdated(taskNumbers);
      }
    }
  }
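
The hand-off from onTaskNumbersUpdated() to the AM's thread pool can be pictured roughly as follows. This is a sketch under stated assumptions rather than the real ApplicationMaster: `DemoAm`, the single-threaded queue, and the recorded action strings are all hypothetical, and the real code talks to statusManager and the AMRMClient instead of appending to a list.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical AM-like component that queues work instead of running it inline.
class DemoAm {
  // Assumption: a single worker thread, so jobs run in submission order.
  private final ExecutorService queue = Executors.newSingleThreadExecutor();
  private final List<String> actions = Collections.synchronizedList(new ArrayList<>());

  void onTaskNumbersUpdated(Map<String, Integer> taskNumbers) {
    // Copy the request so later changes by the caller do not affect queued jobs.
    Map<String, Integer> snapshot = new TreeMap<>(taskNumbers);
    // Job 1: bring the in-memory status in line with the requested task counts.
    queue.execute(() -> actions.add("updateStatus " + snapshot));
    // Job 2: request one container for every task.
    queue.execute(() -> {
      for (Map.Entry<String, Integer> e : snapshot.entrySet()) {
        for (int i = 0; i < e.getValue(); i++) {
          actions.add("addContainerRequest " + e.getKey() + "[" + i + "]");
        }
      }
    });
  }

  // Waits for the queued jobs and returns what they recorded, in order.
  List<String> shutdownAndGetActions() {
    queue.shutdown();
    try {
      queue.awaitTermination(5, TimeUnit.SECONDS);
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
    }
    return new ArrayList<>(actions);
  }
}
```

Queuing the work keeps the pulling thread free: it only records what has to happen and returns, while the pool performs the slower status update and container requests.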


III. Finishing the framework and the AM

The entry point is the onContainersCompleted() method in RMClientCallbackHandler.

public void onContainersCompleted(List<ContainerStatus> completedContainers) {
    am.onContainersCompleted(completedContainers);
  }


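So what happens once the RM reports that a batch of containers has completed? A for loop checks them one by one. This check is also submitted to the thread pool.

For each completed container, its containerId, exitStatus, and diagnostics are extracted and passed on to the next step.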

private void completeContainers(List<ContainerStatus> containerStatuses) throws Exception {
    for (ContainerStatus containerStatus : containerStatuses) {
      completeContainer(
          containerStatus.getContainerId().toString(),
          containerStatus.getExitStatus(),
          containerStatus.getDiagnostics(),
          false);
    }
  }


The next step looks up the taskStatus of the task that was running in the container, marks that task as CONTAINER_COMPLETED, and then calls attemptToRetry() to perform the "post-mortem".

  private void completeContainer(String containerId, int exitCode, String diagnostics, Boolean needToRelease) throws Exception {
    if (needToRelease) {
      tryToReleaseContainer(containerId);
      if (exitCode == ExitStatusKey.CONTAINER_MIGRATE_TASK_REQUESTED.toInt()) {
        requestManager.onMigrateTaskRequestContainerReleased(containerId);
      }
    }

    String logSuffix = String.format(
        "[%s]: completeContainer: ExitCode: %s, ExitDiagnostics: %s, NeedToRelease: %s",
        containerId, exitCode, diagnostics, needToRelease);

    if (!statusManager.isContainerIdLiveAssociated(containerId)) {
      LOGGER.logDebug("[NotLiveAssociated]%s", logSuffix);
      return;
    }

    TaskStatus taskStatus = statusManager.getTaskStatusWithLiveAssociatedContainerId(containerId);
    String taskRoleName = taskStatus.getTaskRoleName();
    TaskStatusLocator taskLocator = new TaskStatusLocator(taskRoleName, taskStatus.getTaskIndex());
    String linePrefix = String.format("%s: ", taskLocator);

    LOGGER.logSplittedLines(Level.INFO,
        "%s%s\n%s",
        taskLocator, logSuffix, generateContainerDiagnostics(taskStatus, linePrefix));

    statusManager.transitionTaskState(taskLocator, TaskState.CONTAINER_COMPLETED,
        new TaskEvent().setContainerExitCode(exitCode).setContainerExitDiagnostics(diagnostics));

    // Post-mortem CONTAINER_COMPLETED Task
    attemptToRetry(taskStatus);
  }
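
The isContainerIdLiveAssociated() guard in completeContainer() means that completion events for containers no longer tied to a task are simply logged and ignored. A minimal sketch of that kind of guard is shown below. `DemoContainerTracker` is hypothetical, and the assumption that the association is cleared at the moment of completion is made here for illustration; the listing above does not show where the real StatusManager clears it.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical tracker: only containers still associated with a task are processed.
class DemoContainerTracker {
  // Live containerId -> task key
  private final Map<String, String> liveContainerToTask = new HashMap<>();

  synchronized void associate(String containerId, String taskKey) {
    liveContainerToTask.put(containerId, taskKey);
  }

  synchronized boolean isLiveAssociated(String containerId) {
    return liveContainerToTask.containsKey(containerId);
  }

  // Returns the task whose container completed, or null for a stale or unknown event.
  synchronized String complete(String containerId) {
    // remove() checks and clears the association in one step,
    // so a repeated event for the same container finds nothing.
    return liveContainerToTask.remove(containerId);
  }
}
```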


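The post-mortem checks whether the container's exit status is SUCCEEDED. If it is not, there are two ways to retry: fancyRetryPolicy and normalRetryPolicy. For now, the main case of interest is a successful exit, which leads straight to completeTask():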

  private void attemptToRetry(TaskStatus taskStatus) throws Exception {
    String taskRoleName = taskStatus.getTaskRoleName();
    TaskStatusLocator taskLocator = new TaskStatusLocator(taskRoleName, taskStatus.getTaskIndex());
    Integer exitCode = taskStatus.getContainerExitCode();
    ExitType exitType = taskStatus.getContainerExitType();
    Integer retriedCount = taskStatus.getTaskRetryPolicyState().getRetriedCount();
    RetryPolicyState newRetryPolicyState = YamlUtils.deepCopy(taskStatus.getTaskRetryPolicyState(), RetryPolicyState.class);
    String logPrefix = String.format("%s: attemptToRetry: ", taskLocator);

    LOGGER.logSplittedLines(Level.INFO,
        logPrefix + "ContainerExitCode: [%s], ContainerExitType: [%s], RetryPolicyState:\n[%s]",
        exitCode, exitType, WebCommon.toJson(newRetryPolicyState));

    // 1. ContainerSucceeded
    if (exitCode == ExitStatusKey.SUCCEEDED.toInt()) {
      LOGGER.logInfo(logPrefix +
          "Will completeTask with TaskSucceeded. Reason: " +
          "ContainerExitCode = %s.", exitCode);

      completeTask(taskStatus);
      return;
    }

    // 2. ContainerFailed
    Boolean generateContainerIpList = requestManager.getPlatParams().getGenerateContainerIpList();
    RetryPolicyDescriptor retryPolicy = requestManager.getTaskRetryPolicies().get(taskRoleName);
    String completeTaskLogPrefix = logPrefix + "Will completeTask with TaskFailed. Reason: ";
    String retryTaskLogPrefix = logPrefix + "Will retryTask with new Container. Reason: ";

    // 2.1. Handle Special Case
    if (generateContainerIpList) {
      LOGGER.logWarning(completeTaskLogPrefix +
          "TaskRetryPolicy is ignored due to GenerateContainerIpList enabled.");

      completeTask(taskStatus);
      return;
    }

    // 2.2. FancyRetryPolicy
    String fancyRetryPolicyLogSuffix = String.format("FancyRetryPolicy: %s Failure Occurred.", exitType);
    if (exitType == ExitType.NON_TRANSIENT) {
      newRetryPolicyState.setNonTransientRetriedCount(newRetryPolicyState.getNonTransientRetriedCount() + 1);
      if (retryPolicy.getFancyRetryPolicy()) {
        LOGGER.logWarning(completeTaskLogPrefix + fancyRetryPolicyLogSuffix);
        completeTask(taskStatus);
        return;
      }
    } else if (exitType == ExitType.TRANSIENT_NORMAL) {
      newRetryPolicyState.setTransientNormalRetriedCount(newRetryPolicyState.getTransientNormalRetriedCount() + 1);
      if (retryPolicy.getFancyRetryPolicy()) {
        LOGGER.logWarning(retryTaskLogPrefix + fancyRetryPolicyLogSuffix);
        retryTask(taskStatus, newRetryPolicyState);
        return;
      }
    } else if (exitType == ExitType.TRANSIENT_CONFLICT) {
      newRetryPolicyState.setTransientConflictRetriedCount(newRetryPolicyState.getTransientConflictRetriedCount() + 1);
      if (retryPolicy.getFancyRetryPolicy()) {
        LOGGER.logWarning(retryTaskLogPrefix + fancyRetryPolicyLogSuffix);
        retryTask(taskStatus, newRetryPolicyState);
        return;
      }
    } else {
      newRetryPolicyState.setUnKnownRetriedCount(newRetryPolicyState.getUnKnownRetriedCount() + 1);
      if (retryPolicy.getFancyRetryPolicy()) {
        // FancyRetryPolicy only handle Transient and NON_TRANSIENT Failure specially,
        // Leave UNKNOWN Failure to NormalRetryPolicy
        LOGGER.logWarning(logPrefix +
            "Transfer the RetryDecision to NormalRetryPolicy. Reason: " +
            fancyRetryPolicyLogSuffix);
      }
    }

    // 2.3. NormalRetryPolicy
    if (retryPolicy.getMaxRetryCount() != GlobalConstants.USING_UNLIMITED_VALUE &&
        retriedCount >= retryPolicy.getMaxRetryCount()) {
      LOGGER.logWarning(completeTaskLogPrefix +
              "RetriedCount %s has reached MaxRetryCount %s.",
          retriedCount, retryPolicy.getMaxRetryCount());
      completeTask(taskStatus);
      return;
    } else {
      newRetryPolicyState.setRetriedCount(newRetryPolicyState.getRetriedCount() + 1);

      LOGGER.logWarning(retryTaskLogPrefix +
              "RetriedCount %s has not reached MaxRetryCount %s.",
          retriedCount, retryPolicy.getMaxRetryCount());
      retryTask(taskStatus, newRetryPolicyState);
      return;
    }
  }
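
The order of the checks in attemptToRetry() is easier to see when reduced to a pure function. The sketch below mirrors only the decision order of the listing above and leaves out logging and the per-type retry counters. All names in it are hypothetical; folding "exit code equals SUCCEEDED" into an exit type and using -1 for the unlimited sentinel are simplifying assumptions.

```java
// Hypothetical, simplified mirror of the decision order in attemptToRetry().
enum DemoExitType { SUCCEEDED, NON_TRANSIENT, TRANSIENT_NORMAL, TRANSIENT_CONFLICT, UNKNOWN }

enum DemoDecision { COMPLETE_SUCCEEDED, COMPLETE_FAILED, RETRY }

final class DemoRetryDecider {
  // Assumed sentinel standing in for GlobalConstants.USING_UNLIMITED_VALUE.
  static final int UNLIMITED = -1;

  private DemoRetryDecider() {}

  static DemoDecision decide(DemoExitType exitType, boolean generateContainerIpList,
                             boolean fancyRetryPolicy, int maxRetryCount, int retriedCount) {
    // 1. A successful container completes the task.
    if (exitType == DemoExitType.SUCCEEDED) {
      return DemoDecision.COMPLETE_SUCCEEDED;
    }
    // 2.1. Special case: the retry policy is ignored when GenerateContainerIpList is on.
    if (generateContainerIpList) {
      return DemoDecision.COMPLETE_FAILED;
    }
    // 2.2. FancyRetryPolicy: never retry NON_TRANSIENT, always retry transient failures.
    if (fancyRetryPolicy) {
      if (exitType == DemoExitType.NON_TRANSIENT) {
        return DemoDecision.COMPLETE_FAILED;
      }
      if (exitType == DemoExitType.TRANSIENT_NORMAL
          || exitType == DemoExitType.TRANSIENT_CONFLICT) {
        return DemoDecision.RETRY;
      }
      // UNKNOWN failures fall through to the normal policy.
    }
    // 2.3. NormalRetryPolicy: retry until the maximum count is reached.
    if (maxRetryCount != UNLIMITED && retriedCount >= maxRetryCount) {
      return DemoDecision.COMPLETE_FAILED;
    }
    return DemoDecision.RETRY;
  }
}
```

For example, with fancyRetryPolicy enabled a TRANSIENT_NORMAL failure is retried even when retriedCount already exceeds maxRetryCount, because the fancy branch returns before the count is checked.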
