The Coordinator is Druid's central coordination module. It decouples direct dependencies between the other modules, manages and distributes Segments, controls the loading and dropping of Segments on Historical nodes, and keeps the Segment load balanced across the Historical nodes.
The Coordinator follows a periodically-scheduled-task design and is composed of several distinct tasks. It does not call the Historical nodes directly; instead it uses ZooKeeper as a bridge, writing instructions into ZooKeeper, from which the Historical nodes pick up the commands to load or drop Segments.
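As a rough illustration of this bridge, the Coordinator side of a load instruction might look like the following Curator sketch. This is a minimal sketch only: in Druid the node is written by the Coordinator's LoadQueuePeon, and the concrete path layout and JSON payload used here are simplified assumptions for illustration.

```java
import java.nio.charset.StandardCharsets;

import org.apache.curator.framework.CuratorFramework;
import org.apache.curator.framework.CuratorFrameworkFactory;
import org.apache.curator.retry.ExponentialBackoffRetry;

// Minimal sketch of the ZooKeeper "bridge": the Coordinator writes an instruction node
// under a per-Historical load queue path; the Historical watches that path and acts on it.
// The path and payload below are illustrative assumptions, not Druid's exact format.
public class LoadQueueSketch
{
  public static void main(String[] args) throws Exception
  {
    CuratorFramework curator = CuratorFrameworkFactory.newClient(
        "localhost:2181", new ExponentialBackoffRetry(1000, 3)
    );
    curator.start();

    // Hypothetical load-queue node for one Historical node and one Segment.
    String loadQueuePath = "/druid/loadQueue/historical-host:8083/example_segment_id";
    byte[] instruction = "{\"type\":\"load\",\"segmentId\":\"example_segment_id\"}"
        .getBytes(StandardCharsets.UTF_8);

    // Coordinator side: publish the instruction.
    curator.create().creatingParentsIfNeeded().forPath(loadQueuePath, instruction);

    curator.close();
  }
}
```

On the other side, the Historical node watches its own load queue path, loads the Segment described by the node, and removes the node to acknowledge completion.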
The Coordinator decides which Segments to load and drop according to a set of rules, which can be configured through Druid's management tools or configuration parameters. The load rules are:
- Load forever (LoadForever)
- Load by interval (LoadByInterval)
- Load by recent period (LoadByPeriod)
The drop rules are:
- Drop forever (DropForever)
- Drop by interval (DropByInterval)
- Drop by recent period (DropByPeriod)
These rule names map onto concrete Rule implementations through Jackson subtype annotations:

```java
@JsonTypeInfo(use = JsonTypeInfo.Id.NAME, property = "type")
@JsonSubTypes(value = {
    @JsonSubTypes.Type(name = "loadByPeriod", value = PeriodLoadRule.class),
    @JsonSubTypes.Type(name = "loadByInterval", value = IntervalLoadRule.class),
    @JsonSubTypes.Type(name = "loadForever", value = ForeverLoadRule.class),
    @JsonSubTypes.Type(name = "dropByPeriod", value = PeriodDropRule.class),
    @JsonSubTypes.Type(name = "dropByInterval", value = IntervalDropRule.class),
    @JsonSubTypes.Type(name = "dropForever", value = ForeverDropRule.class)
})
```
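For example, a rule set for a datasource could be expressed as JSON like the following; the tier name, period, and replica counts here are illustrative assumptions, with field names following the rule classes above:

```json
[
  { "type": "loadByPeriod", "period": "P1M", "tieredReplicants": { "_default_tier": 2 } },
  { "type": "dropForever" }
]
```

Rules are evaluated in order and the first match wins, so this set keeps two replicas of the most recent month of Segments loaded and drops everything older.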
The entry point of the Coordinator is the class io.druid.server.coordinator.DruidCoordinator.
DruidCoordinator pulls in several manager classes that give it information about Segments and the cluster, along with the ability to manage them. For example, the MetadataSegmentManager and MetadataRuleManager interfaces both obtain rule and Segment information from the MySQL metadata store through SQL queries.
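Conceptually, that polling boils down to reading the segment and rule tables in the metadata store. The sketch below is illustrative only: the table and column names follow Druid's default metadata schema, the JDBC URL and credentials are placeholders, and the real managers go through Druid's own DB layer rather than raw JDBC.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

// Illustrative only: poll the metadata store the way MetadataSegmentManager conceptually
// does. Table/column names follow Druid's default metadata schema (druid_segments);
// the URL and credentials are placeholders.
public class MetadataPollSketch
{
  public static void main(String[] args) throws Exception
  {
    try (Connection conn = DriverManager.getConnection(
        "jdbc:mysql://localhost:3306/druid", "user", "password")) {
      try (Statement stmt = conn.createStatement();
           ResultSet rs = stmt.executeQuery(
               "SELECT id, dataSource FROM druid_segments WHERE used = true")) {
        while (rs.next()) {
          // The real manager also reads the JSON payload column and deserializes it into
          // DataSegment objects that each coordination run then works with.
          System.out.println(rs.getString("dataSource") + " -> " + rs.getString("id"));
        }
      }
    }
  }
}
```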
The Coordinator starts from start(). It first elects a leader through ZooKeeper's LeaderLatch, and that leader then runs a number of tasks on a fixed schedule:
```java
@LifecycleStart
public void start()
{
  synchronized (lock) {
    if (started) {
      return;
    }
    started = true;

    createNewLeaderLatch();
    try {
      leaderLatch.get().start();
    }
    catch (Exception e) {
      throw Throwables.propagate(e);
    }
  }
}

private LeaderLatch createNewLeaderLatch()
{
  final LeaderLatch newLeaderLatch = new LeaderLatch(
      curator,
      ZKPaths.makePath(zkPaths.getCoordinatorPath(), COORDINATOR_OWNER_NODE),
      self.getHostAndPort()
  );

  newLeaderLatch.addListener(
      new LeaderLatchListener()
      {
        @Override
        public void isLeader()
        {
          DruidCoordinator.this.becomeLeader();
        }

        @Override
        public void notLeader()
        {
          DruidCoordinator.this.stopBeingLeader();
        }
      },
      Execs.singleThreaded("CoordinatorLeader-%s")
  );

  return leaderLatch.getAndSet(newLeaderLatch);
}
```
Once the leader has been determined, it can start executing its tasks on the configured schedule:
```java
private void becomeLeader()
{
  synchronized (lock) {
    if (!started) {
      return;
    }

    log.info("I am the leader of the coordinators, all must bow!");
    log.info("Starting coordination in [%s]", config.getCoordinatorStartDelay());
    try {
      leaderCounter++;
      leader = true;
      metadataSegmentManager.start();
      metadataRuleManager.start();
      serverInventoryView.start();
      serviceAnnouncer.announce(self);
      final int startingLeaderCounter = leaderCounter;

      final List<Pair<? extends CoordinatorRunnable, Duration>> coordinatorRunnables = Lists.newArrayList();
      coordinatorRunnables.add(
          Pair.of(
              new CoordinatorHistoricalManagerRunnable(startingLeaderCounter),
              config.getCoordinatorPeriod()
          )
      );
      if (indexingServiceClient != null) {
        coordinatorRunnables.add(
            Pair.of(
                new CoordinatorIndexingServiceRunnable(
                    makeIndexingServiceHelpers(),
                    startingLeaderCounter
                ),
                config.getCoordinatorIndexingPeriod()
            )
        );
      }

      for (final Pair<? extends CoordinatorRunnable, Duration> coordinatorRunnable : coordinatorRunnables) {
        ScheduledExecutors.scheduleWithFixedDelay(
            exec,
            config.getCoordinatorStartDelay(),
            coordinatorRunnable.rhs,
            new Callable<ScheduledExecutors.Signal>()
            {
              private final CoordinatorRunnable theRunnable = coordinatorRunnable.lhs;

              @Override
              public ScheduledExecutors.Signal call()
              {
                if (leader && startingLeaderCounter == leaderCounter) {
                  theRunnable.run();
                }
                if (leader && startingLeaderCounter == leaderCounter) { // (We might no longer be leader)
                  return ScheduledExecutors.Signal.REPEAT;
                } else {
                  return ScheduledExecutors.Signal.STOP;
                }
              }
            }
        );
      }
    }
    catch (Exception e) {
      log.makeAlert(e, "Unable to become leader")
         .emit();
      final LeaderLatch oldLatch = createNewLeaderLatch();
      CloseQuietly.close(oldLatch);
      try {
        leaderLatch.get().start();
      }
      catch (Exception e1) {
        // If an exception gets thrown out here, then the coordinator will zombie out 'cause it won't be looking for
        // the latch anymore. I don't believe it's actually possible for an Exception to throw out here, but
        // Curator likes to have "throws Exception" on methods so it might happen...
        log.makeAlert(e1, "I am a zombie")
           .emit();
      }
    }
  }
}
```
Among these, CoordinatorHistoricalManagerRunnable and CoordinatorIndexingServiceRunnable are the most important.
CoordinatorHistoricalManagerRunnable bundles together a set of concrete tasks:
```java
private class CoordinatorHistoricalManagerRunnable extends CoordinatorRunnable
{
  public CoordinatorHistoricalManagerRunnable(final int startingLeaderCounter)
  {
    super(
        ImmutableList.of(
            new DruidCoordinatorSegmentInfoLoader(DruidCoordinator.this),
            new DruidCoordinatorHelper()
            {
              @Override
              public DruidCoordinatorRuntimeParams run(DruidCoordinatorRuntimeParams params)
              {
                // Display info about all historical servers
                Iterable<ImmutableDruidServer> servers = FunctionalIterable
                    .create(serverInventoryView.getInventory())
                    .filter(
                        new Predicate<DruidServer>()
                        {
                          @Override
                          public boolean apply(DruidServer input)
                          {
                            return input.isAssignable();
                          }
                        }
                    ).transform(
                        new Function<DruidServer, ImmutableDruidServer>()
                        {
                          @Override
                          public ImmutableDruidServer apply(DruidServer input)
                          {
                            return input.toImmutableDruidServer();
                          }
                        }
                    );

                if (log.isDebugEnabled()) {
                  log.debug("Servers");
                  for (ImmutableDruidServer druidServer : servers) {
                    log.debug("  %s", druidServer);
                    log.debug("    -- DataSources");
                    for (ImmutableDruidDataSource druidDataSource : druidServer.getDataSources()) {
                      log.debug("    %s", druidDataSource);
                    }
                  }
                }

                // Find all historical servers, group them by subType and sort by ascending usage
                final DruidCluster cluster = new DruidCluster();
                for (ImmutableDruidServer server : servers) {
                  if (!loadManagementPeons.containsKey(server.getName())) {
                    String basePath = ZKPaths.makePath(zkPaths.getLoadQueuePath(), server.getName());
                    LoadQueuePeon loadQueuePeon = taskMaster.giveMePeon(basePath);
                    log.info("Creating LoadQueuePeon for server[%s] at path[%s]", server.getName(), basePath);

                    loadManagementPeons.put(server.getName(), loadQueuePeon);
                  }

                  cluster.add(new ServerHolder(server, loadManagementPeons.get(server.getName())));
                }

                segmentReplicantLookup = SegmentReplicantLookup.make(cluster);

                // Stop peons for servers that aren't there anymore.
                final Set<String> disappeared = Sets.newHashSet(loadManagementPeons.keySet());
                for (ImmutableDruidServer server : servers) {
                  disappeared.remove(server.getName());
                }
                for (String name : disappeared) {
                  log.info("Removing listener for server[%s] which is no longer there.", name);
                  LoadQueuePeon peon = loadManagementPeons.remove(name);
                  peon.stop();
                }

                return params.buildFromExisting()
                             .withDruidCluster(cluster)
                             .withDatabaseRuleManager(metadataRuleManager)
                             .withLoadManagementPeons(loadManagementPeons)
                             .withSegmentReplicantLookup(segmentReplicantLookup)
                             .withBalancerReferenceTimestamp(DateTime.now())
                             .build();
              }
            },
            new DruidCoordinatorRuleRunner(DruidCoordinator.this),
            new DruidCoordinatorCleanupUnneeded(DruidCoordinator.this),
            new DruidCoordinatorCleanupOvershadowed(DruidCoordinator.this),
            new DruidCoordinatorBalancer(DruidCoordinator.this),
            new DruidCoordinatorLogger()
        ),
        startingLeaderCounter
    );
  }
}
```
Among these (a minimal sketch of the shared helper pattern follows this list):
- DruidCoordinatorSegmentInfoLoader: loads Segment information and removes invalid Segment entries.
- DruidCoordinatorRuleRunner: loads the rules and applies them to every Segment.
- DruidCoordinatorCleanupUnneeded: removes Segments that are no longer needed, i.e., Segments that are not present in the metadata manager.
- DruidCoordinatorBalancer: periodically rebalances the Segment distribution, moving some Segments to even out the load.
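All of these tasks implement the same DruidCoordinatorHelper interface and run in order, each receiving the runtime params and handing back params (possibly rebuilt) for the next helper in the chain. The class below is a hypothetical example of that pattern, not part of Druid; the package paths and the getAvailableSegments() accessor follow the io.druid 0.x codebase and may differ in other versions.

```java
import com.metamx.common.logger.Logger;

import io.druid.server.coordinator.DruidCoordinatorRuntimeParams;
import io.druid.server.coordinator.helper.DruidCoordinatorHelper;

// Hypothetical helper: counts the segments visible to this coordination run.
public class DruidCoordinatorSegmentCounter implements DruidCoordinatorHelper
{
  private static final Logger log = new Logger(DruidCoordinatorSegmentCounter.class);

  @Override
  public DruidCoordinatorRuntimeParams run(DruidCoordinatorRuntimeParams params)
  {
    // Each helper receives the runtime params, does its own piece of work, and returns
    // params (possibly rebuilt via buildFromExisting()) for the next helper in the chain.
    log.info("Coordinator sees [%,d] available segments this run.", params.getAvailableSegments().size());
    return params;
  }
}
```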
The idea behind the Balancer is to spread Segments that are likely to be covered by the same query across different Historical nodes, so that the whole cluster's capacity is used and heavy query load does not concentrate on a few machines.
Druid's load-balancing algorithm is implemented in the class CostBalancerStrategy.
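The gist of a cost-based strategy can be sketched as follows. This is a simplified illustration rather than CostBalancerStrategy's actual formula and constants: the only idea it captures is that two Segments close together in time are likely to be hit by the same query, so placing them on the same Historical node is made expensive, which pushes them apart.

```java
import org.joda.time.Interval;

// Simplified sketch of a cost-based placement idea (NOT the exact formula used by
// CostBalancerStrategy): the closer two segments are in time, the more likely a single
// query touches both, so co-locating them on one Historical node is made "expensive".
public class CostSketch
{
  // Hypothetical decay constant: cost halves roughly every day of gap between segments.
  private static final double LAMBDA = Math.log(2) / (24 * 3600 * 1000.0);

  static double jointCost(Interval a, Interval b)
  {
    long gapMillis = a.overlaps(b)
                     ? 0
                     : Math.min(
                         Math.abs(a.getStartMillis() - b.getEndMillis()),
                         Math.abs(b.getStartMillis() - a.getEndMillis())
                     );
    // Overlapping or adjacent segments cost the most; cost decays exponentially with the gap.
    return Math.exp(-LAMBDA * gapMillis);
  }

  // Cost of placing a candidate segment on a server = sum of joint costs against the
  // segments already there; the balancer would pick the server with the lowest total cost.
  static double placementCost(Interval candidate, Iterable<Interval> segmentsOnServer)
  {
    double total = 0;
    for (Interval existing : segmentsOnServer) {
      total += jointCost(candidate, existing);
    }
    return total;
  }
}
```

With a cost like this, the balancer can compare the total placement cost of a candidate Segment across servers and move Segments away from heavily loaded servers toward the cheapest placement.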