Kafka Startup Process

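1. When a broker starts, it tries to create the ephemeral node /controller in ZooKeeper. Whichever broker creates the node first becomes the leader of all brokers (the controller) and writes its own information into the node, as follows: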

[zk: 10.3.63.204:2181,10.3.63.205:2181(CONNECTED) 3] get /controller
{"version":1,"brokerid":0,"timestamp":"1407310302044"}
cZxid = 0x700000592
ctime = Wed Aug 06 15:32:01 CST 2014
mZxid = 0x700000592
mtime = Wed Aug 06 15:32:01 CST 2014
pZxid = 0x700000592
cversion = 0
dataVersion = 0
aclVersion = 0
ephemeralOwner = 0x147aa389edd0001
dataLength = 54
numChildren = 0

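Every broker starts a KafkaController component, but only one of them is the leader. The controller is mainly responsible for tasks such as deleting topics that are no longer needed and electing the leader of a topic's partitions.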
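The first-writer-wins election on /controller can be sketched in Scala. This is only a minimal sketch: an in-memory map is a hypothetical stand-in for ZooKeeper, and ControllerElectionSketch, tryBecomeController and sessionExpired are illustrative names, not Kafka or ZooKeeper APIs.

```scala
import scala.collection.concurrent.TrieMap

// Hypothetical in-memory stand-in for ZooKeeper; not the real client API.
object ControllerElectionSketch {
  private val nodes = TrieMap.empty[String, String]

  // Tries to create /controller; returns true if this broker won the election.
  def tryBecomeController(brokerId: Int, nowMs: Long): Boolean = {
    val payload = s"""{"version":1,"brokerid":$brokerId,"timestamp":"$nowMs"}"""
    // putIfAbsent returns None only for the first writer ("create if absent").
    nodes.putIfAbsent("/controller", payload).isEmpty
  }

  // Reads the payload of the current controller node, if it exists.
  def currentController: Option[String] = nodes.get("/controller")

  // Simulates the ephemeral node disappearing when the owner's session ends.
  def sessionExpired(): Unit = {
    nodes.remove("/controller")
    ()
  }
}
```

In this sketch a second broker's attempt fails while the node exists and can succeed only after the node has been removed, which mirrors how an ephemeral node goes away with its owner's session.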

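2. On shutdown, KafkaServer's shutdown method is called, and it first attempts a controlled shutdown. The call is:

CoreUtils.swallow(controlledShutdown())
The logic reads the id of the leader (the controller) from the /controller node in ZooKeeper, reads that broker's information from /brokers/ids/<id>, and then sends a ControlledShutdownRequest to it. Only after a successful response has been read is shutdownSucceeded set.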
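The retry-until-success behavior described above can be sketched as follows. This is a simplified sketch, not the KafkaServer code: the send function is a hypothetical stand-in for the network call to the controller, and maxRetries is an illustrative parameter.

```scala
object ControlledShutdownSketch {
  // Hypothetical stand-in for sending a ControlledShutdownRequest to the
  // controller: returns true when a successful response was read.
  type SendRequest = Int => Boolean

  // Retries until a successful response is read or the attempts run out.
  def controlledShutdown(brokerId: Int, maxRetries: Int, send: SendRequest): Boolean = {
    var remaining = maxRetries
    var shutdownSucceeded = false
    while (!shutdownSucceeded && remaining > 0) {
      remaining -= 1
      // An I/O failure (for example a socket timeout) counts as a failed attempt.
      shutdownSucceeded =
        try send(brokerId)
        catch { case _: java.io.IOException => false }
    }
    shutdownSucceeded
  }
}
```

If every attempt fails, the function returns false and the caller would have to continue with an ordinary, non-controlled shutdown.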

3. The request itself is handled in KafkaApis. The code is as follows:

  def handleControlledShutdownRequest(request: RequestChannel.Request) {
    // ensureTopicExists is only for client facing requests
    // We can't have the ensureTopicExists check here since the controller sends it as an advisory to all brokers so they
    // stop serving data to clients for the topic being deleted
    val controlledShutdownRequest = request.requestObj.asInstanceOf[ControlledShutdownRequest]
    val partitionsRemaining = controller.shutdownBroker(controlledShutdownRequest.brokerId)
    val controlledShutdownResponse = new ControlledShutdownResponse(controlledShutdownRequest.correlationId,
      ErrorMapping.NoError, partitionsRemaining)
    requestChannel.sendResponse(new Response(request, new BoundedByteBufferSend(controlledShutdownResponse)))
  }

As the code shows, the request is sent to the leader (the controller), which calls KafkaController's def shutdownBroker(id: Int). This method loops over the topic partitions and checks whether the broker being shut down is the leader of each partition. If it is:

// If the broker leads the topic partition, transition the leader and update isr. Updates zk and
// notifies all affected brokers

If it is not:

// Stop the replica first. The state change below initiates ZK changes which should take some time
// before which the stop replica request should be completed (in most cases)
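The per-partition decision can be sketched as below. This is an illustration under stated assumptions, not the KafkaController implementation: Partition, MoveLeader, StopReplica and plan are hypothetical names, and choosing the first other in-sync replica as the new leader is an assumption made for the example.

```scala
object ShutdownBrokerSketch {
  // Hypothetical simplified partition record; not Kafka's real data model.
  final case class Partition(topic: String, id: Int, leader: Int, isr: List[Int])

  sealed trait Action
  final case class MoveLeader(partition: Partition, newLeader: Int) extends Action
  final case class StopReplica(partition: Partition) extends Action

  // Decides, for each partition that has brokerId in its ISR, what to do.
  def plan(brokerId: Int, partitions: Seq[Partition]): Seq[Action] =
    partitions.filter(_.isr.contains(brokerId)).flatMap { p =>
      val actions: List[Action] =
        if (p.leader == brokerId)
          // The broker leads this partition: hand leadership to another
          // in-sync replica (assumption: the first one that is not brokerId).
          p.isr.filter(_ != brokerId).take(1).map(n => MoveLeader(p, n))
        else
          // The broker is only a follower: stop its replica.
          List(StopReplica(p))
      actions
    }
}
```

A partition led by the broker with no other in-sync replica produces no action in this sketch. That loosely corresponds to the partitionsRemaining value returned in the response above.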

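The related problem: if the controlled shutdown takes long enough, the request times out and the shutdown request is sent again. Because of the lock, the retried request times out as well. The broker then ends up shutting down uncleanly, and a recovery (rollback) step runs when it restarts, although this does not affect actual use.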

https://issues.apache.org/jira/browse/KAFKA-1342

Analysis of the startup sequence:

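1. Set the state to Starting

2. kafkaScheduler.startup - mainly for background tasks that need to run periodically

3. initZk - initializes the connection to ZooKeeper

4. logManager.startup - uses the scheduler above to run three tasks periodically: kafka-log-retention, kafka-log-flusher and kafka-recovery-point-checkpoint. If cleaning is configured, the cleaner is started as well

5. socketServer.startup - an NIO server with the following thread model: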

1 Acceptor thread that handles new connections

N Processor threads that each have their own selector and read requests from sockets

M Handler threads that handle requests and produce responses back to the processor threads for writing.

6. replicaManager.startup - mainly uses the scheduler to start a thread that periodically runs the maybeShrinkIsr method

7. createOffsetManager - uses the scheduler to start a thread that periodically runs the compact method

8. kafkaController.startup - registers a listener for ZooKeeper session expiration and competes for leadership. If this broker becomes the leader, KafkaController's onControllerFailover method is called back

9. consumerCoordinator.startup - the Kafka coordinator handles consumer group and consumer offset management

10. start processing requests (requestHandlerPool, KafkaApis) - KafkaRequestHandlerPool starts the request-handling threads, and each thread ultimately calls into KafkaApis

11. Set the state to RunningAsBroker

12. topicConfigManager.startup - mainly watches /config/changes and then processes the given list of config changes

13. tell everyone we are alive - KafkaHealthcheck.startup, which mainly keeps the heartbeat connection with ZooKeeper

14. register broker metrics - mainly statistics
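The hand-off between the processor threads (step 5) and the handler threads (step 10) can be sketched with plain queues. This is a simplified model, not the SocketServer code: sockets, selectors and the acceptor are left out, the "handling" is just upper-casing a string, and ThreadModelSketch, Request, Response and simulate are illustrative names.

```scala
import java.util.concurrent.{Executors, LinkedBlockingQueue, TimeUnit}

object ThreadModelSketch {
  final case class Request(processorId: Int, payload: String)
  final case class Response(processorId: Int, payload: String)

  // Requests go through one shared queue to the handler threads; each response
  // is routed back to the queue of the processor that read the request.
  def simulate(requests: Seq[Request], numProcessors: Int, numHandlers: Int): Vector[Set[String]] = {
    val requestQueue = new LinkedBlockingQueue[Request]()
    val responseQueues = Vector.fill(numProcessors)(new LinkedBlockingQueue[Response]())
    requests.foreach(r => requestQueue.put(r))

    val handlers = Executors.newFixedThreadPool(numHandlers)
    (1 to numHandlers).foreach { _ =>
      handlers.execute(new Runnable {
        def run(): Unit = {
          // Each handler drains the shared queue until it is empty.
          var next = requestQueue.poll()
          while (next != null) {
            val response = Response(next.processorId, next.payload.toUpperCase)
            responseQueues(next.processorId).put(response)
            next = requestQueue.poll()
          }
        }
      })
    }
    handlers.shutdown()
    handlers.awaitTermination(5, TimeUnit.SECONDS)

    // Collect what each processor would write back to its sockets.
    responseQueues.map { q =>
      Iterator.continually(q.poll()).takeWhile(_ != null).map(_.payload).toSet
    }
  }
}
```

The results are collected as sets because the order in which handler threads finish is not deterministic.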

Broker States

A broker can be in one of the following states:

case object NotRunning extends BrokerStates { val state: Byte = 0 }
case object Starting extends BrokerStates { val state: Byte = 1 }
case object RecoveringFromUncleanShutdown extends BrokerStates { val state: Byte = 2 }
case object RunningAsBroker extends BrokerStates { val state: Byte = 3 }
case object RunningAsController extends BrokerStates { val state: Byte = 4 }
case object PendingControlledShutdown extends BrokerStates { val state: Byte = 6 }
case object BrokerShuttingDown extends BrokerStates { val state: Byte = 7 }

The transitions between the states are shown in the following diagram:

/**
 * Broker states are the possible state that a kafka broker can be in.
 * A broker should be only in one state at a time.
 * The expected state transition with the following defined states is:
 *
 *                +-----------+
 *                |Not Running|
 *                +-----+-----+
 *                      |
 *                      v
 *                +-----+-----+
 *                |Starting   +--+
 *                +-----+-----+  | +----+------------+
 *                      |        +>+RecoveringFrom   |
 *                      v          |UncleanShutdown  |
 * +----------+     +-----+-----+  +-------+---------+
 * |RunningAs |     |RunningAs  |            |
 * |Controller+<--->+Broker     +<-----------+
 * +----------+     +-----+-----+
 *        |              |
 *        |              v
 *        |       +-----+------------+
 *        |-----> |PendingControlled |
 *                |Shutdown          |
 *                +-----+------------+
 *                      |
 *                      v
 *               +-----+----------+
 *               |BrokerShutting  |
 *               |Down            |
 *               +-----+----------+
 *                     |
 *                     v
 *               +-----+-----+
 *               |Not Running|
 *               +-----------+
 *
 * Custom states is also allowed for cases where there are custom kafka states for different scenarios.
 */
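The arrows in the diagram can also be written down as data and checked. The following is a sketch only: the edge set is read off the diagram above, while BrokerStateSketch, canTransition and isValidPath are hypothetical helpers that do not exist in Kafka. The state objects are repeated so that the block is self-contained.

```scala
object BrokerStateSketch {
  sealed trait BrokerStates { def state: Byte }
  case object NotRunning extends BrokerStates { val state: Byte = 0 }
  case object Starting extends BrokerStates { val state: Byte = 1 }
  case object RecoveringFromUncleanShutdown extends BrokerStates { val state: Byte = 2 }
  case object RunningAsBroker extends BrokerStates { val state: Byte = 3 }
  case object RunningAsController extends BrokerStates { val state: Byte = 4 }
  case object PendingControlledShutdown extends BrokerStates { val state: Byte = 6 }
  case object BrokerShuttingDown extends BrokerStates { val state: Byte = 7 }

  // One (from, to) pair per arrow in the diagram.
  val edges: Set[(BrokerStates, BrokerStates)] = Set[(BrokerStates, BrokerStates)](
    (NotRunning, Starting),
    (Starting, RecoveringFromUncleanShutdown),
    (Starting, RunningAsBroker),
    (RecoveringFromUncleanShutdown, RunningAsBroker),
    (RunningAsBroker, RunningAsController),
    (RunningAsController, RunningAsBroker),
    (RunningAsBroker, PendingControlledShutdown),
    (RunningAsController, PendingControlledShutdown),
    (PendingControlledShutdown, BrokerShuttingDown),
    (BrokerShuttingDown, NotRunning)
  )

  // True if the diagram has an arrow from `from` to `to`.
  def canTransition(from: BrokerStates, to: BrokerStates): Boolean =
    edges.contains((from, to))

  // True if every consecutive pair in the path is an allowed transition.
  def isValidPath(path: Seq[BrokerStates]): Boolean =
    path.zip(path.drop(1)).forall { case (a, b) => canTransition(a, b) }
}
```

Custom states, which the source comment explicitly allows, are outside the scope of this sketch.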

Reposted from liondinas.iteye.com/blog/2210587