Kafka Source Code Analysis 03: How the Producer Sends Messages

The previous installment introduced the producer's buffer; in this one we study how the producer actually sends messages. This work is mainly done by the Sender thread class.

An Overview of the Send Flow

(Figure: overall architecture of how the Sender sends messages; diagram omitted)

Sending the request

The Sender pulls the batches to be sent from the RecordAccumulator, wraps them into a client request (ClientRequest), and hands the ClientRequest object to the NetworkClient. The next step is the real network send: the Sender drives the Selector's poll() method, which actually writes the request out to the broker node.

Receiving the response

Next, the Selector receives the broker's response, the Sender locates the request that the response answers, and then invokes the callbacks held inside the ProducerBatch, completing the whole send/response cycle.
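Putting the two halves together: the Sender runs in a loop, and each iteration first queues up requests and then performs the network I/O. Below is a condensed sketch of Sender.runOnce() (simplified; the transaction-manager handling the real method performs is omitted):

void runOnce() {
    long currentTimeMs = time.milliseconds();
    // Drain ready batches, build produce requests, and queue them on the channels
    // (this is the sendProducerData() method analyzed below).
    long pollTimeout = sendProducerData(currentTimeMs);
    // Perform the real network I/O: write queued requests, read responses, and
    // invoke the completion handlers (handleProduceResponse() below).
    client.poll(pollTimeout, currentTimeMs);
}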

A more detailed sequence diagram

This process is fairly involved, so let's walk through it once more with the help of a sequence diagram:

(Figure: detailed sequence diagram of the send flow; diagram omitted)

Next, let's move into the source-code walkthrough.

Source Code Walkthrough

The method below drives the main flow of sending messages:

sendProducerData()


private long sendProducerData(long now) {
    // 1. Fetch the cluster metadata from the local cache.
    Cluster cluster = metadata.fetch();
    // 2. Determine which nodes have data that is ready to be sent.
    RecordAccumulator.ReadyCheckResult result = this.accumulator.ready(cluster, now);
    // 3. If any topic has a partition whose leader node is unknown, flag the underlying network layer (NetworkClient) so that it refreshes the metadata.
    if (!result.unknownLeaderTopics.isEmpty()) {
        for (String topic : result.unknownLeaderTopics)
            this.metadata.add(topic, now);
        log.debug("Requesting metadata update due to unknown leader topics from the batched records: {}",
            result.unknownLeaderTopics);
        this.metadata.requestUpdate();
    }
    // 4. On top of the node set returned in result, further check that the client's connection to each node is usable.
    Iterator<Node> iter = result.readyNodes.iterator();
    long notReadyTimeout = Long.MAX_VALUE;
    while (iter.hasNext()) {
        Node node = iter.next();
        // Check whether the connection to this node is ready and whether data can be sent to it.
        if (!this.client.ready(node, now)) {
            iter.remove();
            notReadyTimeout = Math.min(notReadyTimeout, this.client.pollDelayMs(node, now));
        }
    }

    // 5. Regroup the batches to be sent into a per-node collection.
    Map<Integer, List<ProducerBatch>> batches = this.accumulator.drain(cluster, result.readyNodes, this.maxRequestSize, now);
    addToInflightBatches(batches);
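    // When strict ordering is required (max.in.flight.requests.per.connection = 1),
    // mute each partition that now has a batch in flight so that no further batch
    // is drained for it until the response comes back.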
    if (guaranteeMessageOrder) {
        for (List<ProducerBatch> batchList : batches.values()) {
            for (ProducerBatch batch : batchList)
                this.accumulator.mutePartition(batch.topicPartition);
        }
    }
    accumulator.resetNextBatchExpiryTime();
    // 6. Collect the expired batches; the corresponding ProducerBatches must be removed from the accumulator.
    // Expired batches from the Sender's own inflightBatches collection:
    List<ProducerBatch> expiredInflightBatches = getExpiredInflightBatches(now);
    // Expired batches from the accumulator's batches collection:
    List<ProducerBatch> expiredBatches = this.accumulator.expiredBatches(now);
    expiredBatches.addAll(expiredInflightBatches);
    if (!expiredBatches.isEmpty())
        log.trace("Expired {} batches in accumulator", expiredBatches.size());
    // 7. Fail the expired batches so their callbacks are returned to the client promptly.
    for (ProducerBatch expiredBatch : expiredBatches) {
        String errorMessage = "Expiring " + expiredBatch.recordCount + " record(s) for " + expiredBatch.topicPartition
            + ":" + (now - expiredBatch.createdMs) + " ms has passed since batch creation";
        failBatch(expiredBatch, -1, NO_TIMESTAMP, new TimeoutException(errorMessage), false);
        if (transactionManager != null && expiredBatch.inRetry()) {
            transactionManager.markSequenceUnresolved(expiredBatch);
        }
    }
    sensors.updateProduceRequestMetrics(batches);
    // Compute pollTimeout: the longest the upcoming poll() may block before a batch could become ready or expire.
    long pollTimeout = Math.min(result.nextReadyCheckDelayMs, notReadyTimeout);
    pollTimeout = Math.min(pollTimeout, this.accumulator.nextExpiryTimeMs() - now);
    pollTimeout = Math.max(pollTimeout, 0);
    if (!result.readyNodes.isEmpty()) {
        log.trace("Nodes with data ready to send: {}", result.readyNodes);
        pollTimeout = 0;
    }
    // 8. Send the messages.
    sendProduceRequests(batches, now);
    return pollTimeout;
}
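A note on step 2: accumulator.ready() decides which nodes currently have sendable data. Condensed, the per-batch test inside RecordAccumulator.ready() amounts to the following (a simplified sketch, not the verbatim method):

// A node is "ready" when, for some partition it leads, the first batch in the
// partition's deque meets at least one of these conditions.
long timeToWaitMs = backingOff ? retryBackoffMs : lingerMs;
boolean expired = waitedTimeMs >= timeToWaitMs;     // linger.ms (or the retry backoff) has elapsed
boolean full = deque.size() > 1 || batch.isFull();  // the batch is full
boolean exhausted = this.free.queued() > 0;         // threads are blocked waiting for buffer memory
boolean sendable = full || expired || exhausted || closed || flushInProgress();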

Next, the details of actually sending the request:

sendProduceRequest()

/**
 * Converts the ProducerBatches into a ClientRequest and places the clientRequest
 * into the KafkaChannel's send buffer.
 * The batches bound for one node are reshaped into a map whose keys are partitions
 * and whose values are the records, and the whole map is sent to that node.
 */

private void sendProduceRequest(long now, int destination, short acks, int timeout, List<ProducerBatch> batches) {
    if (batches.isEmpty())
        return;
    // 1. Initialize two maps: produceRecordsByPartition is used to build the request, recordsByPartition to build the callback.
    // The ProducerBatch kept for the callback carries far more than the MemoryRecords used purely for sending: besides the MemoryRecords it also holds the callbacks and other bookkeeping data.
    Map<TopicPartition, MemoryRecords> produceRecordsByPartition = new HashMap<>(batches.size());
    final Map<TopicPartition, ProducerBatch> recordsByPartition = new HashMap<>(batches.size());

    byte minUsedMagic = apiVersions.maxUsableProduceMagic();
    for (ProducerBatch batch : batches) {
        if (batch.magic() < minUsedMagic)
            minUsedMagic = batch.magic();
    }
    // 2. Populate produceRecordsByPartition and recordsByPartition, partition by partition.
    for (ProducerBatch batch : batches) {
        TopicPartition tp = batch.topicPartition;
        // 2.1 Take the MemoryRecords out of the ProducerBatch. The recordsBuilder has finished building the batch,
        // so records() also closes it; an important part of closing is computing the size of the batch header and writing the header out.
        MemoryRecords records = batch.records();
        if (!records.hasMatchingMagic(minUsedMagic))
            records = batch.records().downConvert(minUsedMagic, 0, time).records();
        produceRecordsByPartition.put(tp, records);
        recordsByPartition.put(tp, batch);
    }
    String transactionalId = null;
    if (transactionManager != null && transactionManager.isTransactional()) {
        transactionalId = transactionManager.transactionalId();
    }
    // 3. Create the requestBuilder object.
    ProduceRequest.Builder requestBuilder = ProduceRequest.Builder.forMagic(minUsedMagic, acks, timeout,
            produceRecordsByPartition, transactionalId);
    // 4. Build the completion callback.
    RequestCompletionHandler callback = response -> handleProduceResponse(response, recordsByPartition, time.milliseconds());
    String nodeId = Integer.toString(destination);
    // 5. Create the clientRequest that carries the <partition, records> map to the target node.
    ClientRequest clientRequest = client.newClientRequest(nodeId, requestBuilder, now, acks != 0,
            requestTimeoutMs, callback);
    // 6. Hand the clientRequest to the NetworkClient, completing the "pre-send" (the request is queued on the channel, not yet on the wire).
    client.send(clientRequest, now);
    log.trace("Sent produce request to {}: {}", nodeId, requestBuilder);
}
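Note why recordsByPartition keeps the whole ProducerBatch rather than just the records: the batch carries the callbacks that users registered at send time, and it is these callbacks that the response handling below ultimately completes. Seen from the caller's side, the chain starts like this (the topic name here is just a placeholder):

// The Callback passed to send() is stored in a ProducerBatch and fires once
// handleProduceResponse()/completeBatch() finish processing the response.
producer.send(new ProducerRecord<>("my-topic", "key", "value"), (metadata, exception) -> {
    if (exception != null)
        exception.printStackTrace();  // e.g. the TimeoutException produced in step 7 above
    else
        System.out.printf("partition=%d offset=%d%n", metadata.partition(), metadata.offset());
});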

Handling the response sent back by the broker:

handleProduceResponse()

/**
 * Handle a produce response from the broker.
 */
private void handleProduceResponse(ClientResponse response, Map<TopicPartition, ProducerBatch> batches, long now) {
    RequestHeader requestHeader = response.requestHeader();
    int correlationId = requestHeader.correlationId();
    // Error handling: the connection to the node was dropped.
    if (response.wasDisconnected()) {
        log.trace("Cancelled request with header {} due to node {} being disconnected",
            requestHeader, response.destination());
        for (ProducerBatch batch : batches.values())
            completeBatch(batch, new ProduceResponse.PartitionResponse(Errors.NETWORK_EXCEPTION), correlationId, now);
    } else if (response.versionMismatch() != null) {
        log.warn("Cancelled request {} due to a version mismatch with node {}",
                response, response.destination(), response.versionMismatch());
        for (ProducerBatch batch : batches.values())
            completeBatch(batch, new ProduceResponse.PartitionResponse(Errors.UNSUPPORTED_VERSION), correlationId, now);
        // Other (non-error) responses:
    } else {
        log.trace("Received produce response from node {} with correlation id {}", response.destination(), correlationId);
        if (response.hasResponse()) {
            ProduceResponse produceResponse = (ProduceResponse) response.responseBody();
            for (Map.Entry<TopicPartition, ProduceResponse.PartitionResponse> entry : produceResponse.responses().entrySet()) {
                TopicPartition tp = entry.getKey();
                ProduceResponse.PartitionResponse partResp = entry.getValue();
                ProducerBatch batch = batches.get(tp);
                // Delegate the handling to completeBatch().
                completeBatch(batch, partResp, correlationId, now);
            }
            this.sensors.recordLatency(response.destination(), response.requestLatencyMs());
        } else {
            for (ProducerBatch batch : batches.values()) {
                completeBatch(batch, new ProduceResponse.PartitionResponse(Errors.NONE), correlationId, now);
            }
        }
    }
}
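How does a response find its request? Every request header carries a correlation id, and the NetworkClient keeps each node's unanswered requests in order, matching every incoming response against the oldest in-flight request. Conceptually (a sketch of the idea, not Kafka's actual code):

// Responses on a connection arrive in request order, so the oldest unanswered
// request must be the one this response answers.
Deque<InFlightRequest> queue = inFlightRequests.get(nodeId);
InFlightRequest oldest = queue.pollLast();
if (oldest.correlationId != response.correlationId)
    throw new IllegalStateException("Correlation id mismatch");
oldest.callback.onComplete(response);  // this is what ends up calling handleProduceResponse()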

Acting on each partition's returned result:

completeBatch()

private void completeBatch(ProducerBatch batch, ProduceResponse.PartitionResponse response, long correlationId, long now) {
    Errors error = response.error;
    // Handle error responses first.
    // An over-sized batch that holds more than one record can be split into smaller batches and retried.
    if (error == Errors.MESSAGE_TOO_LARGE && batch.recordCount > 1 && !batch.isDone() &&
            (batch.magic() >= RecordBatch.MAGIC_VALUE_V2 || batch.isCompressed())) {
        log.warn(
            "Got error produce response in correlation id {} on topic-partition {}, splitting and retrying ({} attempts left). Error: {}",
            correlationId,
            batch.topicPartition,
            this.retries - batch.attempts(),
            error);
        if (transactionManager != null)
            transactionManager.removeInFlightBatch(batch);
        this.accumulator.splitAndReenqueue(batch);
        maybeRemoveAndDeallocateBatch(batch);
        this.sensors.recordBatchSplit();
    } else if (error != Errors.NONE) {
        // Can the batch be sent again?
        if (canRetry(batch, response, now)) {
            log.warn(
                "Got error produce response with correlation id {} on topic-partition {}, retrying ({} attempts left). Error: {}",
                correlationId,
                batch.topicPartition,
                this.retries - batch.attempts() - 1,
                error);
            // Re-enqueue the batch for another attempt.
            reenqueueBatch(batch, now);
            // A duplicate sequence number means the broker already persisted this batch; treat it as success.
        } else if (error == Errors.DUPLICATE_SEQUENCE_NUMBER) {
            completeBatch(batch, response);
        } else {
            final RuntimeException exception;
            if (error == Errors.TOPIC_AUTHORIZATION_FAILED)
                exception = new TopicAuthorizationException(Collections.singleton(batch.topicPartition.topic()));
            else if (error == Errors.CLUSTER_AUTHORIZATION_FAILED)
                exception = new ClusterAuthorizationException("The producer is not authorized to do idempotent sends");
            else
                exception = error.exception(response.errorMessage);
            failBatch(batch, response, exception, batch.attempts() < this.retries);
        }
        if (error.exception() instanceof InvalidMetadataException) {
            if (error.exception() instanceof UnknownTopicOrPartitionException) {
                log.warn("Received unknown topic or partition error in produce request on partition {}. The " +
                        "topic-partition may not exist or the user may not have Describe access to it",
                    batch.topicPartition);
            } else {
                log.warn("Received invalid metadata error in produce request on partition {} due to {}. Going " +
                        "to request metadata update now", batch.topicPartition, error.exception(response.errorMessage).toString());
            }
            metadata.requestUpdate();
        }
    } else {
        // A non-error response: execute the callbacks normally.
        completeBatch(batch, response);
    }
    if (guaranteeMessageOrder)
        this.accumulator.unmutePartition(batch.topicPartition);
}
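The MESSAGE_TOO_LARGE branch deserves a closer look: splitting only makes sense when the batch holds more than one record, which is exactly what the batch.recordCount > 1 condition guards. What splitAndReenqueue does, conceptually (a simplified sketch of RecordAccumulator.splitAndReenqueue; the sequence-number ordering needed for idempotent producers is ignored here):

// Split the oversized batch into pieces no larger than the batch.size config,
// then push the pieces back onto the head of the partition's deque so they
// are retried before any newer batches.
Deque<ProducerBatch> pieces = bigBatch.split(this.batchSize);
Deque<ProducerBatch> partitionDeque = getOrCreateDeque(bigBatch.topicPartition);
while (!pieces.isEmpty())
    partitionDeque.addFirst(pieces.pollLast());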

The final handling: complete the batch and remove the producerBatch from the buffer.

completeBatch()

private void completeBatch(ProducerBatch batch, ProduceResponse.PartitionResponse response) {
    if (transactionManager != null) {
        transactionManager.handleCompletedBatch(batch, response);
    }
    // Run the callbacks and release the batch's space back to the accumulator.
    if (batch.done(response.baseOffset, response.logAppendTime, null)) {
        maybeRemoveAndDeallocateBatch(batch);
    }
}
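batch.done() is where the user-level callbacks finally fire: the batch walks its list of thunks, one per record, and invokes each callback with the record's metadata. Roughly (a simplified sketch of ProducerBatch's completion logic, success path only):

// Each Thunk pairs one record's future with the Callback passed to send().
for (Thunk thunk : thunks) {
    RecordMetadata metadata = thunk.future.value();  // baseOffset from the response plus the record's relative offset
    thunk.callback.onCompletion(metadata, null);     // null exception on the success path
}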

The logic that decides whether a retry is possible:

canRetry()

/**
 * Can the batch be retried? All of the following must hold:
 * 1. The delivery timeout has not been reached.
 * 2. The batch's attempt count has not exceeded the configured retries.
 * 3. The batch is not already done.
 * 4. Outside a transaction, the response's error must be a RetriableException;
 *    within a transaction, the transaction manager's own check decides.
 */

private boolean canRetry(ProducerBatch batch, ProduceResponse.PartitionResponse response, long now) {
    return !batch.hasReachedDeliveryTimeout(accumulator.getDeliveryTimeoutMs(), now) &&
        batch.attempts() < this.retries &&
        !batch.isDone() &&
        (transactionManager == null ?
                response.error.exception() instanceof RetriableException :
                transactionManager.canRetry(response, batch));
}
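The checks above map directly onto producer configuration. A minimal illustration of how to set the relevant knobs (the bootstrap address is a placeholder):

Properties props = new Properties();
props.put("bootstrap.servers", "localhost:9092");
props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");
// Check 1: the overall per-batch deadline tested by hasReachedDeliveryTimeout().
props.put("delivery.timeout.ms", "120000");
// Check 2: the attempt limit compared against batch.attempts().
props.put("retries", Integer.toString(Integer.MAX_VALUE));
// The pause between re-enqueued attempts.
props.put("retry.backoff.ms", "100");
KafkaProducer<String, String> producer = new KafkaProducer<>(props);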

Closing Words

I have published a booklet on Juejin that analyzes Kafka at the source-code level.

You are welcome to support the author's booklet: 《Kafka 源码精讲》

Reposted from juejin.im/post/7109101911877353486