5. Kafka Series: Design Philosophy (Part 3) - Producers and Consumers

In this article we walk through the official Kafka documentation's design discussion of producers and consumers.

4.4 The Producer

Load balancing

The producer sends data directly to the broker that is the leader for the partition, without any intervening routing tier. To help the producer do this, all Kafka nodes can answer a request for metadata about which servers are alive and where the leaders for the partitions of a topic are at any given time, allowing the producer to appropriately direct its requests.

The client controls which partition it publishes messages to. This can be done at random, implementing a kind of random load balancing, or it can be done by some semantic partitioning function. We expose the interface for semantic partitioning by allowing the user to specify a key to partition by and using this to hash to a partition (there is also an option to override the partition function if need be). For example, if the key chosen was a user id then all data for a given user would be sent to the same partition. This in turn will allow consumers to make locality assumptions about their consumption. This style of partitioning is explicitly designed to allow locality-sensitive processing in consumers.
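To make the keyed-partitioning path concrete, here is a minimal sketch using the standard Java producer client; the broker address, topic name, and user-id key are made-up placeholders. The default partitioner hashes the record key to choose a partition, and the partition function can be overridden, as the text notes.

```java
import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.StringSerializer;

public class KeyedProducerExample {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // placeholder broker
        props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
        props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());

        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            // The default partitioner hashes the record key, so every record
            // keyed by "user-42" lands in the same partition of "user-events".
            producer.send(new ProducerRecord<>("user-events", "user-42", "page_view:/home"));
        }
        // A custom strategy would implement
        // org.apache.kafka.clients.producer.Partitioner and register it via
        // ProducerConfig.PARTITIONER_CLASS_CONFIG.
    }
}
```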

Asynchronous send

Batching is one of the big drivers of efficiency, and to enable batching the Kafka producer will attempt to accumulate data in memory and to send out larger batches in a single request. The batching can be configured to accumulate no more than a fixed number of messages and to wait no longer than some fixed latency bound (say 64k or 10 ms). This allows the accumulation of more bytes to send, and fewer, larger I/O operations on the servers. This buffering is configurable and gives a mechanism to trade off a small amount of additional latency for better throughput.
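In the Java producer these two bounds are the batch.size and linger.ms settings. A sketch extending the producer properties from the example above; the 64 KB and 10 ms values mirror the figures quoted in the text and are illustrative, not recommendations:

```java
// Size bound: try to fill per-partition batches up to 64 KB before sending.
props.put(ProducerConfig.BATCH_SIZE_CONFIG, 65536);
// Latency bound: wait up to 10 ms for more records to join a batch.
props.put(ProducerConfig.LINGER_MS_CONFIG, 10);
```

Raising linger.ms trades a small amount of latency for larger batches and fewer requests; at the default of 0, records are sent as soon as the sender is able.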

4.5 The Consumer

The Kafka consumer works by issuing “fetch” requests to the brokers leading the partitions it wants to consume. The consumer specifies its offset in the log with each request and receives back a chunk of log beginning from that position. The consumer thus has significant control over this position and can rewind it to re-consume data if need be.
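A minimal sketch of this position control with the Java consumer client: the consumer takes a specific partition, seeks to an explicit offset, and fetches a chunk of log from there. The broker address, topic, partition number, and starting offset are made-up placeholders; the same seek() call is what enables the deliberate rewind discussed under “Consumer Position” below.

```java
import java.time.Duration;
import java.util.Collections;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.TopicPartition;
import org.apache.kafka.common.serialization.StringDeserializer;

public class FetchFromOffsetExample {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // placeholder broker
        props.put(ConsumerConfig.GROUP_ID_CONFIG, "example-group");           // placeholder group
        props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
        props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            TopicPartition tp = new TopicPartition("user-events", 0);
            consumer.assign(Collections.singletonList(tp));
            consumer.seek(tp, 100L); // start (or rewind) consumption at offset 100

            // Each fetch returns a chunk of log beginning at the current position.
            ConsumerRecords<String, String> records = consumer.poll(Duration.ofMillis(500));
            for (ConsumerRecord<String, String> record : records) {
                System.out.printf("offset=%d value=%s%n", record.offset(), record.value());
            }
        }
    }
}
```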

Push vs. pull

An initial question we considered is whether consumers should pull data from brokers or brokers should push data to the consumer. In this respect Kafka follows a more traditional design, shared by most messaging systems, where data is pushed to the broker from the producer and pulled from the broker by the consumer. Some logging-centric systems, such as Scribe and Apache Flume, follow a very different push-based path where data is pushed downstream. There are pros and cons to both approaches. However, a push-based system has difficulty dealing with diverse consumers as the broker controls the rate at which data is transferred. The goal is generally for the consumer to be able to consume at the maximum possible rate; unfortunately, in a push system this means the consumer tends to be overwhelmed when its rate of consumption falls below the rate of production (a denial of service attack, in essence). A pull-based system has the nicer property that the consumer simply falls behind and catches up when it can. This can be mitigated with some kind of backoff protocol by which the consumer can indicate it is overwhelmed, but getting the rate of transfer to fully utilize (but never over-utilize) the consumer is trickier than it seems. Previous attempts at building systems in this fashion led us to go with a more traditional pull model.

Another advantage of a pull-based system is that it lends itself to aggressive batching of data sent to the consumer. A push-based system must choose to either send a request immediately or accumulate more data and then send it later without knowledge of whether the downstream consumer will be able to immediately process it. If tuned for low latency, this will result in sending a single message at a time only for the transfer to end up being buffered anyway, which is wasteful. A pull-based design fixes this as the consumer always pulls all available messages after its current position in the log (or up to some configurable max size). So one gets optimal batching without introducing unnecessary latency.

The deficiency of a naive pull-based system is that if the broker has no data, the consumer may end up polling in a tight loop, effectively busy-waiting for data to arrive. To avoid this we have parameters in our pull request that allow the consumer request to block in a “long poll”, waiting until data arrives (and optionally waiting until a given number of bytes is available to ensure large transfer sizes).
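In the Java consumer these pull-request parameters surface as the fetch.min.bytes and fetch.max.wait.ms settings. A sketch added to the consumer properties from the example above, with illustrative values:

```java
// Long poll: let the broker hold the fetch until at least 64 KB is available...
props.put(ConsumerConfig.FETCH_MIN_BYTES_CONFIG, 65536);
// ...but respond anyway after 500 ms so the consumer is never parked forever.
props.put(ConsumerConfig.FETCH_MAX_WAIT_MS_CONFIG, 500);
```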

You could imagine other possible designs which would be only pull, end-to-end. The producer would write to a local log, and brokers would pull from that, with consumers pulling from them. A similar type of “store-and-forward” producer is often proposed. This is intriguing but we felt not very suitable for our target use cases which have thousands of producers. Our experience running persistent data systems at scale led us to feel that involving thousands of disks in the system across many applications would not actually make things more reliable and would be a nightmare to operate. And in practice we have found that we can run a pipeline with strong SLAs at large scale without a need for producer persistence.

Consumer Position

Keeping track of what has been consumed is, surprisingly, one of the key performance points of a messaging system.

Most messaging systems keep metadata about what messages have been consumed on the broker. That is, as a message is handed out to a consumer, the broker either records that fact locally immediately or it may wait for acknowledgement from the consumer. This is a fairly intuitive choice, and indeed for a single machine server it is not clear where else this state could go. Since the data structures used for storage in many messaging systems scale poorly, this is also a pragmatic choice: since the broker knows what is consumed it can immediately delete it, keeping the data size small.

What is perhaps not obvious is that getting the broker and consumer to come into agreement about what has been consumed is not a trivial problem. If the broker records a message as consumed immediately every time it is handed out over the network, then if the consumer fails to process the message (say because it crashes or the request times out or whatever) that message will be lost. To solve this problem, many messaging systems add an acknowledgement feature which means that messages are only marked as sent, not consumed, when they are sent; the broker waits for a specific acknowledgement from the consumer to record the message as consumed. This strategy fixes the problem of losing messages, but creates new problems. First of all, if the consumer processes the message but fails before it can send an acknowledgement then the message will be consumed twice. The second problem is around performance: now the broker must keep multiple states about every single message (first to lock it so it is not given out a second time, and then to mark it as permanently consumed so that it can be removed). Tricky problems must be dealt with, like what to do with messages that are sent but never acknowledged.

Kafka handles this differently. Our topic is divided into a set of totally ordered partitions, each of which is consumed by exactly one consumer within each subscribing consumer group at any given time. This means that the position of a consumer in each partition is just a single integer, the offset of the next message to consume. This makes the state about what has been consumed very small, just one number for each partition. This state can be periodically checkpointed. This makes the equivalent of message acknowledgements very cheap.
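A sketch of such a checkpoint with the Java client, continuing the consumer example above: committing the position records a single integer per partition. Here lastProcessedOffset is a hypothetical variable tracked by the processing loop.

```java
import java.util.Collections;
import org.apache.kafka.clients.consumer.OffsetAndMetadata;
import org.apache.kafka.common.TopicPartition;

// After processing records up to lastProcessedOffset, checkpoint the position.
// The committed value is the offset of the next message to consume, hence +1.
TopicPartition tp = new TopicPartition("user-events", 0);
consumer.commitSync(Collections.singletonMap(
        tp, new OffsetAndMetadata(lastProcessedOffset + 1)));
```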

There is a side benefit of this decision. A consumer can deliberately rewind back to an old offset and re-consume data. This violates the common contract of a queue, but turns out to be an essential feature for many consumers. For example, if the consumer code has a bug and is discovered after some messages are consumed, the consumer can re-consume those messages once the bug is fixed.

Offline Data Load

Scalable persistence allows for the possibility of consumers that only periodically consume, such as batch data loads that periodically bulk-load data into an offline system such as Hadoop or a relational data warehouse.

In the case of Hadoop we parallelize the data load by splitting the load over individual map tasks, one for each node/topic/partition combination, allowing full parallelism in the loading. Hadoop provides the task management, and tasks which fail can restart without danger of duplicate data—they simply restart from their original position.

Static Membership

Static membership aims to improve the availability of stream applications, consumer groups and other applications built on top of the group rebalance protocol. The rebalance protocol relies on the group coordinator to allocate entity ids to group members. These generated ids are ephemeral and will change when members restart and rejoin. For consumer based apps, this “dynamic membership” can cause a large percentage of tasks re-assigned to different instances during administrative operations such as code deploys, configuration updates and periodic restarts. For large state applications, shuffled tasks need a long time to recover their local states before processing and cause applications to be partially or entirely unavailable. Motivated by this observation, Kafka’s group management protocol allows group members to provide persistent entity ids. Group membership remains unchanged based on those ids, thus no rebalance will be triggered.

If you want to use static membership,

  • Upgrade both broker cluster and client apps to 2.3 or beyond, and also make sure the upgraded brokers are using inter.broker.protocol.version of 2.3 or beyond as well.
  • Set the config ConsumerConfig#GROUP_INSTANCE_ID_CONFIG to a unique value for each consumer instance under one group (see the sketch after this list).
  • For Kafka Streams applications, it is sufficient to set a unique ConsumerConfig#GROUP_INSTANCE_ID_CONFIG per KafkaStreams instance, independent of the number of used threads for an instance.
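A minimal sketch of the consumer-side configuration, assuming the 2.3+ broker requirement above is met; the group id and instance id are made-up placeholders, and each instance in the group needs a distinct, stable id:

```java
props.put(ConsumerConfig.GROUP_ID_CONFIG, "payment-processors"); // placeholder group
// Persistent entity id: keeps membership stable across restarts, so no
// rebalance is triggered as long as the member returns within the session timeout.
props.put(ConsumerConfig.GROUP_INSTANCE_ID_CONFIG, "payment-processor-1");
props.put(ConsumerConfig.SESSION_TIMEOUT_MS_CONFIG, 30000); // allow time to restart
```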

Welcome to follow the WeChat official account 算法小生.

Reposted from blog.csdn.net/SJshenjian/article/details/130048249