Kafka Summary

Kafka zero-copy (see the transferTo sketch after the traditional path below):
1. Data is copied from the kernel page cache to the socket buffer.
2. Data is copied from the socket buffer to the NIC (network adapter) buffer and sent over the network.

Traditional path:
1. Data is read from disk into the kernel-space page cache.
2. The application reads the data from kernel space into a user-space buffer.
3. The application copies the data from the user-space buffer into the kernel socket buffer.
4. Data is copied from the socket buffer to the NIC (network adapter) buffer.
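
To make the contrast concrete, here is a minimal Scala sketch (not Kafka's actual source) of the java.nio transferTo call behind sendfile-style zero-copy; the segment path and destination address are placeholders.

import java.io.RandomAccessFile
import java.net.InetSocketAddress
import java.nio.channels.SocketChannel

object ZeroCopySend {
  def main(args: Array[String]): Unit = {
    // Placeholder log segment and destination; adjust to your environment.
    val file = new RandomAccessFile("/tmp/kafka-logs/mykafka-0/00000000000000000000.log", "r")
    val fileChannel = file.getChannel
    val socket = SocketChannel.open(new InetSocketAddress("localhost", 9099))
    try {
      var position = 0L
      val count = fileChannel.size()
      // transferTo lets the kernel move bytes from the page cache to the socket
      // without copying them through user space (sendfile).
      while (position < count)
        position += fileChannel.transferTo(position, count - position, socket)
    } finally {
      socket.close(); fileChannel.close(); file.close()
    }
  }
}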

Spark Streaming + Kafka integration

Receiver-based Approach
1. Kafka topic partitions have no relationship to the RDD partitions generated by Spark Streaming.
2. Increasing the number of partitions in KafkaUtils.createStream only increases the number of threads within a single receiver; it does not increase Spark's parallelism.
3. You can create multiple Kafka input DStreams with different groups and topics, so that multiple receivers receive data in parallel (this is what raises Spark's parallelism).
4. If a fault-tolerant storage system such as HDFS is used and the write-ahead log is enabled, the received data is already replicated in the log,
so set the input stream's storage level to StorageLevel.MEMORY_AND_DISK_SER (i.e. KafkaUtils.createStream(..., StorageLevel.MEMORY_AND_DISK_SER)).

Direct Approach (no receivers: the stream connects to Kafka directly instead of going through a receiver)
Simplified parallelism: there is no need to create multiple Kafka input streams and union them. With directStream, Spark Streaming creates as many RDD partitions as there are Kafka partitions to consume, and all of them read from Kafka in parallel, so there is a one-to-one mapping between Kafka partitions and RDD partitions.


import org.apache.kafka.clients.consumer.ConsumerRecord
import org.apache.kafka.common.TopicPartition
import org.apache.kafka.common.serialization.StringDeserializer
import org.apache.spark.streaming.kafka010._
import org.apache.spark.streaming.kafka010.LocationStrategies.PreferConsistent
import org.apache.spark.streaming.kafka010.ConsumerStrategies.{Assign, Subscribe}


val kafkaParams = Map[String, Object](
  "bootstrap.servers" -> "localhost:9092,anotherhost:9092",
  "key.deserializer" -> classOf[StringDeserializer],     // StringDeserializer for both key and value
  "value.deserializer" -> classOf[StringDeserializer],
  "group.id" -> "use_a_separate_group_id_for_each_stream",
  "auto.offset.reset" -> "latest",                        // start from the latest offset when no committed offset exists
  "enable.auto.commit" -> (false: java.lang.Boolean)
)


val topics = Array("topicA","topicB")
val stream = KafkaUtils.createDirectStream[String, String](
  streamingContext,
  PreferConsistent,
  Subscribe[String, String](topics, kafkaParams)
)


stream.map(record =>(record.key,record.value))

Create an RDD over a defined range of offsets, for batch processing:
val offsetRanges = Array(
  // topic, partition, inclusive starting offset, exclusive ending offset
  OffsetRange("test", 0, 0, 100),   // partition 0, offsets 0-99
  OffsetRange("test", 1, 0, 100)    // partition 1, offsets 0-99
)

val  rdd = KafkaUtils.createRDD[String,String](sparkContext,kafkaParams,offsetRanges,PreferConsistent)


Obtaining Offsets

stream.foreachRDD{ rdd =>
       val offsetRanges = rdd.asInstanceOf[HasOffsetRanges].offsetRanges
       rdd.foreachPartition{ iter=>
            val o:OffsetRange = offsetRanges(TaskContext.get.partitionId)
            println(s"${o.topic} ${o.partition} ${o.fromOffset} ${o.untilOffset}")
       }
}

Efficiency: with the first approach, achieving zero data loss requires the data to be stored in a write-ahead log, which replicates the data again. This is inefficient -- the data is copied twice, once by Kafka and a second time into the write-ahead log (WAL). The direct approach eliminates this problem: there is no receiver, so no WAL is needed, provided Kafka's data retention is long enough.


Exactly-once:
1. Receiver: uses Kafka's high-level API to store consumed offsets in ZooKeeper, which is traditionally how Kafka consumers track progress. Combined with a WAL this guarantees zero data loss (at-least-once), but under failure messages can be consumed twice, because the data reliably received by Spark Streaming can get out of sync with the offsets tracked in ZooKeeper.
2. Direct: does not use ZooKeeper to track consumed offsets; Spark Streaming tracks the offsets in its checkpoints. This removes the inconsistency between Spark Streaming and ZooKeeper, so each record is effectively received exactly once despite failures. To get exactly-once semantics for the output as well, the operation that saves the data to the external store must either be idempotent (the same output for the same data)
or an atomic transaction that saves both the results and the offsets.


Storing offsets externally

(1) Checkpoint
1. Enabling Spark Streaming's checkpoint is the simplest way to store offsets (it can still be inconsistent: the offset may be recorded even though the batch was not processed successfully).
Drawbacks:
1. Spark cannot recover the offsets across applications.
2. Upgrading Spark makes recovery from the old checkpoint impossible.
3. For critical production applications, managing offsets with Spark checkpoints is not recommended (upgrades break it, and it cannot be shared across applications).

2. Streaming checkpoints are meant for saving the application's state, e.g. on HDFS, so that it can be recovered after a failure.
Compared with ZooKeeper or HBase, HDFS has higher latency; if managed carelessly, writing the offsetRanges of every batch to HDFS can also lead to a small-files problem.
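
For reference, a minimal sketch of driver-side checkpointing with StreamingContext.getOrCreate; the checkpoint directory and batch interval are placeholders, not values from this article:

import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}

// Placeholder checkpoint location; in practice an HDFS path.
val checkpointDir = "hdfs:///tmp/spark-streaming-checkpoint"

def createContext(): StreamingContext = {
  val conf = new SparkConf().setAppName("kafka-direct-checkpoint")
  val ssc = new StreamingContext(conf, Seconds(5))
  ssc.checkpoint(checkpointDir)   // offsets are stored as part of the checkpoint
  // ... create the direct stream and define the processing here ...
  ssc
}

// On restart, the context (and the Kafka offsets it tracked) is rebuilt from the checkpoint.
val ssc = StreamingContext.getOrCreate(checkpointDir, createContext _)
ssc.start()
ssc.awaitTermination()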


(2) HBase
1. A generic HBase-based design: a single table can store the offsets of topics consumed by multiple Spark Streaming applications.

2. rowkey = topicName + groupId + batchTimeOfStreaming (milliseconds). The batchTime.milliSeconds component is not strictly required, but it lets you inspect how offsets were managed for each historical batch.

3. Kafka offsets are stored in the following table, with entries expiring automatically after 30 days:
create 'spark_kafka_offsets', {NAME=>'offsets', TTL=>2592000}


4. Offset retrieval scenarios
Scenario 1: the streaming job starts for the first time. It looks up the number of partitions of the given topic in ZooKeeper and returns "0" as the offset for every topic partition.

Scenario 2: a long-running streaming job was stopped and new partitions were added to the Kafka topic. The job looks up the partition count in ZooKeeper; for all pre-existing topic partitions the latest offsets stored in HBase are returned, and for all new partitions it returns "0".

Scenario 3: a long-running streaming job was stopped and the topic did not change. In this case the latest offsets found in HBase are returned as the offsets for each topic partition.


hbase(main):009:0> scan 'spark_kafka_offsets'
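
A minimal sketch of persisting the offset ranges of one batch into the table above with the standard HBase client; the connection settings, column qualifiers, and helper name are illustrative assumptions, not code from the original article:

import org.apache.hadoop.hbase.{HBaseConfiguration, TableName}
import org.apache.hadoop.hbase.client.{ConnectionFactory, Put}
import org.apache.hadoop.hbase.util.Bytes
import org.apache.spark.streaming.kafka010.OffsetRange

def saveOffsetsToHBase(topic: String, groupId: String, batchTime: Long,
                       offsetRanges: Array[OffsetRange]): Unit = {
  val conn = ConnectionFactory.createConnection(HBaseConfiguration.create())
  val table = conn.getTable(TableName.valueOf("spark_kafka_offsets"))
  try {
    // rowkey = topicName + groupId + batchTime (milliseconds), as described above
    val put = new Put(Bytes.toBytes(s"$topic:$groupId:$batchTime"))
    offsetRanges.foreach { o =>
      // one column per partition, value = untilOffset of this batch
      put.addColumn(Bytes.toBytes("offsets"), Bytes.toBytes(o.partition.toString),
        Bytes.toBytes(o.untilOffset.toString))
    }
    table.put(put)
  } finally {
    table.close(); conn.close()
  }
}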


stream.foreachRDD { rdd =>
  val offsetRanges = rdd.asInstanceOf[HasOffsetRanges].offsetRanges

  // some time later, after outputs have completed
  stream.asInstanceOf[CanCommitOffsets].commitAsync(offsetRanges)
}

(3) ZooKeeper

1. Path:
val zkPath = s"${kafkaOffsetRootPath}/${groupName}/${o.topic}/${o.partition}"

2. If no offset is saved in ZooKeeper, fall back to the latest or earliest offset according to the kafkaParams configuration.


3. If an offset is saved in ZooKeeper, use it as the starting position of the Kafka stream.
Inspect it with zkCli:
ls /kafka0.9/mykafka/consumer/offsets/testp/mytest1

get /kafka0.9/mykafka/consumer/offsets/testp/mytest1/0
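
A minimal sketch of reading and writing such a znode with Apache Curator; the connection string, root path, group name, and retry policy are assumptions for illustration:

import org.apache.curator.framework.CuratorFrameworkFactory
import org.apache.curator.retry.ExponentialBackoffRetry
import org.apache.spark.streaming.kafka010.OffsetRange

val zk = CuratorFrameworkFactory.newClient("hadoop:2181", new ExponentialBackoffRetry(1000, 3))
zk.start()

def saveOffset(rootPath: String, groupName: String, o: OffsetRange): Unit = {
  val zkPath = s"$rootPath/$groupName/${o.topic}/${o.partition}"
  if (zk.checkExists().forPath(zkPath) == null)
    zk.create().creatingParentsIfNeeded().forPath(zkPath)
  // store the end of the processed range as the next starting position
  zk.setData().forPath(zkPath, o.untilOffset.toString.getBytes("UTF-8"))
}

def readOffset(rootPath: String, groupName: String, topic: String, partition: Int): Option[Long] = {
  val zkPath = s"$rootPath/$groupName/$topic/$partition"
  if (zk.checkExists().forPath(zkPath) == null) None   // fall back to auto.offset.reset
  else Some(new String(zk.getData.forPath(zkPath), "UTF-8").toLong)
}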


Drawback: if Hadoop, Hive, Spark, and HBase are all deployed as clusters that already depend on ZooKeeper, this puts extra pressure on an already heavily loaded ZooKeeper ensemble, making ZooKeeper failures more likely and affecting normal cluster operation.

(4) Kafka
Kafka itself can commit offsets periodically via the enable.auto.commit parameter, which guarantees that offsets are stored.
However, a problem remains: if a batch fails before your Spark output operation has actually succeeded, the offsets are committed anyway, which yields undefined semantics. This is why Spark disables the feature by default (enable.auto.commit=false).
You can instead use the commitAsync API. Compared with checkpoints, the benefit is that Kafka remains a durable offset store regardless of changes to your application code (checkpoints are sensitive to code changes and then require the offsets to be specified again). However, Kafka commits are not transactional, so your outputs must still
be idempotent, just as with checkpointing.


stream.foreachRDD { rdd =>
  val offsetRanges = rdd.asInstanceOf[HasOffsetRanges].offsetRanges

  // some time later, after outputs have completed
  // CanCommitOffsets is only available on the result of createDirectStream, not after transformations.
  // commitAsync is thread-safe, but must happen after your outputs for the semantics to be meaningful.
  stream.asInstanceOf[CanCommitOffsets].commitAsync(offsetRanges)
}

(5) Your own data store
For data stores that support transactions, saving the offsets in the same transaction as the results keeps the two in sync even under failure.
If you are careful about detecting repeated or skipped offset ranges, rolling back the transaction prevents duplicated or lost messages from affecting the results. This gives exactly-once semantics.
It is even possible to use this tactic for the outputs of aggregations, which are otherwise hard to make idempotent.

// The details depend on your data store, but the general idea looks like this.

// begin from the offsets committed to the database

val fromOffsets = selectOffsetsFromYourDatabase.map { resultSet =>   // rowkey: topic + partition + batchTime (milliseconds)
  new TopicPartition(resultSet.string("topic"), resultSet.int("partition")) -> resultSet.long("offset")
}.toMap

val stream = KafkaUtils.createDirectStream[String,String](
           streamingContext,
           PreferConsistent,
           Assign[String,String](fromOffsets.keys.toList,kafkaParams,fromOffsets)
)


stream.foreachRDD { rdd =>
  val offsetRanges = rdd.asInstanceOf[HasOffsetRanges].offsetRanges
  val results = yourCalculation(rdd)

  // begin your transaction

  // insert/update the results

  // update offsets where the end of the existing offsets matches the beginning of this batch of offsets

  // assert that offsets were updated correctly

  // end your transaction
}

SSL/TLS: securing the communication between Spark and Kafka
val kafkaParams = Map[String, Object](
  // the usual params; make sure to change the port in bootstrap.servers if 9092 is not TLS
  "security.protocol" -> "SSL",
  "ssl.truststore.location" -> "/some-directory/kafka.client.truststore.jks",
  "ssl.truststore.password" -> "test1234",
  "ssl.keystore.location" -> "/some-directory/kafka.client.keystore.jks",
  "ssl.keystore.password" -> "test1234",
  "ssl.key.password" -> "test1234"
)


(6) Not saving Kafka offsets at all: acceptable only if some data loss can be tolerated.


(7) Decide whether to manage offsets based on business requirements
1. For example, real-time activity monitoring only needs the most recent data and does not need offset management. In that case, with the old low-level API set auto.offset.reset to largest or smallest; with the new consumer API the equivalent values are latest or earliest.
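
A minimal sketch of consumer parameters for such a "latest data only" job, assuming the new consumer API; the group id is a placeholder:

import org.apache.kafka.common.serialization.StringDeserializer

val monitoringParams = Map[String, Object](
  "bootstrap.servers" -> "localhost:9092",
  "key.deserializer" -> classOf[StringDeserializer],
  "value.deserializer" -> classOf[StringDeserializer],
  "group.id" -> "realtime_monitoring",               // placeholder group id
  "auto.offset.reset" -> "latest",                   // always start from the newest data
  "enable.auto.commit" -> (true: java.lang.Boolean)  // losing a few records is acceptable here
)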


Each broker has its own broker.id, set in its server.properties.


Kafka installation: a single machine can host one or more brokers, depending on your requirements and production workload.

1. Download a Kafka version that matches your Spark and HDFS versions.

2. Unpack it:

tar xzvf kafka.tar.gz


3. Install it (details omitted) and configure ZooKeeper's zoo.cfg.

Create the ZooKeeper data and log directories:
sudo mkdir /usr/cdh/spark/zkdata/
sudo mkdir /usr/cdh/spark/zkdata/zklogs

# The number of milliseconds of each tick
tickTime=2000
# The number of ticks that the initial 
# synchronization phase can take
initLimit=10
# The number of ticks that can pass between 
# sending a request and getting an acknowledgement
syncLimit=5
# the directory where the snapshot is stored.
# do not use /tmp for storage, /tmp here is just 
# example sakes.


# Pick ONE dataDir/dataLogDir pair below for this ZooKeeper instance;
# the three blocks are alternatives used for different deployments.

# zookeeper data
dataDir=/usr/cdh/zookeeper/data/


# hadoop zk data/logs
dataDir=/usr/cdh/hadoop/zkdata
dataLogDir=/usr/cdh/hadoop/zkdata/zklogs


# spark zk data/logs
dataDir=/usr/cdh/spark/zkdata/
dataLogDir=/usr/cdh/spark/zkdata/zklogs

# the port at which the clients will connect
clientPort=2181
# the maximum number of client connections.
# increase this if you need to handle more clients
#maxClientCnxns=60
#
# Be sure to read the maintenance section of the 
# administrator guide before turning on autopurge.
#
# http://zookeeper.apache.org/doc/current/zookeeperAdmin.html#sc_maintenance
#
# The number of snapshots to retain in dataDir
#autopurge.snapRetainCount=3
# Purge task interval in hours
# Set to "0" to disable auto purge feature
#autopurge.purgeInterval=1


# assign several hostname:port pairs to server.<id>
# The entries below are only needed in a multi-machine cluster:
#server.0=Master:2888:3888
#server.1=Worker1:2888:3888
#server.2=Worker2:2888:3888

4. Under the appropriate user, configure Kafka's environment variables in the shell profile:
vim .profile

export KAFKA_HOME=/usr/cdh/kafka
export PATH=$PATH:$KAFKA_HOME/bin:$SCALA_HOME/bin:$JAVA_HOME/bin

Reload the configuration:
. .profile

5. Configure the Kafka broker configuration file (one configuration file per broker).


broker.id, listeners, and log.dirs must be different in every server.properties on the same machine.


server.properties

# Licensed to the Apache Software Foundation (ASF) under one or more
# contributor license agreements.  See the NOTICE file distributed with
# this work for additional information regarding copyright ownership.
# The ASF licenses this file to You under the Apache License, Version 2.0
# (the "License"); you may not use this file except in compliance with
# the License.  You may obtain a copy of the License at
#
#    http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# see kafka.server.KafkaConfig for additional details and defaults

############################# Server Basics #############################

# The id of the broker. This must be set to a unique integer for each broker.
# Each broker has its own broker.id; ids must be unique within the Kafka cluster, otherwise the duplicate broker will not start.
broker.id=0

############################# Socket Server Settings #############################

# If several brokers run on the same machine, each must listen on a different port, otherwise the second broker will not start.
listeners=PLAINTEXT://:9092


# The port the socket server listens on
#port=9092

# Hostname the broker will bind to. If not set, the server will bind to all interfaces
#host.name=localhost

# Hostname the broker will advertise to producers and consumers. If not set, it uses the
# value for "host.name" if configured.  Otherwise, it will use the value returned from
# java.net.InetAddress.getCanonicalHostName().
#advertised.host.name=<hostname routable by clients>

# The port to publish to ZooKeeper for clients to use. If this is not set,
# it will publish the same port that the broker binds to.
#advertised.port=<port accessible by clients>

# The number of threads handling network requests
num.network.threads=3

# The number of threads doing disk I/O
num.io.threads=8

# The send buffer (SO_SNDBUF) used by the socket server
socket.send.buffer.bytes=102400

# The receive buffer (SO_RCVBUF) used by the socket server
socket.receive.buffer.bytes=102400

# The maximum size of a request that the socket server will accept (protection against OOM)
socket.request.max.bytes=104857600


############################# Log Basics #############################

# A comma seperated list of directories under which to store log files
# The directory where Kafka stores message data -- very important. If one machine hosts several brokers, this directory must also differ per broker, otherwise the same startup problem occurs.
log.dirs=/tmp/kafka-logs

# The default number of log partitions per topic. More partitions allow greater
# parallelism for consumption, but this will also result in more files across
# the brokers.
num.partitions=1

# The number of threads per data directory to be used for log recovery at startup and flushing at shutdown.
# This value is recommended to be increased for installations with data dirs located in RAID array.
num.recovery.threads.per.data.dir=1

############################# Log Flush Policy #############################

# Messages are immediately written to the filesystem but by default we only fsync() to sync
# the OS cache lazily. The following configurations control the flush of data to disk.
# There are a few important trade-offs here:
#    1. Durability: Unflushed data may be lost if you are not using replication.
#    2. Latency: Very large flush intervals may lead to latency spikes when the flush does occur as there will be a lot of data to flush.
#    3. Throughput: The flush is generally the most expensive operation, and a small flush interval may lead to exceessive seeks.
# The settings below allow one to configure the flush policy to flush data after a period of time or
# every N messages (or both). This can be done globally and overridden on a per-topic basis.

# The number of messages to accept before forcing a flush of data to disk
#log.flush.interval.messages=10000

# The maximum amount of time a message can sit in a log before we force a flush
#log.flush.interval.ms=1000

############################# Log Retention Policy #############################

# The following configurations control the disposal of log segments. The policy can
# be set to delete segments after a period of time, or after a given size has accumulated.
# A segment will be deleted whenever *either* of these criteria are met. Deletion always happens
# from the end of the log.

# The minimum age of a log file to be eligible for deletion
log.retention.hours=168

# A size-based retention policy for logs. Segments are pruned from the log as long as the remaining
# segments don't drop below log.retention.bytes.
#log.retention.bytes=1073741824

# The maximum size of a log segment file. When this size is reached a new log segment will be created.
log.segment.bytes=1073741824

# The interval at which log segments are checked to see if they can be deleted according
# to the retention policies
log.retention.check.interval.ms=300000

############################# Zookeeper #############################

# Zookeeper connection string (see zookeeper docs for details).
# This is a comma separated host:port pairs, each corresponding to a zk
# server. e.g. "127.0.0.1:3000,127.0.0.1:3001,127.0.0.1:3002".
# You can also append an optional chroot string to the urls to specify the
# root directory for all kafka znodes.
# ZooKeeper connection URL for Kafka; it is recommended to append a chroot path to the URL so that all Kafka znodes are kept together.
zookeeper.connect=hadoop:2181/kafka0.9

# Timeout in ms for connecting to zookeeper
zookeeper.connection.timeout.ms=6000


A second broker on the same machine uses server1.properties. It is identical to server.properties above except for the entries below; broker.id, listeners, and log.dirs must all be unique for each broker on the same host, while zookeeper.connect stays the same:

# The id of the broker. This must be set to a unique integer for each broker.
broker.id=1

# A different listener port, since 9092 is taken by the first broker.
listeners=PLAINTEXT://:19092

# A different message-log directory, since /tmp/kafka-logs is used by the first broker.
log.dirs=/tmp/kafka-logs1

zookeeper.connect=hadoop:2181/kafka0.9

For a second broker on a different machine, server.properties is again identical to the first one except for the entries below. The listener port and log.dirs only have to be unique per host, so they can match the first broker's values; broker.id must still be unique across the whole cluster, and zookeeper.connect keeps the same chroot so all Kafka znodes stay together:

# The id of the broker. This must be set to a unique integer for each broker.
broker.id=1

# The same port is fine because this broker runs on a different machine.
listeners=PLAINTEXT://:9092

# The same directory is fine because this broker runs on a different machine.
log.dirs=/tmp/kafka-logs

# otherhostname: the other machine's IP or hostname
zookeeper.connect=otherhostname:2181/kafka0.9


6. Start Kafka (run one start command per broker configuration):

kafka-server-start.sh /usr/cdh/kafka/config/server.properties

kafka-server-start.sh /usr/cdh/kafka/config/server1.properties

To shut down:
kafka-server-stop.sh

7. Kafka operations

(a) Create a topic

kafka-topics.sh --zookeeper hadoop:2181/kafka0.9 --create --topic mykafka --replication-factor 2 --partitions 3

The replication factor of a topic cannot exceed the number of brokers, otherwise the command fails:

kafka-topics.sh --zookeeper hadoop:2181/kafka0.9 --create --topic mykafka --replication-factor 3 --partitions 3
Error while executing topic command : replication factor: 3 larger than available brokers: 2
[2018-04-05 19:31:35,515] ERROR kafka.admin.AdminOperationException: replication factor: 3 larger than available brokers: 2
    at kafka.admin.AdminUtils$.assignReplicasToBrokers(AdminUtils.scala:77)
    at kafka.admin.AdminUtils$.createTopic(AdminUtils.scala:236)
    at kafka.admin.TopicCommand$.createTopic(TopicCommand.scala:105)
    at kafka.admin.TopicCommand$.main(TopicCommand.scala:60)
    at kafka.admin.TopicCommand.main(TopicCommand.scala)
 (kafka.admin.TopicCommand$)


(b) List topics (for the full set of options, run kafka-topics.sh --help)
kafka-topics.sh --zookeeper hadoop:2181/kafka0.9 --list

mykafka

(c) Delete a topic
kafka-topics.sh --zookeeper hadoop:2181/kafka0.9 --delete --topic mykafka

Then use zkCli.sh to remove the topic's metadata from ZooKeeper:
[zk: localhost:2181(CONNECTED) 13] ls /kafka0.9/brokers/topics
[mykafka]
[zk: localhost:2181(CONNECTED) 14] rmr /kafka0.9/brokers/topics/mykafka

Check the list again:
spark@hadoop:~$ kafka-topics.sh --zookeeper hadoop:2181/kafka0.9 --list


(d) If a topic was created incorrectly, delete it and recreate it; modifying it in place is cumbersome and error-prone.


(e) Start a console producer
spark@hadoop:~$ kafka-console-producer.sh --broker-list hadoop:9092,hadoop:19092 --topic mykafka
Type the lines below; they will show up in the consumer console:
df
jack
mary

(f) Start a console consumer
kafka-console-consumer.sh   --zookeeper hadoop:2181/kafka0.9 --topic mykafka
df
jack
mary


How Kafka stores data
Logically, data is organized by topic: messages are grouped by topic, and each topic manages the data of its partitions.
Physically, data is stored by partition. A partition directory contains segments, and each segment consists of an *.index file and a *.log file:
*.index is the index Kafka uses to locate messages; *.log stores the actual message data. For example:

Each partition directory is named topic + ordinal number (starting at 0; the largest is the partition count minus 1):
drwxr-xr-x  2 spark hadoop  4096 4月   5 20:01 mykafka-0/
drwxr-xr-x  2 spark hadoop  4096 4月   5 20:01 mykafka-1/
drwxr-xr-x  2 spark hadoop  4096 4月   5 20:01 mykafka-2/


(figure omitted: partition/segment directory and .index/.log file layout)

How data is located:
1. Within a partition, messages are strictly ordered by their offsets, regardless of which machine the partition lives on.
2. Segment data files are named after the offset of the first message they contain (the base offset), zero-padded to 20 digits. For example, with the files 00000000000000000000.log, 00000000000000170410.log, and 00000000000000239430.log:
00000000000000000000.log covers offsets 0 through 170409
00000000000000170410.log starts at offset 170410
00000000000000239430.log starts at offset 239430

Message lookup:

1. Find the topic, then the directory of the relevant partition under it.
2. From the requested offset, pick the *.index file whose base offset range covers it; that index points into the matching *.log file.
3. The *.log file name gives the base offset of its first message within the partition, so a message's absolute offset is base offset + the relative offset from the index entry.
4. If the requested offset is beyond the last entry of an index file, move on to the next index file; otherwise the message lives in the data file belonging to this index.
5. The index entry gives the message's position inside the log file, and thus its physical offset, from which the message data is read.

Example: the index entry [3, 348] means the message at relative offset 3 in the data file, stored at physical position 348; with data file 00000000000000170410.log its absolute offset is 170410 + 3 = 170413.

1. Start from the requested offset 170413 and find the index file that covers it.
2. Scanning from the first index file, 00000000000000000000.index, the second is 00000000000000170410.index and the third is 00000000000000239430.index (starting at offset 239430), so offset 170413 falls into the second file.
3. In that second index file, the entry for offset 170413 gives the physical position 348, from which the message data is read out of 00000000000000170410.log.
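
A small Scala sketch of the base-offset arithmetic described above; the segment list is the hypothetical one from this example, not read from disk:

// Base offsets of the segment files, i.e. the numeric part of the *.log / *.index names.
val segmentBaseOffsets = Vector(0L, 170410L, 239430L)

// Pick the segment whose base offset is the largest one not greater than the target,
// then express the target as (base offset, relative offset) for the index lookup.
def locate(targetOffset: Long): (Long, Long) = {
  val base = segmentBaseOffsets.filter(_ <= targetOffset).max
  (base, targetOffset - base)
}

// locate(170413) == (170410, 3): look up relative offset 3 in 00000000000000170410.index
println(locate(170413))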


Partition
1. A topic organizes its messages across multiple partitions.
2. Increasing the number of partitions increases read/write concurrency.
3. On disk, a partition consists of multiple segments; each segment has a log file, an index file, and a time-index file. Each log file is also called a segment file. Segments hold varying numbers of messages, which makes it easy to delete old segments and clean up already-consumed messages, improving disk utilization; each partition only needs sequential reads and writes. Segment lifecycle is controlled by broker settings such as log.segment.bytes and log.roll.{ms,hours}.
4. A partition can have multiple replicas, but only one replica is the leader.
5. Reads and writes of a partition go through the leader only.
6. Segment file naming: each segment file is named after the offset of its first message.


Message

Each message has exactly one offset.
Messages are only appended to segments; they cannot be modified or deleted individually.
Segments are deleted periodically (configured by log.segment.bytes, log.roll.{ms,hours}, and log.retention.bytes/hours); with the default retention of 168 hours they are removed after 7 days.


An offset is an ordered sequence number, 8 bytes long.

Consumer Group
1. A consumer group can contain multiple consumer instances.
2. Different consumer groups consume independently and do not affect each other.
3. Consumer instances within a group consume in parallel, without duplication.
4. Size the consumer group according to the number of partitions, so concurrency stays reasonable (instances are neither idle nor overloaded).

High-level consumer API
1. You do not need to manage offsets yourself.
2. By default it gives at-least-once semantics.
3. If there are more consumers than partitions, the extra consumers sit idle (wasteful, but the workload is still served).
4. If there are fewer consumers than partitions, one consumer handles several partitions and can become overloaded.
5. Ideally the partition count is an integer multiple of the consumer count.


Low-level consumer API
You manage offsets yourself.
You can implement whatever delivery semantics you need.

Producer message acknowledgement (acks):


0: do not wait for any acknowledgement that the send succeeded.
1: only the partition leader has to acknowledge the message.
-1 (all): the partition leader and its followers must acknowledge the message.
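
A minimal sketch of setting this on a producer with the standard Java client; the topic name and bootstrap servers are placeholders:

import java.util.Properties
import org.apache.kafka.clients.producer.{KafkaProducer, ProducerConfig, ProducerRecord}
import org.apache.kafka.common.serialization.StringSerializer

val props = new Properties()
props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "hadoop:9092,hadoop:19092")
props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, classOf[StringSerializer].getName)
props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, classOf[StringSerializer].getName)
// "0": fire and forget; "1": leader ack only; "all" (-1): leader and followers must ack
props.put(ProducerConfig.ACKS_CONFIG, "all")

val producer = new KafkaProducer[String, String](props)
producer.send(new ProducerRecord[String, String]("mykafka", "key", "value"))
producer.close()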


转载自blog.csdn.net/dymkkj/article/details/81278485