- Automatically commit offsets
The following example code shows how to automatically commit the topic's offset:
public void autoOffsetCommit() {
    Properties props = new Properties();
    props.put("bootstrap.servers", "localhost:9092");
    props.put("group.id", "test");
    props.put("enable.auto.commit", "true");
    props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
    props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
    KafkaConsumer<String, String> consumer = new KafkaConsumer<String, String>(props);
    consumer.subscribe(Arrays.asList("foo", "bar"));
    while (true) {
        ConsumerRecords<String, String> records = consumer.poll(100);
        for (ConsumerRecord<String, String> record : records) {
            System.out.printf("offset = %d, key = %s, value = %s%n",
                    record.offset(), record.key(), record.value());
        }
    }
}
The meaning of the keys stored in the Properties instance props:
1) bootstrap.servers specifies the Kafka cluster node to connect to, where 9092 is the port number;
2) when enable.auto.commit is true, the consumed offsets are automatically committed every auto.commit.interval.ms, whose default value is 5000 ms;
3) foo and bar are the names of the topics to be consumed, and group.id set to test places the consumer in the test consumer group for unified management;
4) key.deserializer and value.deserializer specify how bytes are deserialized into objects.
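For instance, auto.commit.interval.ms can be set explicitly alongside enable.auto.commit to override the 5000 ms default. The sketch below only builds the configuration with java.util.Properties (no broker connection is made), and the interval value 1000 is an illustrative choice, not a recommendation:

```java
import java.util.Properties;

public class AutoCommitConfig {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        props.put("group.id", "test");
        props.put("enable.auto.commit", "true");
        // Override the default commit interval of 5000 ms with 1000 ms
        props.put("auto.commit.interval.ms", "1000");
        props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        // getProperty falls back to the supplied default when the key is absent
        System.out.println(props.getProperty("auto.commit.interval.ms", "5000"));
    }
}
```

Running this prints 1000, confirming the explicit setting takes precedence over the default.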
- Manually commit the offset
In a production environment, the offset should be committed only after the data has been fully consumed; that is, a record is considered consumed only once it has been fetched from the Kafka topic and processed by the business logic. In this case, the topic's offset must be committed manually.
The following example code shows how to manually commit the offset of a topic:
public void manualOffsetCommit() {
    Properties props = new Properties();
    props.put("bootstrap.servers", "localhost:9092");
    props.put("group.id", "test");
    props.put("enable.auto.commit", "false");
    props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
    props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
    KafkaConsumer<String, String> consumer = new KafkaConsumer<String, String>(props);
    consumer.subscribe(Arrays.asList("foo", "bar"));
    final int minBatchSize = 200;
    List<ConsumerRecord<String, String>> buffer = new ArrayList<>();
    while (true) {
        ConsumerRecords<String, String> records = consumer.poll(100);
        for (ConsumerRecord<String, String> record : records) {
            buffer.add(record);
        }
        if (buffer.size() >= minBatchSize) {
            // operation to handle data
            consumer.commitSync();
            buffer.clear();
        }
    }
}
The disadvantage of this scheme is that the offset can only be committed after the whole batch has been processed. To avoid reprocessing data, a third scheme can be used that commits offsets according to the consumption progress of each partition; this delivery guarantee is called "at-least-once".
- Manually commit the offset of the partition
The following example code shows how to manually commit the offset of each partition in the topic:
public void manualOffsetCommitOfPartition() {
    Properties props = new Properties();
    props.put("bootstrap.servers", "localhost:9092");
    props.put("group.id", "test");
    props.put("enable.auto.commit", "false");
    props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
    props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
    KafkaConsumer<String, String> consumer = new KafkaConsumer<String, String>(props);
    consumer.subscribe(Arrays.asList("foo", "bar"));
    boolean running = true;
    try {
        while (running) {
            ConsumerRecords<String, String> records = consumer.poll(Long.MAX_VALUE);
            for (TopicPartition partition : records.partitions()) {
                List<ConsumerRecord<String, String>> partitionRecords = records.records(partition);
                for (ConsumerRecord<String, String> record : partitionRecords) {
                    System.out.println(record.offset() + " : " + record.value());
                }
                long lastOffset = partitionRecords.get(partitionRecords.size() - 1).offset();
                consumer.commitSync(Collections.singletonMap(partition,
                        new OffsetAndMetadata(lastOffset + 1)));
            }
        }
    } finally {
        consumer.close();
    }
}
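Note that the per-partition commit passes lastOffset + 1: the committed offset is the position of the next record to be read, not that of the last record processed. A minimal plain-Java sketch (hypothetical offsets, no Kafka dependency) illustrates the bookkeeping:

```java
import java.util.Arrays;
import java.util.List;

public class CommitPosition {
    public static void main(String[] args) {
        // Hypothetical record offsets received for one partition in a single poll
        List<Long> partitionOffsets = Arrays.asList(5L, 6L, 7L);
        long lastOffset = partitionOffsets.get(partitionOffsets.size() - 1);
        // Committing lastOffset + 1 means the consumer resumes at offset 8,
        // so the already-processed record at offset 7 is not consumed again
        System.out.println("commit position = " + (lastOffset + 1));
    }
}
```

If lastOffset itself were committed instead, the record at that offset would be redelivered after a restart, causing one duplicate per partition.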