Flink Caused by:org.apache.flink.streaming.connectors.kafka.internal.Handover$ClosedException

The Flink program reads data from Kafka for calculations. The following error is reported as soon as the FLink program starts. It is very confusing to see the error. I didn’t work overtime until 9 o’clock. I came half an hour earlier the next day and read the following error message again. The specific error is as follows:

Error message 1.  

20/12/17 09:31:07 WARN NetworkClient: [Consumer clientId=consumer-14, groupId=qa_topic_flink_group] Error connecting to node 172.16.40.233:9092 (id: -3 rack: null)
java.nio.channels.ClosedByInterruptException
	at java.nio.channels.spi.AbstractInterruptibleChannel.end(AbstractInterruptibleChannel.java:202)
	at sun.nio.ch.SocketChannelImpl.connect(SocketChannelImpl.java:659)
	at org.apache.kafka.common.network.Selector.doConnect(Selector.java:278)
	at org.apache.kafka.common.network.Selector.connect(Selector.java:256)
	at org.apache.kafka.clients.NetworkClient.initiateConnect(NetworkClient.java:920)
	at org.apache.kafka.clients.NetworkClient.access$700(NetworkClient.java:67)
	at org.apache.kafka.clients.NetworkClient$DefaultMetadataUpdater.maybeUpdate(NetworkClient.java:1090)
	at org.apache.kafka.clients.NetworkClient$DefaultMetadataUpdater.maybeUpdate(NetworkClient.java:976)
	at org.apache.kafka.clients.NetworkClient.poll(NetworkClient.java:533)
	at org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.poll(ConsumerNetworkClient.java:265)
	at org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.poll(ConsumerNetworkClient.java:236)
	at org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.poll(ConsumerNetworkClient.java:215)
	at org.apache.kafka.clients.consumer.internals.Fetcher.getTopicMetadata(Fetcher.java:292)
	at org.apache.kafka.clients.consumer.KafkaConsumer.partitionsFor(KafkaConsumer.java:1803)
	at org.apache.kafka.clients.consumer.KafkaConsumer.partitionsFor(KafkaConsumer.java:1771)
	at org.apache.flink.streaming.connectors.kafka.internal.KafkaPartitionDiscoverer.getAllPartitionsForTopics(KafkaPartitionDiscoverer.java:77)
	at org.apache.flink.streaming.connectors.kafka.internals.AbstractPartitionDiscoverer.discoverPartitions(AbstractPartitionDiscoverer.java:131)
	at org.apache.flink.streaming.connectors.kafka.FlinkKafkaConsumerBase.open(FlinkKafkaConsumerBase.java:508)
	at org.apache.flink.api.common.functions.util.FunctionUtils.openFunction(FunctionUtils.java:36)
	at org.apache.flink.streaming.api.operators.AbstractUdfStreamOperator.open(AbstractUdfStreamOperator.java:102)
	at org.apache.flink.streaming.runtime.tasks.StreamTask.initializeStateAndOpen(StreamTask.java:1007)
	at org.apache.flink.streaming.runtime.tasks.StreamTask.lambda$beforeInvoke$0(StreamTask.java:454)
	at org.apache.flink.streaming.runtime.tasks.StreamTaskActionExecutor$SynchronizedStreamTaskActionExecutor.runThrowing(StreamTaskActionExecutor.java:94)
	at org.apache.flink.streaming.runtime.tasks.StreamTask.beforeInvoke(StreamTask.java:449)
	at org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:461)
	at org.apache.flink.runtime.taskmanager.Task.doRun(Task.java:707)
	at org.apache.flink.runtime.taskmanager.Task.run(Task.java:532)
	at java.lang.Thread.run(Thread.java:748)

Error message 2:

20/12/17 09:31:27 WARN StreamTask: Error while canceling task.
org.apache.flink.streaming.connectors.kafka.internal.Handover$ClosedException
	at org.apache.flink.streaming.connectors.kafka.internal.Handover.close(Handover.java:182)
	at org.apache.flink.streaming.connectors.kafka.internal.KafkaFetcher.cancel(KafkaFetcher.java:175)
	at org.apache.flink.streaming.connectors.kafka.FlinkKafkaConsumerBase.cancel(FlinkKafkaConsumerBase.java:818)
	at org.apache.flink.streaming.api.operators.StreamSource.cancel(StreamSource.java:147)
	at org.apache.flink.streaming.runtime.tasks.SourceStreamTask.cancelTask(SourceStreamTask.java:136)
	at org.apache.flink.streaming.runtime.tasks.StreamTask.cancel(StreamTask.java:602)
	at org.apache.flink.runtime.taskmanager.Task$TaskCanceler.run(Task.java:1355)
	at java.lang.Thread.run(Thread.java:748)
20/12/17 09:31:27 WARN StreamTask: Error while canceling task.
org.apache.flink.streaming.connectors.kafka.internal.Handover$ClosedException
	at org.apache.flink.streaming.connectors.kafka.internal.Handover.close(Handover.java:182)
	at org.apache.flink.streaming.connectors.kafka.internal.KafkaFetcher.cancel(KafkaFetcher.java:175)
	at org.apache.flink.streaming.connectors.kafka.FlinkKafkaConsumerBase.cancel(FlinkKafkaConsumerBase.java:818)
	at org.apache.flink.streaming.api.operators.StreamSource.cancel(StreamSource.java:147)
	at org.apache.flink.streaming.runtime.tasks.SourceStreamTask.cancelTask(SourceStreamTask.java:136)
	at org.apache.flink.streaming.runtime.tasks.StreamTask.cancel(StreamTask.java:602)
	at org.apache.flink.runtime.taskmanager.Task$TaskCanceler.run(Task.java:1355)
	at java.lang.Thread.run(Thread.java:748)

Solution: Our kafka topic is 9 partitions. Therefore, the parallelism in our Flink program should also be set to 9.

It is normal after running the program.

The follow-up is here again. . . . When we set the parallelism to 9, my data has to be divided into multiple side output streams,

When I wrote the last side output stream, I got the same error again . Then I tried various things.

Then I commented out all the side output streams and opened them one by one, and an error was reported every time the last side output stream was opened.

Very depressed! I began to doubt metaphysics. . . .

Finally, a sudden inspiration. Looking at the resource manager, I suspect that there is not enough memory.

But I found that an error was reported right after the program was started, and there was nothing in the memory. But when I started the program, I found the picture below. . . . . .

This picture shows the CPU resource usage. It is found that the CPU usage rate reaches 100% at the moment the program is started.

At this time, the CPU is suspected.

Sure enough, the degree of parallelism was modified to 3.

env.setParallelism(3);

Then restart the program and the normal CPU resource usage diagram is as follows:

The startup moment did not reach 100% and started normally. . . .

Happy. . .

Guess you like

Origin blog.csdn.net/zhangyupeng0528/article/details/111308323