本文基于OpenJDK 11
最近使用Spring Cloud Gateway
的时候,遇到了一个奇怪的问题:
线上有3个 API 网关实例,压力均衡,平稳运行3天后,突然有一个实例,CPU飚高,并且响应时间增加很多,从几十毫秒涨到了几分钟。
线上是 k8s 管理容器,立刻停掉了这个 pod,重建,恢复正常。
线上我们开启了 JFR
记录(可以参考我的另外系列文章:Java 监控 JFR),通过 JMC 查看下出问题的 JFR 记录。
首先我们来看 GC,我们的 GC 算法是 G1,主要通过 G1 Garbage Collection这个事件查看:
发现 GC 全部为 Young GC,且耗时比较正常,频率上也没有什么明显异常。
接下来来看,CPU 占用相关。直接看 Thread CPU Load 这个事件,看每个线程的 CPU 占用情况。发现reactor-http-epoll
线程池的线程,CPU 占用很高,加在一起,接近了 100%。
这些线程是 reactor-netty 处理业务的线程,观察其他实例,发现正常情况下,并不会有这么高的 CPU 负载。那么为啥会有这么高的负载呢?通过 Thread Dump 来看一下线程堆栈有何发现.
通过查看多个线程堆栈 dump,发现这些线程基本都处于 Runnable,并且执行的方法是原生方法,和StackWalker
相关,例如:
"reactor-http-epoll-2" #75 daemon prio=5 os_prio=0 cpu=25100145.64ms elapsed=306507.26s tid=0x0000556eddcbd000 nid=0x61 runnable [0x00007f8605443000]
java.lang.Thread.State: RUNNABLE
at java.lang.StackStreamFactory$AbstractStackWalker.callStackWalk([email protected]/Native Method) at java.lang.StackStreamFactory$AbstractStackWalker.beginStackWalk([email protected]/StackStreamFactory.java:370) at java.lang.StackStreamFactory$AbstractStackWalker.walk([email protected]/StackStreamFactory.java:243) at java.lang.StackWalker.walk([email protected]/StackWalker.java:498) at org.apache.logging.log4j.util.StackLocator.calcLocation(StackLocator.java:81) at org.apache.logging.log4j.util.StackLocatorUtil.calcLocation(StackLocatorUtil.java:76) at org.apache.logging.log4j.spi.AbstractLogger.getLocation(AbstractLogger.java:2201) at org.apache.logging.log4j.spi.AbstractLogger.logMessageTrackRecursion(AbstractLogger.java:2144) at org.apache.logging.log4j.spi.AbstractLogger.logMessageSafely(AbstractLogger.java:2127) at org.apache.logging.log4j.spi.AbstractLogger.logMessage(AbstractLogger.java:2020) at org.apache.logging.log4j.spi.AbstractLogger.logIfEnabled(AbstractLogger.java:1891) at org.apache.logging.log4j.spi.AbstractLogger.info(AbstractLogger.java:1436) at com.xxx.apigateway.filter.AccessCheckFilter.filter(AccessCheckFilter.java:144) at org.springframework.cloud.gateway.handler.FilteringWebHandler$GatewayFilterAdapter.filter(FilteringWebHandler.java:138) at org.springframework.cloud.gateway.filter.OrderedGatewayFilter.filter(OrderedGatewayFilter.java:44) at org.springframework.cloud.gateway.handler.FilteringWebHandler$DefaultGatewayFilterChain.lambda$filter$0(FilteringWebHandler.java:118) at org.springframework.cloud.gateway.handler.FilteringWebHandler$DefaultGatewayFilterChain$$Lambda$1265/0x0000000800b83440.get(Unknown Source) at reactor.core.publisher.MonoDefer.subscribe(MonoDefer.java:44) at reactor.core.publisher.InternalMonoOperator.subscribe(InternalMonoOperator.java:55) at reactor.core.publisher.MonoDefer.subscribe(MonoDefer.java:52) at reactor.core.publisher.InternalMonoOperator.subscribe(InternalMonoOperator.java:55) at reactor.core.publisher.MonoDefer.subscribe(MonoDefer.java:52) at reactor.core.publisher.InternalMonoOperator.subscribe(InternalMonoOperator.java:55) at reactor.core.publisher.MonoDefer.subscribe(MonoDefer.java:52) at reactor.core.publisher.InternalMonoOperator.subscribe(InternalMonoOperator.java:55) at reactor.core.publisher.MonoDefer.subscribe(MonoDefer.java:52) at reactor.core.publisher.InternalMonoOperator.subscribe(InternalMonoOperator.java:55) at reactor.core.publisher.MonoDefer.subscribe(MonoDefer.java:52) at reactor.core.publisher.InternalMonoOperator.subscribe(InternalMonoOperator.java:55) at reactor.core.publisher.MonoDefer.subscribe(MonoDefer.java:52) at reactor.core.publisher.InternalMonoOperator.subscribe(InternalMonoOperator.java:55) at reactor.core.publisher.MonoDefer.subscribe(MonoDefer.java:52) at reactor.core.publisher.InternalMonoOperator.subscribe(InternalMonoOperator.java:55) at reactor.core.publisher.MonoDefer.subscribe(MonoDefer.java:52) at reactor.core.publisher.Mono.subscribe(Mono.java:4105) at reactor.core.publisher.MonoIgnoreThen$ThenIgnoreMain.drain(MonoIgnoreThen.java:172) at reactor.core.publisher.MonoIgnoreThen.subscribe(MonoIgnoreThen.java:56) at reactor.core.publisher.InternalMonoOperator.subscribe(InternalMonoOperator.java:55) at reactor.core.publisher.MonoFlatMap$FlatMapMain.onNext(MonoFlatMap.java:150) at org.springframework.cloud.sleuth.instrument.reactor.ScopePassingSpanSubscriber.onNext(ScopePassingSpanSubscriber.java:90) at reactor.core.publisher.FluxSwitchIfEmpty$SwitchIfEmptySubscriber.onNext(FluxSwitchIfEmpty.java:67) at org.springframework.cloud.sleuth.instrument.reactor.ScopePassingSpanSubscriber.onNext(ScopePassingSpanSubscriber.java:90) at reactor.core.publisher.MonoNext$NextSubscriber.onNext(MonoNext.java:76) at org.springframework.cloud.sleuth.instrument.reactor.ScopePassingSpanSubscriber.onNext(ScopePassingSpanSubscriber.java:90) at reactor.core.publisher.FluxConcatMap$ConcatMapImmediate.innerNext(FluxConcatMap.java:274) at reactor.core.publisher.FluxConcatMap$ConcatMapInner.onNext(FluxConcatMap.java:851) at org.springframework.cloud.sleuth.instrument.reactor.ScopePassingSpanSubscriber.onNext(ScopePassingSpanSubscriber.java:90) at reactor.core.publisher.FluxMap$MapSubscriber.onNext(FluxMap.java:114) at org.springframework.cloud.sleuth.instrument.reactor.ScopePassingSpanSubscriber.onNext(ScopePassingSpanSubscriber.java:90) at reactor.core.publisher.FluxSwitchIfEmpty$SwitchIfEmptySubscriber.onNext(FluxSwitchIfEmpty.java:67) at org.springframework.cloud.sleuth.instrument.reactor.ScopePassingSpanSubscriber.onNext(ScopePassingSpanSubscriber.java:90) at reactor.core.publisher.Operators$MonoSubscriber.complete(Operators.java:1637) at reactor.core.publisher.MonoFlatMap$FlatMapInner.onNext(MonoFlatMap.java:241) at reactor.core.publisher.Operators$ScalarSubscription.request(Operators.java:2199) at reactor.core.publisher.MonoFlatMap$FlatMapInner.onSubscribe(MonoFlatMap.java:230) at reactor.core.publisher.MonoJust.subscribe(MonoJust.java:54) at reactor.core.publisher.InternalMonoOperator.subscribe(InternalMonoOperator.java:55) at reactor.core.publisher.MonoFlatMap$FlatMapMain.onNext(MonoFlatMap.java:150) at org.springframework.cloud.sleuth.instrument.reactor.ScopePassingSpanSubscriber.onNext(ScopePassingSpanSubscriber.java:90) at reactor.core.publisher.FluxMap$MapSubscriber.onNext(FluxMap.java:114) at org.springframework.cloud.sleuth.instrument.reactor.ScopePassingSpanSubscriber.onNext(ScopePassingSpanSubscriber.java:90) at reactor.core.publisher.MonoNext$NextSubscriber.onNext(MonoNext.java:76) at org.springframework.cloud.sleuth.instrument.reactor.ScopePassingSpanSubscriber.onNext(ScopePassingSpanSubscriber.java:90) at reactor.core.publisher.FluxConcatMap$ConcatMapImmediate.innerNext(FluxConcatMap.java:274) at reactor.core.publisher.FluxConcatMap$ConcatMapInner.onNext(FluxConcatMap.java:851) at org.springframework.cloud.sleuth.instrument.reactor.ScopePassingSpanSubscriber.onNext(ScopePassingSpanSubscriber.java:90) at reactor.core.publisher.FluxOnErrorResume$ResumeSubscriber.onNext(FluxOnErrorResume.java:73) at org.springframework.cloud.sleuth.instrument.reactor.ScopePassingSpanSubscriber.onNext(ScopePassingSpanSubscriber.java:90) at reactor.core.publisher.MonoPeekTerminal$MonoTerminalPeekSubscriber.onNext(MonoPeekTerminal.java:173) at org.springframework.cloud.sleuth.instrument.reactor.ScopePassingSpanSubscriber.onNext(ScopePassingSpanSubscriber.java:90) at reactor.core.publisher.Operators$MonoSubscriber.complete(Operators.java:1637) at reactor.core.publisher.MonoFilterWhen$MonoFilterWhenMain.innerResult(MonoFilterWhen.java:193) at reactor.core.publisher.MonoFilterWhen$FilterWhenInner.onNext(MonoFilterWhen.java:260) at reactor.core.publisher.MonoFilterWhen$FilterWhenInner.onNext(MonoFilterWhen.java:228) at reactor.core.publisher.Operators$ScalarSubscription.request(Operators.java:2199) at reactor.core.publisher.MonoFilterWhen$FilterWhenInner.onSubscribe(MonoFilterWhen.java:249) at reactor.core.publisher.MonoJust.subscribe(MonoJust.java:54) at reactor.core.publisher.Mono.subscribe(Mono.java:4105) at reactor.core.publisher.MonoFilterWhen$MonoFilterWhenMain.onNext(MonoFilterWhen.java:150) at reactor.core.publisher.Operators$ScalarSubscription.request(Operators.java:2199) at reactor.core.publisher.MonoFilterWhen$MonoFilterWhenMain.onSubscribe(MonoFilterWhen.java:103) at reactor.core.publisher.MonoJust.subscribe(MonoJust.java:54) at reactor.core.publisher.Mono.subscribe(Mono.java:4105) at reactor.core.publisher.FluxConcatMap$ConcatMapImmediate.drain(FluxConcatMap.java:441) at reactor.core.publisher.FluxConcatMap$ConcatMapImmediate.onNext(FluxConcatMap.java:243) at org.springframework.cloud.sleuth.instrument.reactor.ScopePassingSpanSubscriber.onNext(ScopePassingSpanSubscriber.java:90) at org.springframework.cloud.sleuth.instrument.reactor.ScopePassingSpanSubscriber.onNext(ScopePassingSpanSubscriber.java:90) at reactor.core.publisher.FluxDematerialize$DematerializeSubscriber.onNext(FluxDematerialize.java:91) at reactor.core.publisher.FluxDematerialize$DematerializeSubscriber.onNext(FluxDematerialize.java:38) at org.springframework.cloud.sleuth.instrument.reactor.ScopePassingSpanSubscriber.onNext(ScopePassingSpanSubscriber.java:90) at reactor.core.publisher.FluxIterable$IterableSubscription.slowPath(FluxIterable.java:243) at reactor.core.publisher.FluxIterable$IterableSubscription.request(FluxIterable.java:201) at org.springframework.cloud.sleuth.instrument.reactor.ScopePassingSpanSubscriber.request(ScopePassingSpanSubscriber.java:76) at reactor.core.publisher.FluxDematerialize$DematerializeSubscriber.request(FluxDematerialize.java:120) at org.springframework.cloud.sleuth.instrument.reactor.ScopePassingSpanSubscriber.request(ScopePassingSpanSubscriber.java:76) at org.springframework.cloud.sleuth.instrument.reactor.ScopePassingSpanSubscriber.request(ScopePassingSpanSubscriber.java:76) at reactor.core.publisher.FluxConcatMap$ConcatMapImmediate.onSubscribe(FluxConcatMap.java:228) at org.springframework.cloud.sleuth.instrument.reactor.ScopePassingSpanSubscriber.onSubscribe(ScopePassingSpanSubscriber.java:69) at org.springframework.cloud.sleuth.instrument.reactor.ScopePassingSpanSubscriber.onSubscribe(ScopePassingSpanSubscriber.java:69) at reactor.core.publisher.FluxDematerialize$DematerializeSubscriber.onSubscribe(FluxDematerialize.java:70) at org.springframework.cloud.sleuth.instrument.reactor.ScopePassingSpanSubscriber.onSubscribe(ScopePassingSpanSubscriber.java:69) at reactor.core.publisher.FluxIterable.subscribe(FluxIterable.java:139) at reactor.core.publisher.FluxIterable.subscribe(FluxIterable.java:63) at reactor.core.publisher.InternalFluxOperator.subscribe(InternalFluxOperator.java:53) at reactor.core.publisher.FluxDefer.subscribe(FluxDefer.java:54) at reactor.core.publisher.Mono.subscribe(Mono.java:4105) at reactor.core.publisher.FluxConcatMap$ConcatMapImmediate.drain(FluxConcatMap.java:441) at reactor.core.publisher.FluxConcatMap$ConcatMapImmediate.onNext(FluxConcatMap.java:243) at org.springframework.cloud.sleuth.instrument.reactor.ScopePassingSpanSubscriber.onNext(ScopePassingSpanSubscriber.java:90) at reactor.core.publisher.FluxIterable$IterableSubscription.slowPath(FluxIterable.java:243) at reactor.core.publisher.FluxIterable$IterableSubscription.request(FluxIterable.java:201) at org.springframework.cloud.sleuth.instrument.reactor.ScopePassingSpanSubscriber.request(ScopePassingSpanSubscriber.java:76) at reactor.core.publisher.FluxConcatMap$ConcatMapImmediate.onSubscribe(FluxConcatMap.java:228) at org.springframework.cloud.sleuth.instrument.reactor.ScopePassingSpanSubscriber.onSubscribe(ScopePassingSpanSubscriber.java:69) at reactor.core.publisher.FluxIterable.subscribe(FluxIterable.java:139) at reactor.core.publisher.FluxIterable.subscribe(FluxIterable.java:63) at reactor.core.publisher.InternalMonoOperator.subscribe(InternalMonoOperator.java:55) at reactor.core.publisher.MonoDefer.subscribe(MonoDefer.java:52) at reactor.core.publisher.InternalMonoOperator.subscribe(InternalMonoOperator.java:55) at reactor.core.publisher.MonoDefer.subscribe(MonoDefer.java:52) at reactor.core.publisher.InternalMonoOperator.subscribe(InternalMonoOperator.java:55) at