Troubleshooting threads that wait forever or stop unexpectedly

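Problem description: Over the past two days we kept receiving "lock hold timeout" alert SMS messages. The first alert was for the lock held by the "batch-deduction channel send task". I could not find the cause at that point, so I reset the lock status by hand as a stopgap and let business processing continue. The next day another alert arrived, this time for the lock held by the "batch-deduction scheduled query task". By then it was clear that something inside these threads was wrong, so I made time to investigate.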
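Investigation: I started with the business logs and found that the scheduled-task thread never printed its end-of-task log line. The possible causes that came to mind first: 1. heavy business volume making the thread run too long; 2. the thread waiting for a long time; 3. the thread being interrupted abnormally.
I then asked the operations team to export the thread dumps from those two days, and they revealed a clue. Part of the thread dump follows: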


Thread 160863: (state = BLOCKED)

  • sun.misc.Unsafe.park(boolean, long) @bci=0 (Compiled frame; information may be imprecise)
  • java.util.concurrent.locks.LockSupport.parkNanos(java.lang.Object, long) @bci=20, line=215 (Compiled frame)
  • java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(long) @bci=78, line=2078 (Compiled frame)
  • java.util.concurrent.LinkedBlockingQueue.poll(long, java.util.concurrent.TimeUnit) @bci=62, line=467 (Compiled frame)
  • org.apache.tomcat.util.threads.TaskQueue.poll(long, java.util.concurrent.TimeUnit) @bci=3, line=85 (Compiled frame)
  • org.apache.tomcat.util.threads.TaskQueue.poll(long, java.util.concurrent.TimeUnit) @bci=3, line=31 (Compiled frame)
  • java.util.concurrent.ThreadPoolExecutor.getTask() @bci=134, line=1066 (Compiled frame)
  • java.util.concurrent.ThreadPoolExecutor.runWorker(java.util.concurrent.ThreadPoolExecutor$Worker) @bci=26, line=1127 (Compiled frame)
  • java.util.concurrent.ThreadPoolExecutor$Worker.run() @bci=5, line=617 (Interpreted frame)
  • org.apache.tomcat.util.threads.TaskThread$WrappingRunnable.run() @bci=4, line=61 (Interpreted frame)
  • java.lang.Thread.run() @bci=11, line=745 (Compiled frame)

Thread 160862: (state = IN_NATIVE)

  • java.net.SocketInputStream.socketRead0(java.io.FileDescriptor, byte[], int, int, int) @bci=0 (Compiled frame; information may be imprecise)
  • java.net.SocketInputStream.socketRead(java.io.FileDescriptor, byte[], int, int, int) @bci=8, line=116 (Compiled frame)
  • java.net.SocketInputStream.read(byte[], int, int, int) @bci=79, line=170 (Compiled frame)
  • java.net.SocketInputStream.read(byte[], int, int) @bci=11, line=141 (Compiled frame)
  • sun.security.ssl.InputRecord.readFully(java.io.InputStream, byte[], int, int) @bci=21, line=465 (Compiled frame)
  • sun.security.ssl.InputRecord.read(java.io.InputStream, java.io.OutputStream) @bci=32, line=503 (Compiled frame)
  • sun.security.ssl.SSLSocketImpl.readRecord(sun.security.ssl.InputRecord, boolean) @bci=44, line=973 (Compiled frame)
  • sun.security.ssl.SSLSocketImpl.readDataRecord(sun.security.ssl.InputRecord) @bci=15, line=930 (Compiled frame)
  • sun.security.ssl.AppInputStream.read(byte[], int, int) @bci=72, line=105 (Compiled frame)
  • org.apache.http.impl.io.SessionInputBufferImpl.streamRead(byte[], int, int) @bci=16, line=137 (Compiled frame)
  • org.apache.http.impl.io.SessionInputBufferImpl.fillBuffer() @bci=68, line=153 (Compiled frame)
  • org.apache.http.impl.io.SessionInputBufferImpl.readLine(org.apache.http.util.CharArrayBuffer) @bci=227, line=282 (Compiled frame)
  • org.apache.http.impl.conn.DefaultHttpResponseParser.parseHead(org.apache.http.io.SessionInputBuffer) @bci=16, line=140 (Compiled frame)
  • org.apache.http.impl.conn.DefaultHttpResponseParser.parseHead(org.apache.http.io.SessionInputBuffer) @bci=2, line=57 (Compiled frame)
  • org.apache.http.impl.io.AbstractMessageParser.parse() @bci=38, line=259 (Compiled frame)
  • org.apache.http.impl.DefaultBHttpClientConnection.receiveResponseHeader() @bci=8, line=163 (Compiled frame)
  • org.apache.http.impl.conn.CPoolProxy.receiveResponseHeader() @bci=4, line=167 (Compiled frame)
  • org.apache.http.protocol.HttpRequestExecutor.doReceiveResponse(org.apache.http.HttpRequest, org.apache.http.HttpClientConnection, org.apache.http.protocol.HttpContext) @bci=85, line=273 (Compiled frame)
  • org.apache.http.protocol.HttpRequestExecutor.execute(org.apache.http.HttpRequest, org.apache.http.HttpClientConnection, org.apache.http.protocol.HttpContext) @bci=83, line=125 (Compiled frame)
  • org.apache.http.impl.execchain.MainClientExec.execute(org.apache.http.conn.routing.HttpRoute, org.apache.http.client.methods.HttpRequestWrapper, org.apache.http.client.protocol.HttpClientContext, org.apache.http.client.methods.HttpExecutionAware) @bci=714, line=271 (Compiled frame)
  • org.apache.http.impl.execchain.ProtocolExec.execute(org.apache.http.conn.routing.HttpRoute, org.apache.http.client.methods.HttpRequestWrapper, org.apache.http.client.protocol.HttpClientContext, org.apache.http.client.methods.HttpExecutionAware) @bci=447, line=184 (Compiled frame)
  • org.apache.http.impl.execchain.RedirectExec.execute(org.apache.http.conn.routing.HttpRoute, org.apache.http.client.methods.HttpRequestWrapper, org.apache.http.client.protocol.HttpClientContext, org.apache.http.client.methods.HttpExecutionAware) @bci=85, line=110 (Compiled frame)
  • org.apache.http.impl.client.InternalHttpClient.doExecute(org.apache.http.HttpHost, org.apache.http.HttpRequest, org.apache.http.protocol.HttpContext) @bci=168, line=184 (Compiled frame)
  • org.apache.http.impl.client.CloseableHttpClient.execute(org.apache.http.client.methods.HttpUriRequest, org.apache.http.protocol.HttpContext) @bci=52, line=82 (Compiled frame)
  • org.apache.http.impl.client.CloseableHttpClient.execute(org.apache.http.client.methods.HttpUriRequest) @bci=39, line=107 (Compiled frame)
  • org.apache.http.impl.client.CloseableHttpClient.execute(org.apache.http.client.methods.HttpUriRequest) @bci=2, line=55 (Compiled frame)
  • com.cly.paygw.service.communication.http.HttpClientService.messageSend(com.cly.paygw.domain.entity.PayGwContext) @bci=15, line=126 (Compiled frame)
  • com.cly.paygw.service.communication.ProtocolClientService.hand(com.cly.paygw.domain.entity.PayGwContext) @bci=7, line=33 (Compiled frame)
  • com.cly.paygw.service.communication.http.HttpClientManager.messageSend(com.cly.paygw.domain.entity.PayGwContext) @bci=100, line=104 (Compiled frame)
  • com.cly.paygw.service.message.MessageTransportImpl.submitCommunication(com.cly.paygw.domain.entity.PayGwContext) @bci=195, line=207 (Compiled frame)
  • com.cly.paygw.service.message.MessageTransportImpl.submit(com.cly.paygw.domain.entity.PayGwContext, java.util.List) @bci=43, line=142 (Compiled frame)
  • com.cly.paygw.service.message.MessageTransportImpl.submit(com.cly.paygw.domain.entity.PayGwContext, com.cly.paygw.common.share.enums.CsFlagEnum) @bci=22, line=96 (Compiled frame)
  • 。。。。。。

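Root cause: In the thread dump, the thread in state IN_NATIVE is executing native code, here socketRead0, which means it is blocked waiting for data on a socket. Following the stack down to the HttpClientService.messageSend frame shows that an HTTP request is what keeps waiting. That made the picture clear: all of our requests to external channels go through the HttpClient library, every alert involved a task that sends requests to a channel, and each time it was the same direct-connect bank, BANK. No timeout was configured for requests to BANK, so nothing ever gave up and the thread stayed stuck. At the time I could not tell whether the wait was in obtaining a connection from the pool, in connecting to the channel, or in reading the channel's response. The frames in this particular dump (socketRead0 beneath receiveResponseHeader) indicate that this thread, at least, was waiting to read the response.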
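To make the failure mode concrete, here is a minimal sketch that uses only JDK sockets rather than the project's HttpClient code. It connects to a local server that accepts the connection but never replies, which is an assumed stand-in for the unresponsive channel. The class and method names are hypothetical. Without the setSoTimeout call the read would block indefinitely, just like the socketRead0 frame in the dump; with it, the read fails with SocketTimeoutException and the thread can move on.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.net.SocketTimeoutException;

public class ReadTimeoutDemo {

    /**
     * Connects to a local server that never sends any data, then performs a
     * blocking read with the given read timeout.
     * Returns true if the read was ended by SocketTimeoutException.
     */
    static boolean readTimesOut(int readTimeoutMillis) {
        InetAddress loopback = InetAddress.getLoopbackAddress();
        // Port 0 lets the OS pick a free port. The server never calls accept()
        // or writes anything; the OS still completes the TCP handshake.
        try (ServerSocket silentServer = new ServerSocket(0, 1, loopback);
             Socket client = new Socket(loopback, silentServer.getLocalPort())) {
            // The default read timeout is 0, meaning "wait forever".
            client.setSoTimeout(readTimeoutMillis);
            client.getInputStream().read(); // blocks until data arrives or the timeout fires
            return false;                   // only reached if the read returned
        } catch (SocketTimeoutException e) {
            return true;                    // the timeout released the thread
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("read timed out: " + readTimesOut(200));
    }
}
```

The timeout turns a silent hang into an exception that the task can log and handle, so the lock is released through the normal error path instead of being held until the alert fires.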

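Fix: For requests to the BANK channel, set three timeouts: the timeout for obtaining a connection from the connection pool, the connect timeout, and the read timeout. Then reset the lock status to normal and restart the application.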
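The post does not show the actual change, so the following is only a sketch of how the three timeouts could be set with the Apache HttpClient 4.3+ API that appears in the stack trace. The class name BankHttpClientFactory and all timeout values are assumptions for illustration; real values should be chosen from the channel's observed latency. It needs the Apache HttpClient library on the classpath.

```java
import org.apache.http.client.config.RequestConfig;
import org.apache.http.impl.client.CloseableHttpClient;
import org.apache.http.impl.client.HttpClients;

// Hypothetical factory; not the author's code.
public class BankHttpClientFactory {

    // Assumed values in milliseconds, for illustration only.
    private static final int POOL_ACQUIRE_TIMEOUT_MS = 2_000;
    private static final int CONNECT_TIMEOUT_MS = 5_000;
    private static final int READ_TIMEOUT_MS = 30_000;

    public static CloseableHttpClient newClient() {
        RequestConfig config = RequestConfig.custom()
                // Max wait for a free connection from the connection pool.
                .setConnectionRequestTimeout(POOL_ACQUIRE_TIMEOUT_MS)
                // Max wait for the TCP connection to be established.
                .setConnectTimeout(CONNECT_TIMEOUT_MS)
                // Max period of inactivity while waiting for response data.
                .setSocketTimeout(READ_TIMEOUT_MS)
                .build();
        // Every request sent through this client inherits the timeouts above.
        return HttpClients.custom()
                .setDefaultRequestConfig(config)
                .build();
    }
}
```

Note that the socket timeout limits the gap between received data packets, not the total duration of the request, so a channel that trickles data slowly can still hold a thread longer than READ_TIMEOUT_MS. If only one channel needs different limits, the same RequestConfig can instead be applied to an individual request with setConfig.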

Reposted from blog.51cto.com/zhengjiang/2132919