Resolving an infinite loop in Ice

To view per-thread resource usage of a process (15047 is the process PID):

ps -Lp 15047  cu

top -H -p 15047

1. First, find out which processes have high CPU usage, using the command ps ux.

$ps ux
USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
admin 1502 0.0 0.0 51172 1032 ? S 11:04 0:00 sshd: admin@pts/1
admin 1503 0.0 0.0 68136 1512 pts/1 Ss 11:04 0:00 -bash
admin 1555 0.0 0.0 96640 3356 pts/1 S+ 11:04 0:00 vim jstack15047.12.2
admin 1993 0.0 0.0 51172 1032 ? S 11:06 0:00 sshd: admin@pts/2
admin 1994 0.0 0.0 68136 1492 pts/2 Ss 11:06 0:00 -bash
admin 2038 0.0 0.0 65576 912 pts/2 R+ 11:06 0:00 ps ux
admin 10191 0.2 0.4 670904 23880 ? Sl 09:31 0:13 /usr/alibaba/httpd/bin/httpd -d /home/admin/run/deploy
admin 10756 0.2 0.4 670476 23092 ? Sl 09:32 0:12 /usr/alibaba/httpd/bin/httpd -d /home/admin/run/deploy
admin 14467 0.2 0.4 671700 24436 ? Sl 09:47 0:10 /usr/alibaba/httpd/bin/httpd -d /home/admin/run/deploy
admin 15037 0.0 0.0 65908 1168 ? S Nov30 0:00 /bin/sh /usr/alibaba/jboss/bin/run.sh -Djboss.server.home.dir=/home/admin/run/deploy/../.myjboss -Djboss.server.home.url=file:/home/admi
admin 15047 25.4 42.9 2915448 2252040 ? Sl Nov30 312:31 /usr/alibaba/java/bin/java -Dprogram.name=run.sh -server -Xmx2g -Xms2g -Xmn256m -XX:PermSize=196m -Xss256k -XX:+DisableExplicitGC -XX:+U
admin 15834 0.0 0.0 3840 472 ? S Nov30 0:00 /usr/alibaba/cronolog/sbin/cronolog /home/admin/out/logs/443-error_log.%w
admin 15835 0.0 0.0 3840 480 ? S Nov30 0:00 /usr/alibaba/cronolog/sbin/cronolog /home/admin/out/logs/cookie_logs/%w/cookie_log
admin 15836 0.0 0.0 58900 612 ? S Nov30 0:00 /usr/bin/logger -p local2.info
admin 15837 0.0 0.0 3840 476 ? S Nov30 0:07 /usr/alibaba/cronolog/sbin/cronolog /home/admin/out/logs/jk_logs/%w/mod_jk.log
admin 16316 0.2 0.4 669448 21740 ? Sl 09:53 0:10 /usr/alibaba/httpd/bin/httpd -d /home/admin/run/deploy
admin 27702 0.0 0.0 51320 1060 ? S 10:39 0:00 sshd: admin@pts/0
admin 27703 0.0 0.0 68136 1524 pts/0 Ss+ 10:39 0:00 -bash
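
Rather than scanning the listing by eye, the output can be ranked on the %CPU column. The following is a minimal sketch, assuming the column layout shown above (PID in field 2, %CPU in field 3); the helper name top_cpu is only illustrative and is not part of the original procedure.

```shell
# top_cpu is a hypothetical helper: it reads `ps ux` output on stdin and
# prints "PID %CPU" for the N busiest processes (default 3).
top_cpu() {
    # Drop the header line, sort numerically on field 3 (highest first),
    # keep the first N rows, then print only PID and %CPU.
    tail -n +2 | sort -k3,3nr | head -n "${1:-3}" | awk '{ print $2, $3 }'
}

# Usage: ps ux | top_cpu 3
```

On the listing above this would put PID 15047 (the java process at 25.4% CPU) first.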

2. Check the CPU usage of each thread in the Java process, using the command: ps -Lp 15047  cu

$ps -Lp 15047  cu
USER       PID   LWP %CPU NLWP %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
......
admin    15047 25491 70.8  285 42.9 2915448 2252032 ?     Rl   10:29  22:35 java
admin    15047 25495 71.0  285 42.9 2915448 2252032 ?     Rl   10:29  22:34 java
admin    15047 25499  0.0  285 42.9 2915448 2252032 ?     Sl   10:29   0:00 java
admin    15047 25500  0.0  285 42.9 2915448 2252032 ?     Sl   10:29   0:00 java
admin    15047 25517  0.0  285 42.9 2915448 2252032 ?     Sl   10:30   0:00 java
admin    15047 25521  0.0  285 42.9 2915448 2252032 ?     Sl   10:30   0:00 java
admin    15047 25540 72.4  285 42.9 2915448 2252032 ?     Rl   10:30  22:31 java
admin    15047 25541  0.0  285 42.9 2915448 2252032 ?     Sl   10:30   0:00 java
admin    15047 25542  0.0  285 42.9 2915448 2252032 ?     Sl   10:30   0:00 java
admin    15047 25741 70.7  285 42.9 2915448 2252032 ?     Rl   10:31  21:33 java
admin    15047 25766  0.0  285 42.9 2915448 2252032 ?     Sl   10:31   0:00 java
admin    15047 26022  0.0  285 42.9 2915448 2252032 ?     Sl   10:31   0:00 java
admin    15047 26032 69.6  285 42.9 2915448 2252032 ?     Rl   10:32  20:38 java
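
With 285 threads in the process, filtering on the %CPU column is quicker than reading every row. This is a sketch under the assumption that the layout matches the header above (LWP in field 3, %CPU in field 4); busy_threads is a hypothetical helper name.

```shell
# busy_threads is a hypothetical helper: it reads `ps -Lp <pid> cu` output
# on stdin and prints "LWP %CPU" for threads at or above a threshold.
busy_threads() {
    # NR > 1 skips the header; "$4 + 0" forces a numeric comparison, so
    # non-numeric filler lines evaluate to 0 and are ignored.
    awk -v min="${1:-50}" 'NR > 1 && $4 + 0 >= min { print $3, $4 }'
}

# Usage: ps -Lp 15047 cu | busy_threads 50
```

On the output above, a threshold of 50 would keep the threads near 70% CPU (25491, 25495, 25540, 25741, 26032) and drop the idle ones.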

3. Look inside the busy threads to find the cause of the high load, using the command: jstack 15047.

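Take thread 25495 as an example. First convert 25495 to hexadecimal, which gives 6397; jstack reports this value as nid=0x6397. Then capture several jstack dumps over time and follow what thread 25495 is doing in each of them.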
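
The decimal-to-hex conversion can be done with printf. A small sketch; lwp_to_nid is a hypothetical helper name:

```shell
# lwp_to_nid is a hypothetical helper: it converts a decimal LWP from ps
# into the hexadecimal nid format that jstack prints.
lwp_to_nid() {
    printf '0x%x\n' "$1"
}

lwp_to_nid 25495   # prints 0x6397
```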

"ActiveMQ Session Task" prio=10 tid=0x000000004a598000 nid=0x6397 runnable [0x0000000044948000]
   java.lang.Thread.State: RUNNABLE
         at Ice.ConnectionI.sendRequest(ConnectionI.java:519)
         - locked <0x00002aaac2877ff8> (a Ice.ConnectionI)
         at IceInternal.Outgoing.invoke(Outgoing.java:72)
         at AliIMInterface._WWMessageInterfaceDelM.SendNotifyMessage(_WWMessageInterfaceDelM.java:36)
         at AliIMInterface.WWMessageInterfacePrxHelper.SendNotifyMessage(WWMessageInterfacePrxHelper.java:40)
         at AliIMInterface.WWMessageInterfacePrxHelper.SendNotifyMessage(WWMessageInterfacePrxHelper.java:18)

"ActiveMQ Session Task" prio=10 tid=0x000000004a598000 nid=0x6397 runnable [0x0000000044948000]
   java.lang.Thread.State: RUNNABLE
         at IceInternal.Outgoing.invoke(Outgoing.java:72)
         at AliIMInterface._WWMessageInterfaceDelM.SendNotifyMessage(_WWMessageInterfaceDelM.java:36)
         at AliIMInterface.WWMessageInterfacePrxHelper.SendNotifyMessage(WWMessageInterfacePrxHelper.java:40)
         at AliIMInterface.WWMessageInterfacePrxHelper.SendNotifyMessage(WWMessageInterfacePrxHelper.java:18)

"ActiveMQ Session Task" prio=10 tid=0x000000004a598000 nid=0x6397 runnable [0x0000000044947000]
   java.lang.Thread.State: RUNNABLE
         at java.lang.Throwable.fillInStackTrace(Native Method)
         - locked <0x00002aaab53435e8> (a IceInternal.LocalExceptionWrapper)
         at java.lang.Throwable.<init>(Throwable.java:181)
         at java.lang.Exception.<init>(Exception.java:29)
         at IceInternal.LocalExceptionWrapper.<init>(LocalExceptionWrapper.java:16)
         at Ice.ConnectionI.sendRequest(ConnectionI.java:530)
         - locked <0x00002aaac2877ff8> (a Ice.ConnectionI)
         at IceInternal.Outgoing.invoke(Outgoing.java:72)
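
Samples like the three above can be pulled out of full dumps automatically. The sketch below assumes, as in the usual jstack text output, that thread entries are separated by blank lines; thread_stack is a hypothetical helper name, and the sample count and sleep interval in the usage line are arbitrary.

```shell
# thread_stack is a hypothetical helper: it reads a jstack dump on stdin
# and prints the entry of the thread whose nid equals $1 (e.g. 0x6397).
thread_stack() {
    # An empty RS puts awk in paragraph mode, so each blank-line-separated
    # thread entry is one record. The trailing space in the key prevents
    # a match against a longer nid such as 0x63970.
    awk -v RS='' -v key="nid=$1 " 'index($0, key) { print }'
}

# Usage: take a few samples some seconds apart and compare them.
# for i in 1 2 3; do jstack 15047 | thread_stack 0x6397; echo; sleep 5; done
```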

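4. By following the code path shown in the jstack output and reading it alongside the source code, you can usually work out where the infinite loop is. In the samples above, the thread is RUNNABLE every time and is always inside IceInternal.Outgoing.invoke (Outgoing.java:72), sometimes in Ice.ConnectionI.sendRequest constructing a LocalExceptionWrapper, which suggests looking at the loop around that call first.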


Reposted from zl198751.iteye.com/blog/1312466