Apache Log4j是一个基于Java的日志记录工具,用起来非常方便,但是Log4j 1.x如果使用不慎,会引起死锁问题,进行导致整个网站的宕机

示例如下:

log4j 配置

<? xml  version = "1.0"  encoding = "UTF-8" ?>
<!DOCTYPE log4j:configuration PUBLIC "-//APACHE//DTD LOG4J 1.2//EN" "log4j.dtd">
< log4j:configuration  xmlns:log4j = "http://jakarta.apache.org/log4j/" >
     < appender  name = "console"  class = "org.apache.log4j.ConsoleAppender" >
         < param  name = "Target"  value = "System.out" />
         < layout  class = "org.apache.log4j.PatternLayout" >
             < param  name = "ConversionPattern"  value = "%d %t %-5p (%c:%L) - %m%n" />
         </ layout >
     </ appender >
     < logger  name = "a"  additivity = "false" >
         < level  value = "info" />
         < appender-ref  ref = "console" />
     </ logger >
     < logger  name = "b"  additivity = "false" >
         < level  value = "info" />
         < appender-ref  ref = "console" />
     </ logger >
     < root >
         < appender-ref  ref = "console" />
     </ root >
</ log4j:configuration >

使用logger a进行日志输出的java文件

package  a;
 
import  org.apache.log4j.Logger;
 
public  class  A {
 
     private  static  final  Logger logger = Logger.getLogger(A. class );
 
     public  A() {
         try  {
             logger.info( "[A.A()] class A" );
         catch  (Exception ex) {
             logger.error( "[A.A()]" , ex);
         }
     }
}

使用logger b进行日志输出的java文件

package  b;
import  org.apache.log4j.Logger;
import   a.*;
public  class  B {
     private  static  final  Logger logger = Logger.getLogger(B. class .getName());
 
     public  static  void  main(String[] args) {
         new  Thread( new  Runnable() {
             public  void  run() {
                 logger.info( new  B());
             }
         }).start();
         new  A();
     }
 
     public  String toString() {
         logger.info( "[B.toString]" );
         new  A();
         return  "[B.toString] finish" ;
     }
}

运行上面的程序,相当大的概率,程序会卡在这里不动,通过jstack命令,我们可以发现程序死锁了

jstack 3876
2016-12-22 16:54:38
Full  thread  dump Java HotSpot(TM) 64-Bit Server VM (25.65-b01 mixed mode):
 
"Attach Listener"  #12 daemon prio=9 os_prio=31 tid=0x00007fb932944000 nid=0x3b0b waiting on condition [0x0000000000000000]
    java.lang.Thread.State: RUNNABLE
 
"Thread-0"  #11 prio=5 os_prio=31 tid=0x00007fb9341c1800 nid=0x5903 waiting  for  monitor entry [0x000070000175b000]
    java.lang.Thread.State: BLOCKED (on object monitor)
     at org.apache.log4j.Category.callAppenders(Category.java:205)
     - waiting to lock  (a org.apache.log4j.Logger)
     at org.apache.log4j.Category.forcedLog(Category.java:391)
     at org.apache.log4j.Category.info(Category.java:666)
     at a.A.(A.java:11)
     at b.B.toString(B.java:18)
     at org.apache.log4j.or.DefaultRenderer.doRender(DefaultRenderer.java:37)
     at org.apache.log4j.or.RendererMap.findAndRender(RendererMap.java:80)
     at org.apache.log4j.spi.LoggingEvent.getRenderedMessage(LoggingEvent.java:368)
     at org.apache.log4j.helpers.PatternParser$BasicPatternConverter.convert(PatternParser.java:402)
     at org.apache.log4j.helpers.PatternConverter.format(PatternConverter.java:65)
     at org.apache.log4j.PatternLayout.format(PatternLayout.java:506)
     at org.apache.log4j.WriterAppender.subAppend(WriterAppender.java:310)
     at org.apache.log4j.WriterAppender.append(WriterAppender.java:162)
     at org.apache.log4j.AppenderSkeleton.doAppend(AppenderSkeleton.java:251)
     - locked  (a org.apache.log4j.ConsoleAppender)
     at org.apache.log4j.helpers.AppenderAttachableImpl.appendLoopOnAppenders(AppenderAttachableImpl.java:66)
     at org.apache.log4j.Category.callAppenders(Category.java:206)
     - locked  (a org.apache.log4j.Logger)
     at org.apache.log4j.Category.forcedLog(Category.java:391)
     at org.apache.log4j.Category.info(Category.java:666)
     at b.B$1.run(B.java:10)
     at java.lang.Thread.run(Thread.java:745)
 
"Monitor Ctrl-Break"  #10 daemon prio=5 os_prio=31 tid=0x00007fb9358de800 nid=0x5703 runnable [0x0000700001658000]
    java.lang.Thread.State: RUNNABLE
     at java.net.PlainSocketImpl.socketAccept(Native Method)
     at java.net.AbstractPlainSocketImpl.accept(AbstractPlainSocketImpl.java:409)
     at java.net.ServerSocket.implAccept(ServerSocket.java:545)
     at java.net.ServerSocket.accept(ServerSocket.java:513)
     at com.intellij.rt.execution.application.AppMain$1.run(AppMain.java:90)
     at java.lang.Thread.run(Thread.java:745)
 
"Service Thread"  #9 daemon prio=9 os_prio=31 tid=0x00007fb933002800 nid=0x5303 runnable [0x0000000000000000]
    java.lang.Thread.State: RUNNABLE
 
"C1 CompilerThread3"  #8 daemon prio=9 os_prio=31 tid=0x00007fb936020800 nid=0x5103 waiting on condition [0x0000000000000000]
    java.lang.Thread.State: RUNNABLE
 
"C2 CompilerThread2"  #7 daemon prio=9 os_prio=31 tid=0x00007fb935007800 nid=0x4f03 waiting on condition [0x0000000000000000]
    java.lang.Thread.State: RUNNABLE
 
"C2 CompilerThread1"  #6 daemon prio=9 os_prio=31 tid=0x00007fb935006800 nid=0x4d03 waiting on condition [0x0000000000000000]
    java.lang.Thread.State: RUNNABLE
 
"C2 CompilerThread0"  #5 daemon prio=9 os_prio=31 tid=0x00007fb933801000 nid=0x4b03 waiting on condition [0x0000000000000000]
    java.lang.Thread.State: RUNNABLE
 
"Signal Dispatcher"  #4 daemon prio=9 os_prio=31 tid=0x00007fb935837800 nid=0x3a13 runnable [0x0000000000000000]
    java.lang.Thread.State: RUNNABLE
 
"Finalizer"  #3 daemon prio=8 os_prio=31 tid=0x00007fb933802000 nid=0x3803 in Object.wait() [0x0000700000d3a000]
    java.lang.Thread.State: WAITING (on object monitor)
     at java.lang.Object.wait(Native Method)
     - waiting on  (a java.lang.ref.ReferenceQueue$Lock)
     at java.lang.ref.ReferenceQueue. remove (ReferenceQueue.java:143)
     - locked  (a java.lang.ref.ReferenceQueue$Lock)
     at java.lang.ref.ReferenceQueue. remove (ReferenceQueue.java:164)
     at java.lang.ref.Finalizer$FinalizerThread.run(Finalizer.java:209)
 
"Reference Handler"  #2 daemon prio=10 os_prio=31 tid=0x00007fb936009000 nid=0x3603 in Object.wait() [0x0000700000c37000]
    java.lang.Thread.State: WAITING (on object monitor)
     at java.lang.Object.wait(Native Method)
     - waiting on  (a java.lang.ref.Reference$Lock)
     at java.lang.Object.wait(Object.java:502)
     at java.lang.ref.Reference$ReferenceHandler.run(Reference.java:157)
     - locked  (a java.lang.ref.Reference$Lock)
 
"main"  #1 prio=5 os_prio=31 tid=0x00007fb935801800 nid=0x1703 waiting  for  monitor entry [0x0000700000218000]
    java.lang.Thread.State: BLOCKED (on object monitor)
     at org.apache.log4j.AppenderSkeleton.doAppend(AppenderSkeleton.java:231)
     - waiting to lock  (a org.apache.log4j.ConsoleAppender)
     at org.apache.log4j.helpers.AppenderAttachableImpl.appendLoopOnAppenders(AppenderAttachableImpl.java:66)
     at org.apache.log4j.Category.callAppenders(Category.java:206)
     - locked  (a org.apache.log4j.Logger)
     at org.apache.log4j.Category.forcedLog(Category.java:391)
     at org.apache.log4j.Category.info(Category.java:666)
     at a.A.(A.java:11)
     at b.B.main(B.java:13)
     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
     at java.lang.reflect.Method.invoke(Method.java:497)
     at com.intellij.rt.execution.application.AppMain.main(AppMain.java:144)
 
"VM Thread"  os_prio=31 tid=0x00007fb936006800 nid=0x3403 runnable 
 
"GC task thread#0 (ParallelGC)"  os_prio=31 tid=0x00007fb93580a800 nid=0x2403 runnable 
 
"GC task thread#1 (ParallelGC)"  os_prio=31 tid=0x00007fb932804800 nid=0x2603 runnable 
 
"GC task thread#2 (ParallelGC)"  os_prio=31 tid=0x00007fb93580b000 nid=0x2803 runnable 
 
"GC task thread#3 (ParallelGC)"  os_prio=31 tid=0x00007fb936003000 nid=0x2a03 runnable 
 
"GC task thread#4 (ParallelGC)"  os_prio=31 tid=0x00007fb93580b800 nid=0x2c03 runnable 
 
"GC task thread#5 (ParallelGC)"  os_prio=31 tid=0x00007fb936003800 nid=0x2e03 runnable 
 
"GC task thread#6 (ParallelGC)"  os_prio=31 tid=0x00007fb936004000 nid=0x3003 runnable 
 
"GC task thread#7 (ParallelGC)"  os_prio=31 tid=0x00007fb936004800 nid=0x3203 runnable 
 
"VM Periodic Task Thread"  os_prio=31 tid=0x00007fb935838000 nid=0x5503 waiting on condition 
 
JNI global references: 60
 
 
Found one Java-level deadlock:
=============================
"Thread-0" :
   waiting to lock monitor 0x00007fb93600edf8 (object 0x000000076b2a21d0, a org.apache.log4j.Logger),
   which is held by  "main"
"main" :
   waiting to lock monitor 0x00007fb93600d8a8 (object 0x000000076b311008, a org.apache.log4j.ConsoleAppender),
   which is held by  "Thread-0"
 
Java stack information  for  the threads listed above:
===================================================
"Thread-0" :
     at org.apache.log4j.Category.callAppenders(Category.java:205)
     - waiting to lock  (a org.apache.log4j.Logger)
     at org.apache.log4j.Category.forcedLog(Category.java:391)
     at org.apache.log4j.Category.info(Category.java:666)
     at a.A.(A.java:11)
     at b.B.toString(B.java:18)
     at org.apache.log4j.or.DefaultRenderer.doRender(DefaultRenderer.java:37)
     at org.apache.log4j.or.RendererMap.findAndRender(RendererMap.java:80)
     at org.apache.log4j.spi.LoggingEvent.getRenderedMessage(LoggingEvent.java:368)
     at org.apache.log4j.helpers.PatternParser$BasicPatternConverter.convert(PatternParser.java:402)
     at org.apache.log4j.helpers.PatternConverter.format(PatternConverter.java:65)
     at org.apache.log4j.PatternLayout.format(PatternLayout.java:506)
     at org.apache.log4j.WriterAppender.subAppend(WriterAppender.java:310)
     at org.apache.log4j.WriterAppender.append(WriterAppender.java:162)
     at org.apache.log4j.AppenderSkeleton.doAppend(AppenderSkeleton.java:251)
     - locked  (a org.apache.log4j.ConsoleAppender)
     at org.apache.log4j.helpers.AppenderAttachableImpl.appendLoopOnAppenders(AppenderAttachableImpl.java:66)
     at org.apache.log4j.Category.callAppenders(Category.java:206)
     - locked  (a org.apache.log4j.Logger)
     at org.apache.log4j.Category.forcedLog(Category.java:391)
     at org.apache.log4j.Category.info(Category.java:666)
     at b.B$1.run(B.java:10)
     at java.lang.Thread.run(Thread.java:745)
"main" :
     at org.apache.log4j.AppenderSkeleton.doAppend(AppenderSkeleton.java:231)
     - waiting to lock  (a org.apache.log4j.ConsoleAppender)
     at org.apache.log4j.helpers.AppenderAttachableImpl.appendLoopOnAppenders(AppenderAttachableImpl.java:66)
     at org.apache.log4j.Category.callAppenders(Category.java:206)
     - locked  (a org.apache.log4j.Logger)
     at org.apache.log4j.Category.forcedLog(Category.java:391)
     at org.apache.log4j.Category.info(Category.java:666)
     at a.A.(A.java:11)
     at b.B.main(B.java:13)
     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
     at java.lang.reflect.Method.invoke(Method.java:497)
     at com.intellij.rt.execution.application.AppMain.main(AppMain.java:144)
 
Found 1 deadlock.

jstack命令的最后,给我们一个很重要的提示,就是程序发生了死锁,那么接下来,就是要分析程序为什么会产生死锁, jstack命令也给了相应的提示信息,如下所示

"Thread-0" :
   waiting to lock monitor  0x00007fb93600edf8  (object  0x000000076b2a21d0 , a org.apache.log4j.Logger),
   which is held by  "main"
"main" :
   waiting to lock monitor  0x00007fb93600d8a8  (object  0x000000076b311008 , a org.apache.log4j.ConsoleAppender),
   which is held by  "Thread-0"

Thread-0尝试去lock一个Logger, org.apache.log4j.Logger这个类继承于org.apache.log4j.category,而调用的info方法在category这个类中,由org.apache.log4j.category中这段源码可以发现,Thread-0实际上是去尝试lock一个category

public   void  callAppenders(LoggingEvent event) {
     int  writes =  0 ;
     for (Category c =  this ; c !=  null ; c=c.parent) {
        // Protected against simultaneous call to addAppender, removeAppender,...
           synchronized (c) {
             if (c.aai !=  null ) {
                  writes += c.aai.appendLoopOnAppenders(event);
             }
             if (!c.additive) {
                  break ;
             }
           }
     }
     if (writes ==  0 ) {
       repository.emitNoAppenderWarning( this );
     }
}

从上面的代码中可以看出Thread-0实际上在等待lock一个category,再结合上面的log4j的配置文件,很容易的发现,Thread-0会使用两个logger文件,即name=a与name=b的logger文件均会使用,当调用logger.info(new B())的时候,实际上最后会被自动的转化为对new B().toString()方法的调用,在这个方法里面即输出了A的日志,又输出了B的日志,再结合jstack命令下面的描述,可以发现再lock [name=a的logger时,由于已经被其他线程占用,因此 Thread-0 block]

"Thread-0":
     at org.apache.log4j.Category.callAppenders(Category.java:205)
     - waiting to lock  (a org.apache.log4j.Logger)
     at org.apache.log4j.Category.forcedLog(Category.java:391)
     at org.apache.log4j.Category.info(Category.java:666)
     at a.A.(A.java:11)
     at b.B.toString(B.java:18)
     at org.apache.log4j.or.DefaultRenderer.doRender(DefaultRenderer.java:37)

综上所述,Thread-0线程拥有了logger name=b的category,ConsoleAppender ,且正在尝试去获取logger name=a的category

jstack命令显示main线程正在wait一个ConsoleAppender,由源码可以发现ConsoleAppender继承于AppenderSkeleton,再结合以下输出

     at org.apache.log4j.AppenderSkeleton.doAppend(AppenderSkeleton.java:231)
     - waiting to lock  (a org.apache.log4j.ConsoleAppender)
     at org.apache.log4j.helpers.AppenderAttachableImpl.appendLoopOnAppenders(AppenderAttachableImpl.java:66)
     at org.apache.log4j.Category.callAppenders(Category.java:206)

可以发现,main线程在执行AppenderSkeleton.java中doAppend方法时发生了wait,看源码如下所示

public
synchronized 
void  doAppend(LoggingEvent event) {
   if (closed) {
     LogLog.error( "Attempted to append to closed appender named [" +name+ "]." );
     return ;
   }
   
   if (!isAsSevereAsThreshold(event.getLevel())) {
     return ;
   }
 
   Filter f =  this .headFilter;
   
   FILTER_LOOP:
   while (f !=  null ) {
     switch (f.decide(event)) {
     case  Filter.DENY:  return ;
     case  Filter.ACCEPT:  break  FILTER_LOOP;
     case  Filter.NEUTRAL: f = f.getNext();
     }
   }
   
   this .append(event);    
}

看到了synchronized,也就明白了,main线程无法获取到ConsoleAppender的对象锁,因为此对象锁正在被Thread-0占用,综上所述,main线程拥有了logger name=a的category,正在尝试去获取ConsoleAppender的对象锁,去进行日志输出,因此两线程死锁了

Log4j的日志输出流程



 

由上面的流程图可以看出,在一个日志的输出过程中,会尝试进行两次lock,在高并发的情况下,两次lock就很容易造成死锁,上面产生死锁的可能执行序列如下:

1.Thread-0 lock b的category

2.Thread-0 lock ConsoleAppender 

3.main线程 lock a的category 

4.Thread-0 尝试去获取 a的category,因为要进行日志a的输出了, block

5.main线程尝试去lock ConsoleAppender,由于被Thread-0占用, bloc

最终死锁了!

找到死锁发生的原因,那如何解决这个问题:

取决于log4j配置文件的合理性

log4j的上面加上slf4j,将要输出的日志提前格式化成一个字符串,对于上面的例子,也就是将logger.info(new A()) 中new A().toString()提前调用,没有slf4j时的调用序列是

lock b->lock consoleappender->lock a->lock consoleappender 是一个步骤完成的

加入slf4j后的设用序列就变成了

lock a -> lock consoleappender 完成了第一阶段new A()

lock b -> lock consoleappender 完成了第二阶段 logger.info(string)

加入slf4j后可以发现的是日志的输出分成两步执行了,因为执行logger.info时,会首先对要输出的内容进行格式化,将其格式化成一个string, so,不会发生死锁了。

转载原文:http://www.aiprograming.com/b/pengpeng/23