Do you really understand locks? Can you use them? (A deep dive into the underlying implementation of synchronized)

There is a lot to multithreading, and it is very interesting, so that is where my focus has been lately. I hope you like it.

1. Optimistic lock & pessimistic lock

1.1 Scenarios
We normally use synchronized in the following scenarios:

Applied to an instance method, it locks the current instance, this

public class Synchronized {
    public synchronized void husband() {
    }
}

Applied to a static method, it locks the Class object of the current class

public class Synchronized {
    // Equivalent to wrapping the body in synchronized (Synchronized.class) { ... }
    public static synchronized void husband() {
    }
}

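Applied to a code block, it locks the object you specify

public class Synchronized {
    // Threads must share one lock object; locking a freshly created object protects nothing
    private final Object lock = new Object();

    public void husband() {
        synchronized (lock) {
        }
    }
}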
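To make the three cases concrete, here is a small sketch that asks the JVM which monitor the current thread holds in each form. The class and method names are made up for illustration; Thread.holdsLock is the standard JDK method.

```java
// Sketch: which monitor each form of synchronized acquires.
final class LockTargets {
    private final Object lock = new Object(); // hypothetical dedicated lock object

    // Instance method: the monitor belongs to the receiver, this.
    synchronized boolean instanceMethodHoldsThis() {
        return Thread.holdsLock(this);
    }

    // Static method: the monitor belongs to the Class object.
    static synchronized boolean staticMethodHoldsClass() {
        return Thread.holdsLock(LockTargets.class);
    }

    // Code block: the monitor belongs to whatever object the expression names.
    boolean blockHoldsOnlyLockObject() {
        synchronized (lock) {
            return Thread.holdsLock(lock) && !Thread.holdsLock(this);
        }
    }

    public static void main(String[] args) {
        LockTargets t = new LockTargets();
        System.out.println(t.instanceMethodHoldsThis());
        System.out.println(staticMethodHoldsClass());
        System.out.println(t.blockHoldsOnlyLockObject());
    }
}
```

Because these are three different monitors, a thread inside the instance method does not exclude a thread inside the static method.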

In short, synchronized can lock a method, a code block, or an object. How is the locking actually implemented?

Before that, let me go over the layout of a Java object.
In the JVM, an object in memory is divided into three areas:

Object header
- Mark Word: by default it stores the object's hash code, GC generational age, and lock flag bits. It reuses its own storage depending on the object's state; that is, the data in the Mark Word changes at run time as the lock flag changes.
- Klass Pointer (type pointer): the object's pointer to its class metadata. The virtual machine uses this pointer to determine which class the object is an instance of.
Instance data
- This part stores the object's field values, including fields inherited from parent classes.
Padding
- The virtual machine requires the start address of an object to be a multiple of 8 bytes, so padding is added only when it is needed for alignment; it carries no data.
  Tip: Have you ever been asked how many bytes an empty object occupies? On a 32-bit JVM it is 8 bytes (a 4-byte Mark Word plus a 4-byte Klass pointer). On a 64-bit HotSpot JVM with compressed class pointers it is 16 bytes: an 8-byte Mark Word, a 4-byte Klass pointer, and 4 bytes of padding to reach a multiple of 8.
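The alignment rule can be sketched as a simple rounding calculation. The header sizes used here are assumptions (8 bytes on a 32-bit JVM, 12 bytes on a 64-bit JVM with compressed class pointers), and the sketch ignores per-field alignment, so treat it as arithmetic rather than a measurement.

```java
// Sketch of the 8-byte alignment arithmetic; not a real object-size measurement.
final class ObjectSizeSketch {
    static final int ALIGNMENT = 8;

    // headerBytes: assumed header size (Mark Word + Klass pointer)
    // fieldBytes:  total size of the instance fields
    static int alignedSize(int headerBytes, int fieldBytes) {
        int raw = headerBytes + fieldBytes;
        // Round up to the next multiple of ALIGNMENT
        return (raw + ALIGNMENT - 1) / ALIGNMENT * ALIGNMENT;
    }

    public static void main(String[] args) {
        System.out.println(alignedSize(8, 0));   // empty object, 8-byte header
        System.out.println(alignedSize(12, 0));  // empty object, 12-byte header
        System.out.println(alignedSize(12, 4));  // one int field fits in the gap
    }
}
```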

We often talk about ordering, visibility, and atomicity. How does synchronized provide them?

1.2 Ordering

As I said in the Volatile chapter, the compiler and the CPU may reorder instructions to optimize our code.

as-if-serial

No matter how the compiler and the CPU reorder instructions, the result of the program must stay correct in the single-threaded case, and operations with a data dependency between them must not be reordered.

For example:

int a = 1;
int b = a;

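These two statements cannot be reordered: the value of b depends on the value of a, so a must be assigned first. Because only one thread at a time executes a synchronized block, the code inside it runs as if single-threaded, which is how synchronized preserves ordering.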

1.3 Visibility

Also in the Volatile chapter, I introduced the memory structure of modern computers and the JMM (Java Memory Model). Note that the JMM is not a physical thing but a specification. It describes the access rules for variables in Java programs (variables shared between threads), along with the low-level details of how the JVM stores variables to memory and reads them back. In other words, the Java Memory Model is a set of rules and guarantees around the visibility, ordering, and atomicity of shared data.

If you are interested, it is worth learning about the components of a computer (CPU, memory, multi-level caches, and so on); that helps explain why Java works this way.
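For synchronized specifically, the JMM rule is that releasing a monitor happens-before any later acquisition of the same monitor, so writes made under the lock are visible to the next thread that takes it. Here is a minimal sketch of that guarantee; the class and method names are hypothetical.

```java
// Sketch: a plain (non-volatile) flag made visible through synchronized accessors.
final class VisibleFlag {
    private boolean stopped; // deliberately not volatile

    synchronized void stop() {
        stopped = true;
    }

    synchronized boolean isStopped() {
        return stopped;
    }

    // Returns true if the worker saw the update and exited within the timeout.
    static boolean workerSeesUpdate() {
        VisibleFlag flag = new VisibleFlag();
        Thread worker = new Thread(() -> {
            while (!flag.isStopped()) {
                // busy-wait; every check acquires the same monitor as stop()
            }
        });
        worker.start();
        flag.stop();
        try {
            worker.join(5_000);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
        return !worker.isAlive();
    }
}
```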


1.3 Atomicity

Guaranteeing atomicity is simple: make sure that only one thread at a time can hold the lock and enter the code block.
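As a sketch of that idea, the counter below is incremented from several threads. The read-modify-write in count++ is not atomic on its own, but with synchronized only one thread at a time can run it. The names are made up for illustration.

```java
// Sketch: synchronized makes the read-modify-write of count++ atomic.
final class SafeCounter {
    private int count;

    synchronized void increment() {
        count++;
    }

    synchronized int get() {
        return count;
    }

    // Runs the given number of threads, each incrementing the counter, and returns the total.
    static int countWithThreads(int threads, int incrementsPerThread) {
        SafeCounter counter = new SafeCounter();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < incrementsPerThread; j++) {
                    counter.increment();
                }
            });
            workers[i].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return counter.get();
    }
}
```

With the synchronized keyword removed from increment(), the total would usually come out lower than threads * incrementsPerThread.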
Those are the properties we usually rely on when using locks. What properties does synchronized itself have?

1.4 Reentrancy

synchronized keeps a counter for the locked object that records how many times the owning thread has acquired the lock. Each time a corresponding code block finishes, the counter is decremented by 1; when it reaches 0, the lock is released.
What are the benefits of reentry?

It can avoid some deadlock situations, and it also allows us to better encapsulate our code.
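A minimal sketch of reentrancy (the names are hypothetical): outer() already holds the monitor on this when it calls inner(), which needs the same monitor. Without reentrancy that call would wait forever on a lock its own thread holds.

```java
// Sketch: a thread re-entering a monitor it already owns.
final class Reentrant {
    private int depth;

    synchronized int outer() {
        depth++;
        return inner(); // re-enters the monitor held by this thread
    }

    synchronized int inner() {
        depth++;
        return depth;
    }

    public static void main(String[] args) {
        System.out.println(new Reentrant().outer()); // prints 2
    }
}
```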

1.5 Uninterruptibility

Uninterruptible means that once one thread holds the lock, any other thread trying to acquire it is blocked or waiting. Until the first thread releases the lock, the second stays blocked or waiting and cannot be interrupted.

By contrast, Lock offers interruptible acquisition: lockInterruptibly() and the timed tryLock(long, TimeUnit) both respond to interrupts.
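For contrast with synchronized, here is a sketch of a thread that gives up waiting for a Lock when it is interrupted. It uses ReentrantLock.lockInterruptibly() from the JDK; the class and method names are made up for illustration.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.locks.ReentrantLock;

// Sketch: a waiter on a Lock can be interrupted, unlike a thread blocked on synchronized.
final class InterruptibleWait {
    static boolean waiterWasInterrupted() {
        ReentrantLock lock = new ReentrantLock();
        CountDownLatch started = new CountDownLatch(1);
        boolean[] interrupted = {false};

        lock.lock(); // the calling thread keeps the lock, so the waiter can never get it
        try {
            Thread waiter = new Thread(() -> {
                started.countDown();
                try {
                    lock.lockInterruptibly();
                    lock.unlock(); // not reached in this sketch
                } catch (InterruptedException e) {
                    interrupted[0] = true; // the wait was abandoned
                }
            });
            waiter.start();
            started.await();
            waiter.interrupt();
            waiter.join();
            return interrupted[0];
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } finally {
            lock.unlock();
        }
    }
}
```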

2. Low-level implementation

Seeing the implementation is simple. I wrote a small class with a synchronized method and a synchronized code block; disassembling the class file shows what happens.

First, the test class:

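/**
 * @Description TODO Synchronize
 * @Author: ZhangSan_Plus
 * @Date: 2020/6/15 15:03
 **/
public class Synchronized {
    public synchronized void husband() {
        // Lock the Class object, matching the "ldc ... class juc/Synchronized" in the bytecode below
        synchronized (Synchronized.class) {
        }
    }
}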

After compiling, go to the output directory and run javap -v xxx.class (-c alone prints the bytecode but not the constant pool and flags shown here) to view the disassembled class file:

Classfile /Users/aobing/IdeaProjects/Thanos/laogong/target/classes/juc/Synchronized.class
  Last modified 2020-5-17; size 375 bytes
  MD5 checksum 4f5451a229e80c0a6045b29987383d1a
  Compiled from "Synchronized.java"
public class juc.Synchronized
  minor version: 0
  major version: 49
  flags: ACC_PUBLIC, ACC_SUPER
Constant pool:
   #1 = Methodref          #3.#14         // java/lang/Object."<init>":()V
   #2 = Class              #15            // juc/Synchronized
   #3 = Class              #16            // java/lang/Object
   #4 = Utf8               <init>
   #5 = Utf8               ()V
   #6 = Utf8               Code
   #7 = Utf8               LineNumberTable
   #8 = Utf8               LocalVariableTable
   #9 = Utf8               this
  #10 = Utf8               Ljuc/Synchronized;
  #11 = Utf8               husband
  #12 = Utf8               SourceFile
  #13 = Utf8               Synchronized.java
  #14 = NameAndType        #4:#5          // "<init>":()V
  #15 = Utf8               juc/Synchronized
  #16 = Utf8               java/lang/Object
{
  public juc.Synchronized();
    descriptor: ()V
    flags: ACC_PUBLIC
    Code:
      stack=1, locals=1, args_size=1
         0: aload_0
         1: invokespecial #1                  // Method java/lang/Object."<init>":()V
         4: return
      LineNumberTable:
        line 8: 0
      LocalVariableTable:
        Start  Length  Slot  Name   Signature
            0       5     0  this   Ljuc/Synchronized;

  public synchronized void husband();
    descriptor: ()V
    flags: ACC_PUBLIC, ACC_SYNCHRONIZED  // here
    Code:
      stack=2, locals=3, args_size=1
         0: ldc           #2                  // class juc/Synchronized
         2: dup
         3: astore_1
         4: monitorenter   // here
         5: aload_1
         6: monitorexit    // here
         7: goto          15
        10: astore_2
        11: aload_1
        12: monitorexit    // here
        13: aload_2
        14: athrow
        15: return
      Exception table:
         from    to  target type
             5     7    10   any
            10    13    10   any
      LineNumberTable:
        line 10: 0
        line 12: 5
        line 13: 15
      LocalVariableTable:
        Start  Length  Slot  Name   Signature
            0      16     0  this   Ljuc/Synchronized;
}
SourceFile: "Synchronized.java"

2.1 Synchronization code

Look at the places I marked. I mentioned the object header at the beginning; each object is associated with a monitor.

When a thread executes monitorenter, it tries to take ownership of that monitor. If the monitor's entry count is 0, the thread enters, sets the count to 1, and becomes the owner.
If the thread already owns the monitor and enters again, the entry count is incremented by 1.
Likewise, each monitorexit decrements the entry count by 1; only when it reaches 0 can other threads acquire the monitor.
The second monitorexit sits on the exception path (see the exception table), so the lock is released even if the block throws.

All mutual exclusion here comes down to whether a thread can obtain ownership of the monitor. Whoever becomes the owner wins.
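The +1/-1 bookkeeping can be modelled in a few lines of Java. This is a toy model for illustration only, not how HotSpot implements monitors; it uses synchronized and wait/notifyAll itself simply to keep the model thread-safe, and all names are hypothetical.

```java
// Toy model of a monitor's owner and entry count (not HotSpot's implementation).
final class ToyMonitor {
    private Thread owner;
    private int entryCount;

    // Models monitorenter: wait until free or already ours, then count the entry.
    synchronized void enter() {
        Thread me = Thread.currentThread();
        boolean interrupted = false;
        while (owner != null && owner != me) {
            try {
                wait();
            } catch (InterruptedException e) {
                interrupted = true; // keep waiting: entry itself is not interruptible
            }
        }
        owner = me;
        entryCount++;
        if (interrupted) {
            me.interrupt(); // restore the interrupt status for the caller
        }
    }

    // Models monitorexit: only the owner may exit; release when the count hits 0.
    synchronized void exit() {
        if (owner != Thread.currentThread()) {
            throw new IllegalMonitorStateException();
        }
        if (--entryCount == 0) {
            owner = null;
            notifyAll();
        }
    }

    synchronized int entryCount() {
        return entryCount;
    }

    synchronized boolean isFree() {
        return owner == null;
    }
}
```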

2.2 Synchronization method

Did you notice the special flag on the method, ACC_SYNCHRONIZED?

For a synchronized method, when the method is invoked the JVM first checks whether this flag is set. If it is, the thread implicitly acquires the monitor before running the method body and releases it when the method returns or throws, which has the same effect as the monitorenter and monitorexit instructions above.

So in the end it is still competition for the monitor.
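The flag is also visible from Java through reflection. In this sketch (the class and method names are hypothetical; Method.getModifiers and Modifier.isSynchronized are standard JDK calls), a synchronized method reports the flag while a method that only contains a synchronized block does not.

```java
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;

// Sketch: reading the synchronized method flag through reflection.
final class SyncFlagDemo {
    synchronized void syncMethod() {
    }

    void blockOnly() {
        synchronized (this) {
            // compiled to monitorenter/monitorexit; the method itself carries no flag
        }
    }

    static boolean isSynchronized(String methodName) {
        try {
            Method method = SyncFlagDemo.class.getDeclaredMethod(methodName);
            return Modifier.isSynchronized(method.getModifiers());
        } catch (NoSuchMethodException e) {
            throw new IllegalArgumentException(e);
        }
    }
}
```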

2.3 monitor

I have mentioned this object many times. It may seem like an abstract concept, but it is not: the monitor is implemented in C++, in the virtual machine's ObjectMonitor.hpp file.

Looking at the source, its data structure is initialized like this:

 ObjectMonitor() {
    _header       = NULL;
    _count        = 0;
    _waiters      = 0,
    _recursions   = 0;     // number of times the thread has re-entered
    _object       = NULL;  // the object associated with this monitor
    _owner        = NULL;  // the thread that owns the monitor
    _WaitSet      = NULL;  // list of threads in the wait state
    _WaitSetLock  = 0 ;
    _Responsible  = NULL ;
    _succ         = NULL ;
    _cxq          = NULL ;  // singly linked list
    FreeNext      = NULL ;
    _EntryList    = NULL ;  // list of threads blocked waiting for the lock
    _SpinFreq     = 0 ;
    _SpinClock    = 0 ;
    OwnerIsThread = 0 ;
    _previous_owner_tid = 0;
  }

I also put this piece of C++ code in my open source project, so you can check it yourself.

ObjectMonitor is what sits underneath synchronized in the source code. If you are interested, take a look; everything I described above, along with concepts you often hear about, can be found there.
The familiar lock upgrade process is also there in the source: different implementations are called to acquire the lock, and when one fails, the next higher-level implementation is called, which completes the upgrade.
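The two queues in the ObjectMonitor constructor above have rough counterparts that can be observed from Java: a thread inside wait() reports the WAITING state, and a thread stuck at the entrance of a synchronized block reports BLOCKED. The sketch below polls Thread.getState() to observe both. The mapping to _WaitSet and _EntryList is my reading of the field comments, and the names are hypothetical.

```java
// Sketch: observing a waiting thread and a blocked thread on the same monitor.
final class MonitorQueues {
    // Returns { state of the thread in wait(), state of the thread blocked on entry }.
    static Thread.State[] observeStates() {
        final Object lock = new Object();
        final boolean[] released = {false}; // guarded by lock

        Thread waiter = new Thread(() -> {
            synchronized (lock) {
                while (!released[0]) {
                    try {
                        lock.wait(); // gives up the monitor and waits to be notified
                    } catch (InterruptedException e) {
                        return;
                    }
                }
            }
        });
        Thread blocker = new Thread(() -> {
            synchronized (lock) {
                // only needs to get in
            }
        });

        try {
            waiter.start();
            Thread.State waiting;
            do {
                Thread.sleep(1);
                waiting = waiter.getState();
            } while (waiting != Thread.State.WAITING);

            Thread.State blocked;
            synchronized (lock) {
                blocker.start(); // cannot enter while this thread holds the monitor
                do {
                    Thread.sleep(1);
                    blocked = blocker.getState();
                } while (blocked != Thread.State.BLOCKED);
                released[0] = true;
                lock.notifyAll();
            }
            waiter.join();
            blocker.join();
            return new Thread.State[] {waiting, blocked};
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }
}
```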

3. Heavyweight lock

When you read the ObjectMonitor source code, you will find low-level functions such as Atomic::cmpxchg_ptr and Atomic::inc_ptr, and threads are suspended and woken with park() and unpark().

Suspending and waking a thread involves switching between user mode and kernel mode, and that switch is expensive. That is why operations like spin locks exist. You might think a busy loop would cost even more, but it does not have to, as you will see below.
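The park() and unpark() used inside HotSpot are C++ internals, but the JDK exposes an analogous pair in java.util.concurrent.locks.LockSupport. Here is a minimal sketch of suspending a thread and waking it, with hypothetical class and method names:

```java
import java.util.concurrent.locks.LockSupport;

// Sketch: suspending a thread with park() and waking it with unpark().
final class ParkDemo {
    // Returns true if the parked thread woke up and finished within the timeout.
    static boolean parkedThreadWakes() {
        Thread sleeper = new Thread(() -> LockSupport.park());
        sleeper.start();
        // If unpark runs first, it leaves a permit and the later park() returns at once.
        LockSupport.unpark(sleeper);
        try {
            sleeper.join(5_000);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
        return !sleeper.isAlive();
    }
}
```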

3.1 What are user mode and kernel mode?

You probably came across the architecture of Linux at university: it is divided into user space (where applications run) and the kernel.

All of our programs run in user space, in what is called user mode. Many operations, however, need the kernel. When we do I/O, for example, we enter kernel mode.
The process is complicated and involves passing many values around. Here is a brief summary:

The user-mode program puts some data in registers, or builds the corresponding stack, to indicate which operating system service it needs.
The user-mode program executes a system call (system calls are the smallest functional unit of the operating system).
The CPU switches to kernel mode and jumps to the corresponding memory location to execute instructions.
The system call handler reads the parameters we stored earlier and carries out the program's request.
When the call completes, the operating system switches the CPU back to user mode, returns the result, and execution continues with the next instruction.

So when people say synchronized was a heavyweight lock before 1.6, they are right, but what makes it heavy is the ObjectMonitor call path together with the complex machinery of the Linux kernel. It consumes a lot of system resources, so it is inefficient.
There are two other cases in which the CPU switches between user mode and kernel mode: exceptions and interrupts from peripheral devices.

4. Lock upgrade optimization

The inefficiency was well known, including to the JDK developers, so they optimized it. If you read the source code I mentioned, you will see that the optimization is actually quite simple, just a few more function calls, but the design is very clever.

Here is the lock upgrade path after the optimization.

Upgrade direction: no lock -> biased lock -> lightweight lock -> heavyweight lock

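Tip: Remember that this upgrade process is generally described as one-way: a lock moves up these levels under contention and does not step back down while threads are still competing for it. At the end I will explain what that means for usage scenarios.

Now that we have seen the upgrade path, let's look at how each step works.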

4.1 Biased lock

As mentioned earlier, the object header is made up of the Mark Word and the Klass pointer, and lock contention is really contention for the monitor that the object header points to. With biased locking, once a thread acquires the object's lock, the biased-lock flag bit in the Mark Word is set to 1, the object enters biased mode, and the ID of that thread is recorded in the Mark Word.

This step uses a CAS operation (an optimistic lock). Afterwards, each time the same thread enters, the virtual machine performs no further synchronization; it only checks that the thread ID in the Mark Word is its own. If a different thread comes along, its CAS fails, which means it failed to acquire the biased lock.

Biased locking is on by default from JDK 1.6 and off in 1.5. The flag is -XX:+UseBiasedLocking to enable it and -XX:-UseBiasedLocking to disable it. (JDK 15 and later disable it by default again and deprecate it.)
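The CAS on the thread ID can be illustrated with a toy model. This is only a sketch of the idea: a real Mark Word packs the thread ID together with other bits, and all names here are hypothetical.

```java
import java.util.concurrent.atomic.AtomicLong;

// Toy model of biasing a lock toward the first thread (not HotSpot's Mark Word layout).
final class ToyBiasedLock {
    private static final long UNBIASED = 0L; // assumption: real thread IDs are positive
    private final AtomicLong biasedThreadId = new AtomicLong(UNBIASED);

    // Returns true if the given thread holds the bias after this call.
    boolean tryAcquire(long threadId) {
        if (biasedThreadId.get() == threadId) {
            return true; // same thread again: a plain check, no CAS needed
        }
        // First thread to arrive wins the CAS; later threads fail and would trigger an upgrade.
        return biasedThreadId.compareAndSet(UNBIASED, threadId);
    }
}
```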

What if biased locking is disabled, or multiple threads compete for the biased lock?

4.2 Lightweight lock

This again involves the Mark Word. If the object is in the unlocked state, the JVM creates a space called a Lock Record in the current thread's stack frame, stores a copy of the lock object's Mark Word in it, and points the Lock Record's owner field at the lock object.

The JVM then uses CAS to try to update the object's Mark Word to a pointer to the Lock Record. If that succeeds, the thread has acquired the lock; the lock flag bits are changed and the synchronized code runs.

If it fails, the JVM checks whether the object's Mark Word already points into the current thread's stack frame. If so, the current thread already holds the lock on this object. Otherwise the lock is held by another thread: the lock is upgraded, its state is changed, and the waiting thread ends up blocked.
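The three outcomes described above (CAS succeeds, already owned, held by someone else) can be sketched with an AtomicReference standing in for the Mark Word. This is a toy model with made-up names, not the JVM's data layout.

```java
import java.util.concurrent.atomic.AtomicReference;

// Toy model of the lightweight-lock CAS (not HotSpot's implementation).
final class ToyLightweightLock {
    enum Outcome { ACQUIRED, ALREADY_OWNED, MUST_INFLATE }

    // Stands in for the Lock Record created in the locking thread's stack frame.
    static final class LockRecord {
        final Thread thread;
        Object displacedMark; // copy of the Mark Word taken before the CAS

        LockRecord(Thread thread) {
            this.thread = thread;
        }
    }

    private static final Object UNLOCKED_MARK = new Object();
    private final AtomicReference<Object> markWord = new AtomicReference<>(UNLOCKED_MARK);

    Outcome tryLock(Thread thread) {
        LockRecord record = new LockRecord(thread);
        record.displacedMark = UNLOCKED_MARK;
        // Try to swing the Mark Word from "unlocked" to a pointer to our record.
        if (markWord.compareAndSet(UNLOCKED_MARK, record)) {
            return Outcome.ACQUIRED;
        }
        Object current = markWord.get();
        if (current instanceof LockRecord && ((LockRecord) current).thread == thread) {
            return Outcome.ALREADY_OWNED; // Mark Word already points at this thread's record
        }
        return Outcome.MUST_INFLATE; // held by another thread: upgrade the lock
    }
}
```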

4.3 Spin lock

Didn't I say above that switching between user mode and kernel mode on Linux is expensive? That cost comes from suspending and waking threads. How can we reduce it?

By spinning: a thread that fails to get the lock keeps looping and retrying instead of being suspended, so it can take the lock as soon as it becomes available, up to a threshold number of attempts. The default spin count is 10, and it can be changed with -XX:PreBlockSpin.

If spinning fails, the lock is upgraded to a heavyweight lock and the thread waits to be woken, just as in 1.5.
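Bounded spinning can be sketched with an AtomicBoolean and a retry loop. This illustrates the idea rather than the JVM's spinning code; the names are hypothetical, and a false return stands for the point at which the JVM would fall back to the heavyweight path.

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Sketch: try to take a lock by spinning a limited number of times.
final class BoundedSpinLock {
    private final AtomicBoolean held = new AtomicBoolean(false);

    // Returns true if the lock was taken within maxSpins attempts;
    // false means the caller should stop spinning and block instead.
    boolean trySpin(int maxSpins) {
        for (int i = 0; i < maxSpins; i++) {
            if (held.compareAndSet(false, true)) {
                return true;
            }
        }
        return false;
    }

    void unlock() {
        held.set(false);
    }
}
```

Spinning pays off when locks are held briefly: a few wasted loop iterations cost less than suspending and waking a thread.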


That covers the background and inner workings of synchronized. Take some time to digest it.

Reference: "High Concurrency Programming", "Dark Horse Programmer's Handout", "In-Depth Understanding of JVM Virtual Machine"

5. Should I use synchronized or Lock?

Let's look at the differences:

synchronized is a keyword, implemented for us inside the JVM; Lock is an interface, a rich API at the JDK level.
synchronized releases the lock automatically; a Lock must be released manually (in a finally block).
synchronized cannot be interrupted while waiting; Lock can wait interruptibly or not.
With Lock you can find out whether the thread obtained the lock (tryLock); with synchronized you cannot.
synchronized can be applied to methods and code blocks; Lock can only guard code blocks.
With Lock you can use a read lock (ReadWriteLock) to make multi-threaded reads more efficient.
synchronized is an unfair lock; ReentrantLock lets you choose whether the lock is fair.
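Several of the points above can be seen in a short sketch using the JDK's ReentrantLock and ReentrantReadWriteLock. The class, fields, and method names are made up for illustration.

```java
import java.util.concurrent.locks.ReentrantLock;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Sketch: manual release, tryLock, fairness, and a read lock.
final class LockApiDemo {
    private final ReentrantLock lock = new ReentrantLock(true); // true = fair lock
    private final ReentrantReadWriteLock rw = new ReentrantReadWriteLock();
    private int counter;           // guarded by lock
    private String config = "v1";  // guarded by rw

    // Manual release: unlock() belongs in finally so it runs even if the body throws.
    int increment() {
        lock.lock();
        try {
            return ++counter;
        } finally {
            lock.unlock();
        }
    }

    // tryLock() reports whether the lock was obtained instead of waiting for it.
    boolean incrementIfFree() {
        if (!lock.tryLock()) {
            return false;
        }
        try {
            counter++;
            return true;
        } finally {
            lock.unlock();
        }
    }

    // Many readers may hold the read lock at the same time.
    String readConfig() {
        rw.readLock().lock();
        try {
            return config;
        } finally {
            rw.readLock().unlock();
        }
    }

    boolean isFair() {
        return lock.isFair();
    }
}
```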

One lives at the JVM level and the other at the JDK level. I think the biggest difference is whether we need the richer API, and beyond that it depends on the scenario.

For example, say I am Didi. There is a rush hour every morning, and my code uses synchronized heavily. What is the problem? If the locks were upgraded to heavyweight during the peak and stay that way afterwards, every later acquisition still pays the heavyweight cost, so efficiency drops. Wouldn't Lock be a better fit in that case?

The scenario must always be considered. Telling you outright that one is better would be nonsense, because technical discussion detached from the business has no value.
