JVM2: In-depth understanding of the working mechanism of ClassLoader

One. What is ClassLoader?

ClassLoader, as the name implies, is a class loader. Its functions:

  • Loads classes into the JVM
  • Records which loader loaded each class (the parent-first hierarchical loading mechanism)
  • Parses the class bytecode into the object format required by the JVM

Lazy loading

The JVM does not load every class it needs at startup; it loads them on demand, which is called lazy loading. As the program runs, it gradually encounters new classes it does not yet know, and it calls a ClassLoader to load them. Once loading completes, the Class object is cached by that ClassLoader, so the class does not need to be loaded again next time.

For example, when you call a static method of a class, that class must be loaded first, but its instance fields are not touched. The classes of those instance fields therefore do not need to be loaded yet. The classes of its static fields may be loaded, however, because static methods can access static fields. The classes of the instance fields may not be loaded until you instantiate an object.
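
The on-demand behavior can be observed from plain Java through static initializers, which run when a class is first actively used. The following is a minimal sketch; the class names LazyLoadingDemo, Config and Unused are made up for illustration. Strictly speaking it shows lazy initialization, the last step of the loading process that is visible from Java code.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo: class and method names are illustrative only.
class LazyLoadingDemo {
    // Records the order of events so it can be inspected afterwards.
    static final List<String> EVENTS = new ArrayList<>();

    static class Config {
        static { EVENTS.add("Config initialized"); }  // runs on first active use
        static String name() { return "demo"; }
    }

    static class Unused {
        static { EVENTS.add("Unused initialized"); }  // never runs in this program
    }

    static List<String> run() {
        EVENTS.add("run started");
        Config.name();                 // calling a static method triggers initialization
        EVENTS.add("run finished");
        return new ArrayList<>(EVENTS);
    }

    public static void main(String[] args) {
        run().forEach(System.out::println);
    }
}
```

Config is initialized only after run() has started, and Unused, which nothing references at run time, is never initialized at all.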

Class loading timing and process

A class's life cycle runs from the moment it is loaded into the virtual machine's memory until it is unloaded from it, and consists of seven stages: loading, verification, preparation, resolution, initialization, use, and unloading. Of these, verification, preparation and resolution are collectively referred to as linking.

Two. The three ClassLoaders provided by Java by default

A running JVM instance contains multiple ClassLoaders, and different ClassLoaders load bytecode from different places: different file directories, different jar files, or different service addresses on the network. Three important ClassLoaders are built into the JVM: BootstrapClassLoader, ExtensionClassLoader and AppClassLoader.

1.BootstrapClassLoader

Responsible for loading the core classes of the JVM runtime. In JDK 8 and earlier these classes are located in the JAVA_HOME/lib/rt.jar file. Our commonly used built-in libraries java.xxx.* are all in it, such as java.util.*, java.io.*, java.nio.*, java.lang.*, etc. This ClassLoader is special: it is implemented in native code inside the JVM rather than in Java, and we call it the "root loader".

2.ExtensionClassLoader

Responsible for loading JVM extension classes, such as the built-in JavaScript engine (Nashorn) and some security providers. Their jar packages are located in JAVA_HOME/lib/ext/*.jar, and there are many of them. Note that a javax prefix alone does not imply this loader: in JDK 8, javax.swing and the XML parsers ship in rt.jar and are loaded by BootstrapClassLoader.

3.AppClassLoader

It is the loader that faces us users directly. It loads the jar packages and directories on the paths defined in the CLASSPATH environment variable (or the -classpath option). The code we write and the third-party jar packages we use are usually loaded by it.

For jar packages and class files served by static file servers on the network, the JDK has a built-in URLClassLoader. Users only need to pass a well-formed URL to the constructor and can then use URLClassLoader to load remote class libraries. URLClassLoader can load class libraries from local paths as well as remote ones, depending on the form of the address passed to the constructor. In JDK 8, ExtensionClassLoader and AppClassLoader are both subclasses of URLClassLoader, and both load class libraries from the local file system.

AppClassLoader can be obtained through the static method getSystemClassLoader() of the ClassLoader class. It is what we call the "system class loader", and the code we write is usually loaded by it. When the main method is executed, the loader of that first user class is AppClassLoader.
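
To see these loaders on your own machine, you can walk the parent chain starting from the system class loader. This is a small sketch; the class name LoaderChainDemo and the helper chain() are hypothetical. The printed names depend on the JDK version, so no output is shown here.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo: prints the system class loader and its ancestors.
class LoaderChainDemo {
    // Collects a loader and all of its ancestors. The root loader is
    // represented by null, so it never appears in the list.
    static List<ClassLoader> chain(ClassLoader loader) {
        List<ClassLoader> result = new ArrayList<>();
        for (ClassLoader cl = loader; cl != null; cl = cl.getParent()) {
            result.add(cl);
        }
        return result;
    }

    public static void main(String[] args) {
        for (ClassLoader cl : chain(ClassLoader.getSystemClassLoader())) {
            System.out.println(cl);
        }
        // Core classes report null because the root loader has no Java object.
        System.out.println(String.class.getClassLoader());
    }
}
```

The last line prints null: classes loaded by the root loader return null from getClassLoader().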

Three. ClassLoader transitivity

When a running program encounters an unknown class, which ClassLoader does it choose to load it? The virtual machine's strategy is to use the ClassLoader of the caller's Class object. What is the caller's Class object? When the unknown class is encountered, the virtual machine must be executing a method call (static or instance), and the class that declares this method is the caller's Class object. As mentioned earlier, each Class object has a classLoader property that records which loader loaded it.

Because of this transitivity, every lazily loaded class ends up being handled by the ClassLoader that loaded the class whose main method was called first, which is AppClassLoader.

Four. Parental delegation

Earlier we mentioned that AppClassLoader is only responsible for loading class libraries on the Classpath. When it encounters a system class that has not been loaded, AppClassLoader must hand the loading over to BootstrapClassLoader and ExtensionClassLoader. This is what we usually call "parental delegation".

1. The architecture of ClassLoader:

 

When AppClassLoader is asked to load an unknown class name, it does not search the Classpath immediately. It first hands the class name to ExtensionClassLoader. If ExtensionClassLoader can load the class, AppClassLoader does not have to bother; otherwise it searches the Classpath.

When ExtensionClassLoader is asked to load an unknown class name, it does not search the ext path immediately. It first hands the class name to BootstrapClassLoader. If BootstrapClassLoader can load the class, ExtensionClassLoader does not have to bother; otherwise it searches the jar packages in the ext path.

These three ClassLoaders form a cascading parent-child relationship. Each ClassLoader is lazy: it leaves the work to its parent whenever it can and only does the work itself when the parent cannot. Each ClassLoader object has a parent property pointing to its parent loader. If some loader in the chain can load the class, loading succeeds; otherwise a ClassNotFoundException is thrown.

Different ClassLoaders also cooperate with each other, through the parent property and the parental delegation mechanism. The parent has the higher loading priority. In addition, parent expresses a sharing relationship: when multiple child ClassLoaders share the same parent, the classes contained in the parent can be considered shared by all of the children. This is why BootstrapClassLoader is regarded as the ancestor of all class loaders, since the JVM core class library should naturally be shared.
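
The parent-first order described above can be sketched as a small loader of our own. This is a simplified illustration, not the JDK's actual source; the class name DelegationSketch, the trace list and the canLoad() helper are assumptions made for the example, and the sketch requires a non-null parent.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Objects;

// A simplified sketch of parent-first loading; not the JDK's actual source.
class DelegationSketch extends ClassLoader {
    // Records each delegation step so the order can be inspected.
    final List<String> trace = new ArrayList<>();

    DelegationSketch(ClassLoader parent) {
        super(Objects.requireNonNull(parent));
    }

    @Override
    protected Class<?> loadClass(String name, boolean resolve) throws ClassNotFoundException {
        synchronized (getClassLoadingLock(name)) {
            Class<?> c = findLoadedClass(name);          // 1. already loaded through this loader?
            if (c == null) {
                try {
                    trace.add("ask parent: " + name);
                    c = getParent().loadClass(name);     // 2. let the parent try first
                } catch (ClassNotFoundException e) {
                    trace.add("parent failed: " + name);
                    c = findClass(name);                 // 3. only then look for it ourselves
                }
            }
            if (resolve) {
                resolveClass(c);
            }
            return c;
        }
    }

    // Convenience wrapper that turns the checked exception into a boolean.
    boolean canLoad(String name) {
        try {
            loadClass(name);
            return true;
        } catch (ClassNotFoundException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        DelegationSketch loader = new DelegationSketch(ClassLoader.getSystemClassLoader());
        System.out.println(loader.canLoad("java.lang.String"));
        System.out.println(loader.canLoad("no.such.Type"));
        loader.trace.forEach(System.out::println);
    }
}
```

Because findClass() is not overridden here, it throws ClassNotFoundException, so a name that no ancestor knows ends in the exception mentioned above.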

2. Why do we need a mechanism such as parental delegation? Wouldn't it be simpler and more direct to load everything with my own loader?

Mainly for security. Suppose I write a custom ClassLoader that loads my own java.lang.String into memory and ship it to customers. If they then store a password in a String object, my class could secretly send the password to me, which is not safe.

With parental delegation this problem does not arise. When the custom ClassLoader is asked to load String, it first asks the loaders above it whether the class is already loaded; the copy loaded above is returned directly and nothing is reloaded.

At this point a contrarian may object: "The code is written by me, so can't I just record the password when the user types it in, or copy the database directly?" That is a problem with the application code itself, one that earns its author a pair of silver bracelets (handcuffs) and free meals, and it has nothing to do with JVM security.
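
Besides delegation, the JDK has a second line of defense: defineClass() refuses class names that start with "java.". The sketch below tries exactly the attack described above. The names ProhibitedPackageDemo, EvilLoader and attempt() are hypothetical, and the empty byte array relies on the assumption (true for OpenJDK) that the name is checked before the bytes are parsed.

```java
// Hypothetical demo of the "java." name check in ClassLoader.defineClass.
class ProhibitedPackageDemo {
    static class EvilLoader extends ClassLoader {
        Class<?> tryDefine(String name) {
            // Assumption: the name check happens before the bytes are parsed,
            // so an empty array is enough to reach it.
            byte[] fake = new byte[0];
            return defineClass(name, fake, 0, fake.length);
        }
    }

    // Reports how the JVM reacted to the attempt.
    static String attempt(String name) {
        try {
            new EvilLoader().tryDefine(name);
            return "defined";
        } catch (SecurityException e) {
            return "SecurityException";   // prohibited package name
        } catch (LinkageError e) {
            return "LinkageError";        // e.g. ClassFormatError for the empty bytes
        }
    }

    public static void main(String[] args) {
        System.out.println(attempt("java.lang.String"));
        System.out.println(attempt("com.example.Fake"));
    }
}
```

For "java.lang.String" the attempt is rejected with a SecurityException, while for an ordinary package the empty bytes simply fail to parse as a class file.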

How to break parental delegation, why break it?

3.Class.forName

When we are using the jdbc driver, we often use the Class.forName method to dynamically load the driver class.

Class.forName("com.mysql.cj.jdbc.Driver");


The principle is that MySQL's Driver class contains a static code block, which is executed when the Driver class is loaded and initialized. This static block registers a MySQL driver instance with the global JDBC driver manager.

class Driver {
  static {
    try {
       java.sql.DriverManager.registerDriver(new Driver());
    } catch (SQLException E) {
       throw new RuntimeException("Can't register driver!");
    }
  }
  ...
}


The forName method also uses the ClassLoader of the caller's Class object to load the target class. But forName also has a multi-parameter version that lets you specify which ClassLoader to use:

Class<?> forName(String name, boolean initialize, ClassLoader cl)


This form of forName breaks through the limitations of the built-in loaders: with a custom class loader, we can freely load class libraries from any other source. By the transitivity of ClassLoader, other classes referenced by the target class will also be loaded with the custom loader.
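
The three-argument form also controls whether the static code block runs, through its initialize parameter. Here is a minimal sketch of the difference; ForNameDemo, Plugin and observe() are made-up names. A class is initialized only once per loader, so the sketch remembers its first observation.

```java
// Hypothetical demo: forName with initialize=false loads without running static blocks.
class ForNameDemo {
    static boolean pluginInitialized = false;
    private static boolean[] firstObservation;

    static class Plugin {
        static { pluginInitialized = true; }  // plays the role of a driver's static block
    }

    // Returns {initialized after forName(name, false, loader),
    //          initialized after forName(name)}.
    static synchronized boolean[] observe() {
        if (firstObservation == null) {
            try {
                String name = Plugin.class.getName();  // a class literal does not initialize
                ClassLoader loader = ForNameDemo.class.getClassLoader();

                Class.forName(name, false, loader);    // load only
                boolean afterLoadOnly = pluginInitialized;

                Class.forName(name);                   // load and initialize
                boolean afterDefault = pluginInitialized;

                firstObservation = new boolean[] { afterLoadOnly, afterDefault };
            } catch (ClassNotFoundException e) {
                throw new IllegalStateException(e);
            }
        }
        return firstObservation.clone();
    }

    public static void main(String[] args) {
        System.out.println(java.util.Arrays.toString(observe()));  // [false, true]
    }
}
```

This is why the JDBC idiom uses the one-argument form: it is the initialization, not the mere loading, that runs the registration code.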

Five. Custom Loader

ClassLoader has three important methods: loadClass(), findClass() and defineClass().

The loadClass() method is the entry point for loading the target class. It first checks whether the current ClassLoader has already loaded the target class. If not, it lets the parent try to load it, and the parent applies the same steps recursively. If the parent cannot load it, loadClass() calls findClass() so that the custom loader loads the target class itself. The findClass() method of ClassLoader needs to be overridden by subclasses; different loaders use different logic to obtain the bytecode of the target class. Once the bytecode is obtained, defineClass() is called to convert it into a Class object.

import com.mashibing.jvm.Hello;

import java.io.ByteArrayOutputStream;
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;

public class T007_MSBClassLoaderWithEncription extends ClassLoader {

    // XOR key; applying it twice restores the original byte
    public static int seed = 0B10110110;

    @Override
    protected Class<?> findClass(String name) throws ClassNotFoundException {
        File f = new File("c:/test/", name.replace('.', '/').concat(".msbclass"));

        // try-with-resources closes both streams even if reading fails
        try (FileInputStream fis = new FileInputStream(f);
             ByteArrayOutputStream baos = new ByteArrayOutputStream()) {
            int b;

            // read() returns -1 at end of stream (0 is a valid byte value)
            while ((b = fis.read()) != -1) {
                baos.write(b ^ seed);
            }

            byte[] bytes = baos.toByteArray();
            return defineClass(name, bytes, 0, bytes.length);
        } catch (IOException e) {
            throw new ClassNotFoundException(name, e);
        }
    }

    public static void main(String[] args) throws Exception {

        encFile("com.mashibing.jvm.Hello");

        ClassLoader l = new T007_MSBClassLoaderWithEncription();
        // Parent-first: if Hello.class is also on the classpath, AppClassLoader
        // loads it and findClass() above is never reached
        Class<?> clazz = l.loadClass("com.mashibing.jvm.Hello");
        Hello h = (Hello) clazz.getDeclaredConstructor().newInstance();
        h.m();

        System.out.println(l.getClass().getClassLoader());
        System.out.println(l.getParent());
    }

    // Writes an XOR-"encrypted" copy of the .class file next to the original
    private static void encFile(String name) throws Exception {
        File f = new File("c:/test/", name.replace('.', '/').concat(".class"));
        // replace() is literal; replaceAll(".", "/") would treat "." as a regex
        File out = new File("c:/test/", name.replace('.', '/').concat(".msbclass"));

        try (FileInputStream fis = new FileInputStream(f);
             FileOutputStream fos = new FileOutputStream(out)) {
            int b;

            while ((b = fis.read()) != -1) {
                fos.write(b ^ seed);
            }
        }
    }
}


A custom class loader should not break the parental delegation rules lightly, so do not override loadClass() casually; otherwise the custom loader may be unable to load the built-in core class library. When using a custom loader, be clear about who its parent loader is and pass the parent in through the subclass constructor. If the parent class loader is null, the parent is the "root loader".

// ClassLoader constructor (the variant with a name parameter exists since JDK 9)
protected ClassLoader(String name, ClassLoader parent);


Parental delegation may span three levels, four levels or more, depending on the parent loader you use; the request is always delegated recursively up to the root loader.
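
The effect of the parent you pass to the constructor can be checked with a loader that adds no loading logic of its own. In this sketch the names ParentChoiceDemo, PlainLoader and canLoad() are hypothetical; one instance gets the application loader as parent, the other gets null, that is, the root loader.

```java
// Hypothetical demo: the same loader class behaves differently depending on its parent.
class ParentChoiceDemo {
    // Adds nothing of its own; everything it can load comes from its ancestors.
    static class PlainLoader extends ClassLoader {
        PlainLoader(ClassLoader parent) {
            super(parent);
        }
    }

    static boolean canLoad(ClassLoader loader, String name) {
        try {
            loader.loadClass(name);
            return true;
        } catch (ClassNotFoundException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        ClassLoader withApp = new PlainLoader(ParentChoiceDemo.class.getClassLoader());
        ClassLoader withRoot = new PlainLoader(null);   // null parent = root loader
        String self = ParentChoiceDemo.class.getName();

        System.out.println(canLoad(withApp, self));                 // true
        System.out.println(canLoad(withRoot, self));                // false
        System.out.println(canLoad(withRoot, "java.lang.String"));  // true
    }
}
```

With a null parent the loader still sees the core classes, but application classes are out of its reach unless findClass() supplies them.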

Class.forName vs ClassLoader.loadClass

Both methods can be used to load the target class. There is a small difference between them: Class.forName() can return the Class of an array type (such as "[I" for int[]), while ClassLoader.loadClass() throws a ClassNotFoundException for it.

Class<?> x = Class.forName("[I");
System.out.println(x);

x = ClassLoader.getSystemClassLoader().loadClass("[I");
System.out.println(x);

---------------------
class [I

Exception in thread "main" java.lang.ClassNotFoundException: [I
...

Diamond dependence

There is a well-known concept in project management called the "diamond dependency": two dependencies of a project each depend on a different version of the same software package, so both versions need to coexist without conflicting.

 

 


Maven, which we usually use, handles the diamond dependency by choosing one of the conflicting versions. If the versions are poorly compatible, the program may fail to compile or run correctly. This form of dependency management in Maven is called "flat" dependency management.

 

ClassLoader can solve the diamond dependency problem. Different versions of a software package are loaded by different ClassLoaders, and classes with the same name in different ClassLoaders are actually different classes. Let us try a simple example with URLClassLoader, whose default parent loader is AppClassLoader.

$ cat ~/source/jcl/v1/Dep.java
public class Dep {
    public void print() {
        System.out.println("v1");
    }
}

$ cat ~/source/jcl/v2/Dep.java
public class Dep {
    public void print() {
        System.out.println("v2");
    }
}

$ cat ~/source/jcl/Test.java
import java.net.URL;
import java.net.URLClassLoader;

public class Test {
    public static void main(String[] args) throws Exception {
        String v1dir = "file:///Users/qianwp/source/jcl/v1/";
        String v2dir = "file:///Users/qianwp/source/jcl/v2/";
        URLClassLoader v1 = new URLClassLoader(new URL[]{new URL(v1dir)});
        URLClassLoader v2 = new URLClassLoader(new URL[]{new URL(v2dir)});

        Class<?> depv1Class = v1.loadClass("Dep");
        Object depv1 = depv1Class.getConstructor().newInstance();
        depv1Class.getMethod("print").invoke(depv1);

        Class<?> depv2Class = v2.loadClass("Dep");
        Object depv2 = depv2Class.getConstructor().newInstance();
        depv2Class.getMethod("print").invoke(depv2);

        System.out.println(depv1Class.equals(depv2Class));
    }
}

Before running, we need to compile the dependent libraries:

$ cd ~/source/jcl/v1
$ javac Dep.java
$ cd ~/source/jcl/v2
$ javac Dep.java
$ cd ~/source/jcl
$ javac Test.java
$ java Test
v1
v2
false


In this example, even if the two URLClassLoaders point to the same path, the following expression is still false, because the same bytecode loaded by different ClassLoaders is not considered the same class.

depv1Class.equals(depv2Class)
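
The same point can be reproduced without separate directories. The sketch below (Java 9+ because of readAllBytes) defines one nested class twice from identical bytes. SameBytesDemo and IsolatingLoader are hypothetical names, and the loader deliberately skips delegation for that single class, which is one small way of breaking parental delegation. It assumes the compiled .class file is reachable as a resource on the classpath.

```java
import java.io.IOException;
import java.io.InputStream;

// Hypothetical demo: identical bytes, two loaders, two distinct classes.
class SameBytesDemo {
    public static class Dep { }

    // Defines the target class itself; everything else is delegated as usual.
    static class IsolatingLoader extends ClassLoader {
        private final String target;

        IsolatingLoader(String target) {
            super(SameBytesDemo.class.getClassLoader());
            this.target = target;
        }

        @Override
        protected Class<?> loadClass(String name, boolean resolve) throws ClassNotFoundException {
            if (!name.equals(target)) {
                return super.loadClass(name, resolve);   // normal parent-first path
            }
            synchronized (getClassLoadingLock(name)) {
                Class<?> c = findLoadedClass(name);
                if (c == null) {
                    byte[] bytes = readClassBytes(name);
                    c = defineClass(name, bytes, 0, bytes.length);
                }
                if (resolve) {
                    resolveClass(c);
                }
                return c;
            }
        }

        // Reads the compiled class file through the parent loader's resources.
        private byte[] readClassBytes(String name) throws ClassNotFoundException {
            String resource = name.replace('.', '/') + ".class";
            try (InputStream in = getParent().getResourceAsStream(resource)) {
                if (in == null) {
                    throw new ClassNotFoundException(name);
                }
                return in.readAllBytes();
            } catch (IOException e) {
                throw new ClassNotFoundException(name, e);
            }
        }
    }

    // Returns {same name?, same Class object?, same as the normally loaded Dep?}.
    static boolean[] compare() {
        try {
            String name = Dep.class.getName();
            Class<?> a = new IsolatingLoader(name).loadClass(name);
            Class<?> b = new IsolatingLoader(name).loadClass(name);
            return new boolean[] { a.getName().equals(b.getName()), a == b, a == Dep.class };
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(java.util.Arrays.toString(compare()));
    }
}
```

The names match, yet neither copy is identical to the other or to the Dep that AppClassLoader loaded: a class's identity is its name plus its defining loader.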


We can also make two different versions of the Dep class implement the same interface, which can avoid using reflection to call the methods in the Dep class.

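// IPrint must be loaded by a loader that both v1 and v2 delegate to (for example AppClassLoader)
Class<?> depv1Class = v1.loadClass("Dep");
IPrint depv1 = (IPrint) depv1Class.getConstructor().newInstance();
depv1.print();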


Although ClassLoader can solve the dependency conflict problem, it restricts how the packages can be used: their operations must be invoked dynamically, through reflection or a shared interface. Maven has no such restriction because it relies on the virtual machine's default lazy loading strategy. If no custom ClassLoader is explicitly used at run time, AppClassLoader is used from beginning to end, whereas different versions of a same-named class must be loaded by different ClassLoaders. That is why Maven cannot solve the diamond dependency perfectly.

If you want to know if there is an open source package management tool that can solve the diamond dependency, I recommend you to learn about sofa-ark, which is an open source lightweight class isolation framework of Ant Financial.

 

Thread.contextClassLoader

If you read a little of the Thread source code, you will find a very special field among its instance fields:

class Thread {
  ...
  private ClassLoader contextClassLoader;

  public ClassLoader getContextClassLoader() {
    return contextClassLoader;
  }

  public void setContextClassLoader(ClassLoader cl) {
    this.contextClassLoader = cl;
  }
  ...
}


contextClassLoader is the "thread context class loader". What exactly is it?

First of all, contextClassLoader is a class loader that must be used explicitly. The virtual machine never consults it on its own when resolving classes. You can use it explicitly as follows:

Thread.currentThread().getContextClassLoader().loadClass(name);


This means that if you load the target class with forName(String name), contextClassLoader is not used automatically. Classes that are lazily loaded because of code dependencies are not loaded with contextClassLoader automatically either.

Secondly, a thread's contextClassLoader is inherited from its parent thread, that is, the thread that created it. The contextClassLoader of the main thread at program startup is AppClassLoader. So unless it is set manually, the contextClassLoader of every thread is AppClassLoader.
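
The inheritance from the creating thread is easy to verify. In this sketch, ContextInheritanceDemo and MarkerLoader are hypothetical names; the marker loader exists only so that the inherited value can be told apart from the default.

```java
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical demo: a child thread sees the context loader of its creator.
class ContextInheritanceDemo {
    // An otherwise ordinary loader, used only as a recognizable marker.
    static class MarkerLoader extends ClassLoader {
        MarkerLoader(ClassLoader parent) {
            super(parent);
        }
    }

    static boolean childInherits() {
        Thread current = Thread.currentThread();
        ClassLoader original = current.getContextClassLoader();
        ClassLoader marker = new MarkerLoader(original);
        current.setContextClassLoader(marker);
        try {
            AtomicReference<ClassLoader> seen = new AtomicReference<>();
            Thread child = new Thread(
                () -> seen.set(Thread.currentThread().getContextClassLoader()));
            child.start();
            child.join();
            return seen.get() == marker;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } finally {
            current.setContextClassLoader(original);   // leave the thread as we found it
        }
    }

    public static void main(String[] args) {
        System.out.println(childInherits());   // true
    }
}
```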

So what exactly is contextClassLoader used for? To explain its purpose we need the principle of division of labor and cooperation among class loaders mentioned earlier.

It can share classes across threads, as long as they share the same contextClassLoader. The contextClassLoader will be automatically passed between the parent and child threads, so sharing will be automated.

If different threads use different contextClassLoader, then the classes used by different threads can be isolated.

If we partition by business line, with each line using its own thread pool, threads within a pool share one contextClassLoader while different pools use different contextClassLoaders. This provides good isolation and avoids class version conflicts.
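
One way to wire this up is a ThreadFactory that stamps each pool thread with the pool's loader. The sketch below is only an illustration: PoolIsolationDemo, poolWithLoader() and the "orders" and "billing" business names are made up, and the empty URLClassLoaders stand in for loaders that would point at different jar versions in a real system.

```java
import java.net.URL;
import java.net.URLClassLoader;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.ThreadFactory;

// Hypothetical demo: one context class loader per thread pool.
class PoolIsolationDemo {
    static ExecutorService poolWithLoader(ClassLoader loader) {
        ThreadFactory factory = task -> {
            Thread t = new Thread(task);
            t.setContextClassLoader(loader);   // every thread of this pool shares it
            t.setDaemon(true);
            return t;
        };
        return Executors.newFixedThreadPool(2, factory);
    }

    // Asks a pool thread which context loader it sees.
    static ClassLoader loaderSeenIn(ExecutorService pool) {
        Callable<ClassLoader> task = () -> Thread.currentThread().getContextClassLoader();
        try {
            return pool.submit(task).get();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e);
        }
    }

    static boolean poolsAreIsolated() {
        ClassLoader parent = PoolIsolationDemo.class.getClassLoader();
        // Placeholders: real loaders would list different jar versions here.
        ClassLoader ordersLoader = new URLClassLoader(new URL[0], parent);
        ClassLoader billingLoader = new URLClassLoader(new URL[0], parent);
        ExecutorService orders = poolWithLoader(ordersLoader);
        ExecutorService billing = poolWithLoader(billingLoader);
        try {
            return loaderSeenIn(orders) == ordersLoader
                && loaderSeenIn(billing) == billingLoader;
        } finally {
            orders.shutdown();
            billing.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(poolsAreIsolated());   // true
    }
}
```

Code running in a pool still has to ask for the loader explicitly, for example with Thread.currentThread().getContextClassLoader().loadClass(name).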

If we do not customize the contextClassLoader, all threads will use AppClassLoader by default, and all classes will be shared.

 

The contextClassLoader of a thread is rarely used. If the logic above seems obscure, don't worry too much.

JDK 9 changed the structural design of class loaders when it added the module system, but the principle remains similar: a class loader acts as a container for classes, provides class isolation, and relies on the parental delegation mechanism to establish cooperation between different class loaders.

Origin blog.csdn.net/zhaofuqiangmycomm/article/details/113825099