10x faster startup: an in-depth analysis of Apache Dubbo's static compilation (Native Image) solution

Author: Hua Zhongming

Article Summary:

This article is based on a talk by Hua Zhongming, middleware technology expert at Youzan and Apache Dubbo PMC member. It is divided into five parts:

- GraalVM tackles the challenges Java applications face in the cloud era

- Dubbo enjoys the technical dividends brought by AOT

- Dubbo Native Image practices and examples

- The principles and thinking behind Dubbo integrating Native Image

- Dubbo's future plans for Native Image technology

GraalVM tackles the challenges Java applications face in the cloud era

Notable characteristics of the cloud computing era include:

  • Built on cloud infrastructure, applications can scale elastically in a fast, simple, and efficient way. For stateless applications in particular, new instances can easily be created from the same image, and surplus instances can just as easily be removed, which is what elastic scaling means.
  • With containerization, system resources are divided at a finer granularity, which improves resource utilization.
  • With cloud development platforms, application deployment becomes easier and application development more agile.

So in the era of cloud computing, what are the problems with Java applications?

  • Cold starts are slow. Starting a Java application involves JVM initialization, class loading, and other steps, which puts Java at a disadvantage in startup speed compared with other languages.
  • Warm-up takes too long, so the application cannot reach peak performance immediately. For example, if an application is sensitive to response time (RT) and has no warm-up mechanism, some interfaces will time out during a release.
  • System resource usage, such as memory and CPU, is high. When usage is too high, Java applications need higher-specification instances, which are carved out of large machines. The splitting leaves larger fragments behind and wastes resources.
  • Applications built in Java are heavyweight and need a JDK environment to run.

Java's disadvantages in the cloud era are most obvious in Serverless scenarios. Besides simplifying application development, the most important promise of Serverless is that developers can scale their services out elastically within seconds. Apart from the time spent on container scheduling and creating new Pods, the time needed for image download, application cold start, and service warm-up all affect how fast that scale-out can be.

The picture above shows Datadog's statistics on language usage for AWS Lambda, a FaaS product. Even though Java is still the most popular programming language, its share of Lambda usage is clearly lower than that of Node.js and Python.

GraalVM emerged in response to these problems of Java. The official website introduces it as follows: GraalVM can compile Java applications ahead of time into standalone binaries. These binaries are smaller, start up to 100 times faster, provide peak performance without warm-up, and use less memory and CPU than applications running on a Java Virtual Machine. Notice that this addresses every problem just mentioned: slow cold starts, the need to warm up after startup, and high memory and CPU usage. It even reduces the size of the binary, and a smaller binary means a smaller container image and less time spent downloading it.

GraalVM can be understood as a "superset" of OpenJDK. It contains a complete JDK distribution plus components such as the GraalVM Compiler, Native Image, and Truffle, and it supports capabilities such as polyglot (mixed-language) programming. This means Java applications can adopt these new capabilities and still run unchanged in a GraalVM environment.

This introduction contains a key technical point: ahead-of-time (AOT) compilation. In traditional Java development we rely on just-in-time (JIT) compilation: the Java source code we write is first compiled into .class files, and at run time the JVM converts the bytecode it identifies as hot spots into machine code to speed up the program. AOT instead converts bytecode into machine code at build time, so no hot bytecode has to be compiled at run time. GraalVM itself includes both JIT and AOT.

The picture on the left shows the complete life cycle of a standard Java application: JVM startup and initialization, then the start and invocation of the Java main function, then the application warm-up stage, then the stage where the application runs stably, and finally application shutdown. In the traditional JIT mode, getting from startup to the stable stage involves JVM initialization, class loading, interpreted execution, just-in-time compilation, and GC, and even in the stable stage the JIT compiler and the interpreter are still running.

AOT in GraalVM first eliminates JVM startup, which is the time shown in the red zone, and then eliminates the time spent in the interpreter and the JIT compiler. Given all these advantages, can AOT replace JIT? Not at present. As the picture on the right shows, AOT offers considerably faster startup, very low resource consumption, and a smaller package size, but JIT is still better in terms of peak throughput and maximum latency.

Dubbo enjoys the technical dividends brought by AOT

1. Multiple artifact forms

After writing a program's source code, we compile and package it to produce a target artifact that can start the Dubbo application. The first form, still the most common today, is the Jar package.

The second form comes with Spring Boot: its plug-ins can quickly build a container image, so you do not have to write a Dockerfile yourself or manually package the Jar into an image. This makes containerized deployment of Dubbo applications convenient.

The third form is new, added after Dubbo gained GraalVM support: the Native Executable.

The packaging commands for the three forms differ. The traditional Jar package uses the familiar mvn clean package, the second runs the Spring Boot plug-in, and the third runs the native-related plug-ins during the compilation phase. The new Native Executable can start the application without a JDK environment installed.

2. The startup time is greatly reduced

The Dubbo framework has been evolving for nearly ten years, and its capabilities keep growing. Meanwhile, the startup speed of Dubbo applications has long been a headache for us; integrating GraalVM Native Image gives us hope of solving it.

Two pictures are shown above. The first compares the startup time of an application that exposes only one Dubbo service when run as a Native Executable and as a Jar package. All data was measured on a macOS machine with 4 cores and 16 GB of memory (4c16g), with 10 runs per scenario. The startup time includes JVM startup: in the Jar package scenario, for example, it is measured from launching the application with java -jar until the application is ready.

As the comparison shows, the Native Executable starts 12.4 times faster than the Jar package. The second picture compares the startup time of an application that contains only one Dubbo service Consumer, where the Native Executable starts 11 times faster than the Jar package. After integrating GraalVM Native Image, Dubbo applications can truly start in milliseconds.

3. Reach peak performance immediately after startup

According to the chart, when both the Consumer and the Provider are Native Executables, the first call takes about one sixth of the time it takes when both are Jar packages. Dubbo applications can therefore reach peak performance immediately after startup. The short startup time and immediate peak performance also give Dubbo the opportunity to extend into Serverless and FaaS scenarios.

4. Memory consumption is greatly reduced

We also provide memory comparison statistics for the Provider and Consumer applications in the Native Executable and Jar package scenarios. In the first picture, the Jar package needs more than 200 MB of memory after startup, while the Native Executable occupies only about 60 MB, a reduction of roughly 3.5 times. In today's cost-cutting climate, lowering the memory usage of Java applications and saving system resources opens up a lot of possibilities.

Beyond these four technical dividends, Dubbo has kept improving related areas while integrating GraalVM Native Image.

First, ease of use has improved. In an RPC framework, service interfaces are defined by the business, and each of them needs corresponding Reachability Metadata for RPC calls to work. Dubbo generates this Reachability Metadata automatically for the annotation and XML configuration methods, so application developers do not have to configure it themselves.

Second, maintainability has improved. Dubbo developers and contributors no longer need to deliberately maintain the Reachability Metadata and Adaptive source code that the Dubbo framework requires. For example, a contributor who adds a new SPI interface for a new feature does not need to think about generating Adaptive source code for the native image scenario.

Third, multiple platforms are supported, including the familiar Linux, macOS, and Windows operating systems. Finally, Dubbo's capabilities work normally in native image scenarios, for example the Dubbo protocol and the Triple protocol in Dubbo 3.x.

Dubbo Native Image practices and examples

First, install GraalVM with its Native Image tool. I won't go into detail here; follow the official documentation to download and install it.

Then configure the plug-ins. The picture above shows three plug-ins, but only one of them is specific to Dubbo: the Dubbo Maven Plugin.

The first picture shows the Maven plug-in that the new version of Dubbo provides for GraalVM Native Image. The second shows Spring Boot's Maven plug-in, which runs Spring Boot's AOT processing; if you use Dubbo's API access method, you do not need to configure it. The third shows the Maven plug-in officially provided by GraalVM, which contains the logic for compiling and packaging the Native Executable.

The next step is to configure the required dependencies. Integrating Dubbo with Native Image requires two additional dependencies: dubbo-native and dubbo-config-spring6. As with the Spring Boot plug-in above, dubbo-config-spring6 is not needed if you do not use XML or annotation access. Its main purpose is to adapt to Spring 6. Dubbo remains compatible with lower versions of Spring and Spring Boot, as well as with JDK 8 and JDK 11, whereas Spring Boot 3.0 and Spring 6 raised the minimum JDK version to JDK 17, so this module currently provides the adaptation and compatibility for the higher Spring versions.

dubbo-native contains all of Dubbo's logic for generating source code and Reachability Metadata. It ensures that Dubbo applications can be compiled and packaged into a Native Executable and run normally in that form.

Apart from the configuration above, nothing else changes in how Dubbo is used, and developers can package an application as a Native Executable. The compilation commands are listed here. Compilation requires a GraalVM environment and the native-image tool, so check that the environment is set up correctly before compiling. After compilation, a Native Executable appears in the project's root directory; run it directly to start the application. The complete sample code is in dubbo-samples for anyone who wants to try it.

A complete code example is shown at the bottom. If you are interested, try compiling and packaging it to see the result.

The principles and thinking behind Dubbo integrating Native Image

Dubbo officially released Dubbo 3.0 in June 2021. Members of the Dubbo community had done an initial investigation of GraalVM Native Image and added preliminary support for it in one of the 3.0 iterations. At that time, however, it was only an experimental demo, with little consideration given to how users would use it in production or to the maintenance cost for Dubbo contributors. This first version of the support had four serious problems:

1. Adaptive source code had to be maintained in the Dubbo core repository

At that time, a tool was provided that let users generate Adaptive source code for SPI interfaces. It generated the code directly inside the application project rather than in the target directory, which intruded on the developer's application source code.

2. Dubbo had to maintain the full set of Reachability Metadata

Take the registry in Dubbo's core repository as an example, which supports both Zookeeper and Nacos. Even if an application used only Zookeeper and not Nacos, Dubbo maintained the full set of Reachability Metadata, so everything was packaged into the Native Executable. This inflated the executable and lengthened compilation and packaging. In addition, whenever Dubbo contributors added a new feature, they had to maintain the corresponding Reachability Metadata by hand. If they forgot, the feature would not work under native image and could even cause compilation and packaging to fail.

3. Only API access was supported, not XML or annotation access

Most Dubbo users still use the XML and annotation access methods, so in practice most users could not use this capability.

4. The scope of dubbo-native-plugin was too narrow

A new Maven plug-in, dubbo-native-plugin, was added at that time for native-related compilation and packaging. Its scope was too narrow: if every new Dubbo feature required users to adopt yet another plug-in, that would hinder the later iteration and enhancement of Dubbo's Maven plug-ins, and would make the plug-ins troublesome for Dubbo users to use and maintain.

This experimental Native Image support received no further updates in the subsequent 3.1 release.

With the release of Spring 6 and Spring Boot 3.0 in November 2022, in which GraalVM Native Image support was a highlight, the Dubbo community realized that Dubbo's integration with GraalVM Native Image needed to enter a new stage.

We found that maintaining this content by hand was very labor-intensive and put more pressure on code review, since reviewers had to recognize in advance which changes might require Reachability Metadata. After several versions, we found that omissions during coding and code review were common. In version 3.2, released in April 2023, we rethought and rebuilt the support for GraalVM Native Image, solving the problems of the earlier experimental version:

  1. The compilation phase automatically identifies the required SPI interfaces and generates the Adaptive source code. Following the standard directory layout of Java build output, this source code is generated in the target directory. It is no longer maintained in the Dubbo core repository but generated dynamically at compile time.

  2. The Reachability Metadata required by the Dubbo framework is generated automatically during the compilation phase, so developers no longer need to maintain this configuration.

  3. dubbo-maven-plugin has been added, with the aim of replacing dubbo-native-plugin. We believe dubbo-maven-plugin should be the only Maven plug-in the Dubbo framework ships to developers, and every new feature that needs a Maven plug-in should go into it. This reduces the mental burden on developers.

Another important change in Dubbo 3.2 is support for, and compatibility with, Spring 6 and Spring Boot 3. This lays the foundation for providing native image capabilities in the XML and annotation access methods. Dubbo 3.3, to be released at the end of this year, will support native image for the XML and annotation access methods.

One term came up many times above: Reachability Metadata. Dubbo's integration with GraalVM Native Image revolves mainly around handling Reachability Metadata.

So what exactly is it? AOT has its own limitation: it follows the closed-world assumption. In other words, it must be able to "see all bytecode" at build time to work correctly, so by default AOT cannot support Java's dynamic features, such as JNI, reflection, dynamic proxies, and classpath resource loading.

These dynamic capabilities are used in all kinds of scenarios in Java development, and Java developers rely on them as a matter of habit. GraalVM takes this into account and solves the problem through Reachability Metadata, without breaking the closed-world assumption: since all bytecode and resources must be determined at compile time, developers declare this metadata during the coding phase.
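As a minimal sketch of why reflection conflicts with the closed-world assumption, consider the code below. The class names ClosedWorldDemo and Greeter are hypothetical and exist only for illustration. The class to load is passed in as a string, so an analysis of invokeByName at build time cannot know which class must be kept in the executable; this is the kind of gap that reflection metadata is meant to fill.

```java
import java.lang.reflect.Method;

public class ClosedWorldDemo {

    // Hypothetical class used only for illustration.
    public static class Greeter {
        public String greet(String name) {
            return "Hello, " + name;
        }
    }

    // The class and method names arrive as strings at run time, so a
    // build-time analysis of this method cannot tell which class it touches.
    public static Object invokeByName(String className, String methodName, String arg) {
        try {
            Class<?> type = Class.forName(className);
            Object instance = type.getDeclaredConstructor().newInstance();
            Method method = type.getMethod(methodName, String.class);
            return method.invoke(instance, arg);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Reflection failed for " + className, e);
        }
    }

    public static void main(String[] args) {
        // On the JVM this works for any suitable class on the classpath.
        String className = args.length > 0 ? args[0] : Greeter.class.getName();
        System.out.println(invokeByName(className, "greet", "Dubbo"));
    }
}
```

On a regular JVM the lookup succeeds for any class on the classpath that has a no-argument constructor and a matching method. In a native executable, a class that is only ever named through a run-time string generally has to be declared in reflection metadata first.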

This picture lists the five main types of Reachability Metadata in use today: JNI Metadata, Resource Metadata, Dynamic Proxy Metadata, Serialization Metadata, and Reflection Metadata. The official website also lists a sixth type, Predefined Classes Metadata; because it requires configuring the hash of a class's complete bytecode, it is better suited to use with the Tracing Agent. What developers care about most is how to obtain and provide this Reachability Metadata so that the Native Executable builds and runs successfully.

First, GraalVM provides the Tracing Agent to help developers collect Reachability Metadata at run time. The collected metadata is not guaranteed to be complete, however, because the Tracing Agent only tracks code that is actually executed; code paths not exercised by the program's input are not collected. The GraalVM official website also recommends manually checking the metadata after collection.

Second, GraalVM provides the Reachability Metadata Repository. Over the years the Java ecosystem has produced many component libraries, which business developers use to implement business functions. Pure business logic seldom uses Java's dynamic features; the component libraries use them far more often. GraalVM therefore provides the Reachability Metadata Repository to collect Reachability Metadata for different components.

For example, Netty is a widely used network communication framework that uses reflection internally, and its Reachability Metadata can be found in this repository. How can applications use the metadata in this repository? Through the native-maven-plugin officially provided by GraalVM.

Now that we understand Reachability Metadata, which Reachability Metadata is relevant to Dubbo? This picture lists some of the corresponding scenarios. The type Dubbo uses most is reflection.

The first scenario is services. Dubbo is an RPC framework, so defining service interfaces is the most basic requirement, and operations such as obtaining interface methods through reflection happen very frequently at run time. Internal and external service interfaces are distinguished here because the Dubbo framework has some built-in services, such as MetricService and MetadataService.
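The sketch below illustrates, under simplifying assumptions, why service interfaces need reflection metadata: a provider receives a method name and arguments, looks the method up on the service interface through reflection, and invokes it on the implementation. DemoService, DemoServiceImpl, and dispatch are hypothetical names for illustration; this is not Dubbo's actual dispatch code.

```java
import java.lang.reflect.Method;

public class ServiceDispatchDemo {

    // Hypothetical business service interface and implementation.
    public interface DemoService {
        String sayHello(String name);
    }

    public static class DemoServiceImpl implements DemoService {
        @Override
        public String sayHello(String name) {
            return "Hello, " + name;
        }
    }

    // Simplified provider-side dispatch: find the interface method by name and
    // argument count through reflection, then invoke it on the implementation.
    public static Object dispatch(Object target, Class<?> serviceInterface,
                                  String methodName, Object... args) {
        for (Method method : serviceInterface.getMethods()) {
            if (method.getName().equals(methodName)
                    && method.getParameterCount() == args.length) {
                try {
                    return method.invoke(target, args);
                } catch (ReflectiveOperationException e) {
                    throw new IllegalStateException("Invocation failed: " + methodName, e);
                }
            }
        }
        throw new IllegalArgumentException("No such method: " + methodName);
    }

    public static void main(String[] args) {
        Object result = dispatch(new DemoServiceImpl(), DemoService.class, "sayHello", "Dubbo");
        System.out.println(result);
    }
}
```

Because the method name only becomes known when a request arrives, the methods of every exported service interface have to remain reflectively accessible in the native executable.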

The second scenario is SPI extension classes and Adaptive classes. Dubbo's powerful and flexible extensibility comes from its own SPI mechanism, in which the implementation classes of SPI interfaces and the Adaptive classes are loaded through reflection. The SPI extension implementations also include classes implemented by the business itself, such as Filter, the most familiar and widely used extension point in the Dubbo call chain. Dubbo AOT can scan and load business-defined implementation classes as well.
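To make the link between SPI and reflection concrete, here is a deliberately simplified loader. It only mirrors the idea of mapping an extension name to an implementation class name (name=class lines) and instantiating the class reflectively. SimpleExtensionLoader, Filter, and LoggingFilter are hypothetical names defined in the sketch; it is not Dubbo's ExtensionLoader and omits features such as Adaptive classes, wrappers, and caching.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

public class SimpleExtensionLoader {

    // Hypothetical extension point and one business-defined implementation.
    public interface Filter {
        String invoke(String request);
    }

    public static class LoggingFilter implements Filter {
        @Override
        public String invoke(String request) {
            return "[logged] " + request;
        }
    }

    // Resolves an extension name to a class name from "name=class" lines,
    // then creates the implementation through reflection.
    public static <T> T load(Class<T> type, String config, String name) {
        Properties mapping = new Properties();
        try {
            mapping.load(new StringReader(config));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        String className = mapping.getProperty(name);
        if (className == null) {
            throw new IllegalArgumentException("No extension named " + name);
        }
        try {
            Object instance = Class.forName(className).getDeclaredConstructor().newInstance();
            return type.cast(instance);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Cannot create extension " + className, e);
        }
    }

    public static void main(String[] args) {
        String config = "logging=" + LoggingFilter.class.getName();
        Filter filter = load(Filter.class, config, "logging");
        System.out.println(filter.invoke("ping"));
    }
}
```

The implementation class appears only as text in a configuration resource, which is why both the resource and the class need to be made known to the native image build.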

The third category covers classes that must be loaded in advance when starting multiple instances.

The fourth category is Dubbo's core configuration classes, such as ServiceConfig and RegistryConfig, which will be familiar to anyone who has used the API access method.

Finally, there are other reflection behaviors, including those in components that Dubbo depends on, such as reflection in Zookeeper (ZK).

The second type is Resource Metadata. Dubbo involves the following four resource files. Everyone is familiar with the three resource files under META-INF: when using Dubbo's SPI, we must configure the extension implementations there to ensure that they are loaded. The fourth is the security resource file, which contains the allow and deny lists required for serialization. It can prevent some serialization RCE vulnerabilities and is new in Dubbo 3 to enhance service security.

The third type is Serialization Metadata. For Dubbo, this mainly concerns the return types and request parameter types of internal and external service methods. As an RPC framework, Dubbo's basic capability is making RPC calls, and during a call both requests and responses must be serialized and deserialized.
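The round trip that a request parameter goes through can be sketched as follows. JDK built-in serialization is used here only as a stand-in and is not meant to represent the serialization protocol a Dubbo application is configured with; HelloRequest is a hypothetical parameter type.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

public class SerializationDemo {

    // Hypothetical request parameter type of a service method.
    public static class HelloRequest implements Serializable {
        private static final long serialVersionUID = 1L;
        private final String name;

        public HelloRequest(String name) {
            this.name = name;
        }

        public String getName() {
            return name;
        }
    }

    // Caller side: turn the request object into bytes for the wire.
    public static byte[] serialize(Serializable value) {
        try (ByteArrayOutputStream bytes = new ByteArrayOutputStream();
             ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(value);
            out.flush();
            return bytes.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Receiver side: rebuild the object. The class is resolved by name from
    // the byte stream, which is why serialized types need metadata.
    public static Object deserialize(byte[] data) {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
            return in.readObject();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        HelloRequest copy = (HelloRequest) deserialize(serialize(new HelloRequest("Dubbo")));
        System.out.println(copy.getName());
    }
}
```

Since deserialization creates objects from class names found in the data, the parameter and return types of service methods are the natural candidates for serialization metadata.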

The fourth type is Dynamic Proxy Metadata. In Dubbo, the Consumer needs to generate a dynamic proxy class for the remote service interface. The proxy hides the details of network transmission, serialization, and similar behavior, so that the caller can use the remote service like a local method call. The last type is JNI, for which Dubbo currently has no relevant metadata.
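A consumer-side proxy can be sketched with the JDK's java.lang.reflect.Proxy. This is an illustration under stated assumptions: DemoService and refer are hypothetical names, and the handler fabricates a reply string instead of performing a real network call. It does not reflect which proxy mechanism a given Dubbo application uses.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Proxy;
import java.util.Arrays;

public class ConsumerProxyDemo {

    // Hypothetical remote service interface.
    public interface DemoService {
        String sayHello(String name);
    }

    // Creates a proxy for the interface. A real consumer would serialize the
    // call and send it over the network; this handler only builds a fake
    // reply, so it is suitable solely for methods that return String.
    public static <T> T refer(Class<T> serviceInterface) {
        InvocationHandler handler = (proxy, method, args) ->
                "remote:" + method.getName() + Arrays.toString(args);
        Object instance = Proxy.newProxyInstance(
                serviceInterface.getClassLoader(),
                new Class<?>[] {serviceInterface},
                handler);
        return serviceInterface.cast(instance);
    }

    public static void main(String[] args) {
        DemoService service = refer(DemoService.class);
        // Looks like a local call, but is routed through the handler.
        System.out.println(service.sayHello("Dubbo"));
    }
}
```

The proxy class is generated at run time from the list of interfaces, so in a native executable that interface list typically has to be declared in dynamic proxy metadata ahead of time.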

This is the summary of Dubbo-related Reachability Metadata and the strategy for handling it. We divide it into four categories.

1. Rule-based content: content that follows fixed rules and has to be generated, such as the Adaptive source code mentioned earlier.

2. Deterministic resources and behaviors: the resources Dubbo itself needs in the native scenario, such as SPI configuration files.

3. Non-deterministic resources and behaviors: resources or behaviors that the business adds on top of Dubbo's capabilities, such as business-defined SPI extension implementations and service definitions.

4. Integrated and dependent components: for example, the Zookeeper-related metadata mentioned earlier.

Both Spring and Dubbo have their own AOT processing logic, and the two differ somewhat. This is the processing logic of Spring AOT. After source compilation, Spring starts the application directly from the main function, generates the source code for Spring beans, and generates the corresponding Reachability Metadata. For Spring, all metadata can be discovered during this startup and scanning process.

The following is the bean source code generated by Spring AOT for a class that simply uses Spring's Service annotation. The implementation class of DemoService is defined as a bean in Spring, and after AOT processing a corresponding BeanDefinition provider class for DemoService is generated, as shown on the right. It is used to obtain the relevant information when beans are loaded.

Dubbo's AOT starts a scanning process after source compilation to produce the Dubbo-related Reachability Metadata and source code listed above. After we replace Spring's Service annotation with Dubbo's DubboService annotation, we again get a corresponding BeanDefinition provider class for DemoService, but its content is provided by Dubbo, including the assembled interface and other parameters. Here are the results of Spring AOT processing and Dubbo AOT processing: the Reachability Metadata required by GraalVM and the respective source code are generated under spring-aot and dubbo-aot.

The picture above shows the AOT output produced by Spring and the AOT output produced by Dubbo. The Dubbo part below contains some Adaptive source code. Finally, Native Image reads all of this configuration when it runs.

This is the boundary between Dubbo AOT and Spring AOT as AOT evolves further. First, with API access Dubbo does not depend on Spring, so Dubbo AOT produces everything itself, including the required Adaptive source code and Reachability Metadata. Second, because XML and annotation access both depend on Spring, the beans in Dubbo are handled by Spring AOT, including ServiceBean, ReferenceBean, and other Spring bean-related content in the Dubbo framework. Spring also generates the Reachability Metadata it requires.

Dubbo’s future plans for Native Image technology

1. Improve developer experience & development efficiency

The first point is to improve developer experience and development efficiency. Since 3.0, Dubbo has provided a command-line tool (CTL), scaffolding, and an IDEA plug-in. Their Native Image support is still under construction and will be added later. Documentation for Dubbo Native will also be built out gradually.

2. Performance optimization and improvement

The second point is performance optimization. Beyond the capabilities GraalVM itself provides, we can add some class-related reachability configuration, which makes the final binary smaller and shortens compile time.

3. Cover more components

The third point is to cover more components, because many components are not yet supported. Our current plan is to first complete support for the extensions in the Dubbo main repository and then support the extensions in dubbo-spi-extensions. We will also contribute the reachability metadata required by the Dubbo kernel to GraalVM's Reachability Metadata Repository, so that business developers can use it directly.

Finally, for covering more components, we prefer official GraalVM support. However, the GraalVM release cycle is outside our control and progress may take a long time, so we will also prioritize providing metadata for some common and essential components ourselves, letting Dubbo users enjoy the technical dividends of Native Image sooner.


Origin my.oschina.net/u/3874284/blog/10117931