Robust 2.0: An upgraded version of the hotfix framework that supports Android R8

In 2016, we gave a detailed introduction to the technical principles of Robust, Meituan's Android hotfix solution. In recent years, Google has launched R8, a new code optimization and obfuscation tool. Because producing an Android hotfix patch relies on comparing the second build with the online package, the patch tooling has to be adapted from ProGuard to R8 ahead of time. This article shares how Robust was adapted to R8, along with some ideas and lessons from optimizing and improving it. We hope it helps or inspires you.

1. Background
2. Main challenges
3 Solutions
- 3.1 Introduction to the overall plan
- 3.2 Problems and solutions
4 Summary
5 Authors

1. Background

Meituan Robust is a real-time hotfix framework based on method instrumentation. Its main advantages are that patches take effect immediately and that it is compatible with all Android versions without any hooks. In 2016, we described the technical principles in detail in the article "Android Hot Update Solution Robust": an IF branch is inserted into every method so that the code path can be switched dynamically, which is what makes the hotfix possible. The framework has two core parts: code instrumentation and automatic patch generation.

Code instrumentation: with Javassist and ASM in widespread use, this part is relatively mature, and iteration mainly targets the size and performance of the instrumented code.
Automatic patch generation: this part has kept evolving in practice. Like other mainstream hotfix solutions in the industry, the automatic patch tool runs after ProGuard obfuscation. Since ProGuard optimizes and obfuscates the code, generating the patch after ProGuard reduces the complexity of patch generation.
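
To make the instrumentation idea concrete, here is a minimal sketch of what a method might look like after an IF branch has been inserted. The names PatchHook, patchHook, PriceCalculator, and PriceCalculatorPatch are hypothetical and chosen only for illustration; the field and interface names that Robust actually generates may differ.

```java
// Hypothetical hook type; Robust's real interface and field names may differ.
interface PatchHook {
    boolean isSupport(String methodName, Object[] args);

    Object accessDispatch(String methodName, Object[] args);
}

class PriceCalculator {
    // Inserted by instrumentation: stays null until a patch is loaded.
    static volatile PatchHook patchHook;

    int total(int unitPrice, int count) {
        PatchHook hook = patchHook;
        // Inserted IF branch: redirect to the patch when one is installed.
        if (hook != null && hook.isSupport("total", new Object[] {unitPrice, count})) {
            return (Integer) hook.accessDispatch("total", new Object[] {unitPrice, count});
        }
        // Original (buggy) logic: forgets to multiply by count.
        return unitPrice;
    }
}

class PriceCalculatorPatch implements PatchHook {
    @Override
    public boolean isSupport(String methodName, Object[] args) {
        return "total".equals(methodName);
    }

    @Override
    public Object accessDispatch(String methodName, Object[] args) {
        // Repaired logic shipped in the patch.
        int unitPrice = (Integer) args[0];
        int count = (Integer) args[1];
        return unitPrice * count;
    }
}
```

Until a patch assigns patchHook, the extra branch costs one null check per call, which is why the size and performance of the inserted code are the main optimization targets.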

In recent years, Google has launched R8, a new code optimization and obfuscation tool, to replace the third-party tool ProGuard. After years of feature iteration and bug fixing, R8 can basically replace ProGuard functionally and even produces better results (the optimized Android bytecode is smaller). Google has made R8 mandatory in newer versions of the build tools, and many well-known apps in China and abroad have completed the R8 adaptation and shipped it. For example, WeChat for Android officially switched from ProGuard to R8 this year (by upgrading the Android build toolchain). Because producing an Android hotfix depends on comparing the second build with the online package, the tooling has to be adapted from ProGuard to R8 ahead of time. This article shares some of the ideas and experience of Meituan's platform technology department in adapting Robust to R8 and in optimizing and improving it.

2. Main challenges

The general production process of an Android hotfix patch is as follows: first, the fix is made on top of the online code and the app is built again; then the patch generation tool automatically compares the repaired package with the online package; finally, a lightweight patch package is produced. Two main problems therefore have to be solved during patch generation:

For code that has not changed, how to ensure that it stays consistent with the online package in the second build;
For the code repaired this time, how to accurately identify it and generate the patch code after compilation, optimization, and obfuscation.

To solve these two problems, you need to have a certain understanding of the Android compilation and build process, and figure out the cause of the problem. Figure 1 below is the build process of an Android project from source code to APK (Android application installation package) (the oval corresponds to the build toolchain):

Figure 1 The build process from source code to APK

Some tools in the above picture have been replaced by new tools, but the overall process has not changed much. With reference to this figure, let's analyze the stages that affect patch production and the second build:

Resource compiler (aapt/aapt2) : Resource compilation generates an R.java file (which records resource ids so they can be referenced from code). To cut down the number of R fields and reduce package size, large Android projects usually inline resource ids directly into the call sites during the build (this happens between javac and ProGuard). If resource ids differ between the two builds, the diff result is affected.
Code compiler (javac) : When javac compiles Java code into bytecode, besides some simple optimizations (such as constant expression folding and conditional compilation), it also performs some basic desugaring (of syntax features from before Java 8). This generates new classes/methods/instructions: for example, an anonymous inner class is compiled into a new class named like OuterClass$1.class, and bridge methods named like access$200 are generated. If a change involves inner classes or generics, the numbers after $ in the second build may no longer line up with the online package.
Code optimizer (ProGuard/R8) : At present the third-party open source tool ProGuard is mainly used (Google's R8 is intended to replace it). With 30+ optional optimization items, the Java bytecode produced earlier is further shrunk, optimized, and obfuscated, making the Android installation package smaller, more secure, and more efficient at runtime:

Shrinking : Unused classes/fields/methods are found through static analysis and removed, so a class/field/method that exists in the source code does not necessarily exist in the online package.
Optimization : The bytecode is optimized through a series of algorithms or templates so that the build output is smaller and runs more efficiently and safely. Techniques include merging classes/interfaces, inlining short methods, trimming method parameters, deleting unreachable branches, outlining code (new in R8), removing code without side effects (such as Log.d()), and changing method/variable visibility. Compared with the source code, the optimized bytecode may have fewer classes/fields/methods, different field/method access modifiers, different method signatures, and fewer instructions. In addition, the optimization result of the second build may be inconsistent with that of the online package.
Obfuscation : Classes/fields/methods are renamed to short, meaningless names, which makes reverse engineering harder and reduces package size. The second build must use the same obfuscation as the online package; otherwise the app will crash on an invalid call once the patch is loaded.

Desugar tool (not shown in the figure; older versions use the third-party Lambda/Desugar plug-ins, newer versions use the desugaring built into R8): Since older Android devices do not support Java 8+ syntax features, this step re-implements features such as Lambda expressions, method references, and default and static interface methods in a form that older versions can run. Lambda expressions are compiled into inner classes, which causes problems similar to those described for javac above.
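
The numbering issue mentioned for javac can be observed with a small sketch. The class name OrderPage is made up for illustration; the point is only that javac numbers anonymous inner classes by their order of appearance, so inserting a new one ahead of the others shifts every later number.

```java
class OrderPage {
    // First anonymous class in source order: javac names it OrderPage$1.
    static final Runnable FIRST = new Runnable() {
        @Override
        public void run() { }
    };

    // Second anonymous class in source order: javac names it OrderPage$2.
    // If a fix adds another anonymous class above FIRST, this one becomes OrderPage$3
    // even though its own code did not change.
    static final Runnable SECOND = new Runnable() {
        @Override
        public void run() { }
    };

    /** Returns the binary name the compiler assigned to the object's class. */
    static String nameOf(Object o) {
        return o.getClass().getName();
    }
}
```

Calling OrderPage.nameOf on the two fields shows the generated names, which is exactly the information a name-based diff would compare.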

So far, we have a basic understanding of the causes of the two problems raised at the beginning of this chapter. Compared with the source code, the bytecode produced by the Android build process changes "structurally" along the class/field/method/code dimensions. For example, a class/field/method called by the repair code may not exist in the online package (shrunk, merged, or inlined); a field/method that is accessible in the source code may be inaccessible from the patch (its modifier was changed to private); or a method's parameter list may not match (previously unused parameters were trimmed).

The optimization items provided by ProGuard are optional. Large Android projects usually weigh actual benefit, stability, and build time and then disable some of them, but not all. The second build will therefore differ somewhat from the online package, which affects the accuracy of patch generation. The earlier Robust patch production process often ran into such problems. Recognition accuracy could be improved through special character detection, whitelists, and similar measures, but that approach was not automatic enough. The Robust patch production process is as follows:

Figure 2 Robust patch production process

If the build toolchain (Android Gradle Plugin) of the Android project is upgraded to a newer official version, the two stages in the figure above, ProGuard (Java bytecode optimization and obfuscation) and Dex (Android bytecode generation), are merged into one and replaced by R8:

Figure 3 Two build streams

The upgrade and change of the above-mentioned build toolchain brings 2 new problems to Robust patch production:

There is no suitable point at which to make the patch. Switching change identification from JAR-based to DEX-based or Smali-based would amount to replacing the whole patch production scheme. The former has to work with the DEX file format and instructions, while the latter has to deal with a large number of registers, which is more error-prone; neither offers good enough compatibility and stability.
ProGuard allows some optimization options to be disabled, but the official R8 documentation clearly states that disabling individual optimizations is not supported, which leads to more differences than before and interferes with change identification.

3 Solutions

3.1 Introduction to the overall plan

The idea behind R8-based patch generation is to move change identification to the Java bytecode before it is optimized and obfuscated, and at the same time to use a structural analysis of the online APK (class/field/method) to correct the calls from the patch code to the online code, which yields patch.jar. Finally, R8 is used to obfuscate patch.jar (applymapping), desugar it, generate Dex, and package the result as patch.apk. The complete process is shown in the figure below:

Figure 4 complete process

3.2 Problems and solutions

3.2.1 R8 and Proguard optimization comparison

Some ProGuard configuration items stop working after switching to R8. The official R8 documentation explains: as R8 continues to improve, maintaining a standard optimization behavior helps the Android Studio team easily troubleshoot and resolve any issues you may encounter.

Figure 5 R8 official explanation

Even now, many reports of problems caused by R8 optimizations can be found online, and there is no public documentation that describes the optimization rules and how to disable them. The only way to compare the optimization rules of the two tools is to read the official ProGuard documentation and the R8 source code. The R8 source code shows that some rules can be disabled through hidden build parameters, through reflection, or by modifying the R8 source directly. Although R8's optimization rules do not map one-to-one onto ProGuard's, roughly the same optimization behavior as with ProGuard can be achieved.

com.android.tools.r8.utils.InternalOptions.enableEnumUnboxing
com.android.tools.r8.utils.InternalOptions.enableVerticalClassMerging
com.android.tools.r8.utils.InternalOptions.enableClassInlining
com.android.tools.r8.utils.InternalOptions.inlinerOptions().enableInlining // method inlining
com.android.tools.r8.utils.InternalOptions.outline.enabled // method outlining
com.android.tools.r8.utils.InternalOptions.testing.disableMarkingMethodsFinal
com.android.tools.r8.utils.InternalOptions.testing.disableMarkingClassesFinal

Some rules can be switched on or off with a build parameter such as -Dcom.android.tools.r8.disableMarkingMethodsFinal; for options without such a parameter, a simple modification like the following also works:

Figure 6 Transformation method
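
As a rough illustration of the reflection route mentioned above, the following sketch flips a boolean option by field name. DemoOptions is a stand-in written for this example and is not R8's InternalOptions class; the property name demo.disableMarkingMethodsFinal is likewise made up. Which fields exist in a given R8 version has to be checked against that version's source.

```java
import java.lang.reflect.Field;

// Stand-in for an options object; NOT R8's real InternalOptions class.
class DemoOptions {
    boolean enableClassInlining = true;

    // Mirrors the style of a flag that is driven by a -D system property (assumed name).
    boolean disableMarkingMethodsFinal =
            System.getProperty("demo.disableMarkingMethodsFinal") != null;
}

class OptionToggler {
    /** Sets a boolean field by name via reflection; returns false if that is not possible. */
    static boolean setFlag(Object options, String fieldName, boolean value) {
        try {
            Field field = options.getClass().getDeclaredField(fieldName);
            field.setAccessible(true);
            field.setBoolean(options, value);
            return true;
        } catch (ReflectiveOperationException e) {
            // Field missing or not assignable: report it instead of failing the build step.
            return false;
        }
    }
}
```

Returning false for an unknown field keeps the build from breaking when an option is renamed or removed in a later tool version.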

What if you do not want to disable these rules in a project? In the previous patch production process, the accuracy of change identification could be affected. In the new process, change identification is not affected, but after identification the external calls in the patch have to be checked against the online APK to confirm that they are legal. On closer analysis, these optimization rules fall into four categories: class, field, method, and code. The ones with the greatest impact on Robust patch production are method inlining, parameter removal, and methods being marked private; the following sections describe how each is handled.

3.2.2 Identification of "true" and "false" changes

If the source code contains an anonymous inner class, javac compiles it into a class named {outer class name}${number}, where the number is assigned sequentially according to the order in which the anonymous inner classes appear in the outer class.

When the repair code adds or deletes an anonymous inner class, classes can no longer be matched by name alone. (This is why the documentation of some hotfix frameworks that use the class as the smallest unit contains statements such as "adding anonymous inner classes is not supported" or "anonymous inner classes may only be added at the end of the outer class".) In this case Robust matches the trailing number fuzzily and then compares bytecode to find the anonymous inner classes that really changed, separating real changes from fake ones.
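
The fuzzy matching described above can be sketched as follows. This is a simplified illustration rather than Robust's actual implementation: AnonymousClassDiff and its methods are hypothetical, and the map values stand for a fingerprint of each class body that is assumed to have been computed with the class's own name normalized out.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

class AnonymousClassDiff {
    /** Replaces a trailing anonymous-class number, e.g. "Page$2" becomes "Page$#". */
    static String fuzzyName(String className) {
        return className.replaceAll("\\$\\d+$", "\\$#");
    }

    /**
     * Returns the classes in newClasses whose body matches no old class with the same
     * fuzzy name. Keys are class names, values are body fingerprints.
     */
    static Set<String> realChanges(Map<String, String> oldClasses,
                                   Map<String, String> newClasses) {
        // Group the old bodies by fuzzy name so that renumbered classes can still be found.
        Map<String, Set<String>> oldBodies = new HashMap<>();
        for (Map.Entry<String, String> e : oldClasses.entrySet()) {
            oldBodies.computeIfAbsent(fuzzyName(e.getKey()), k -> new HashSet<>())
                    .add(e.getValue());
        }
        Set<String> changed = new TreeSet<>();
        for (Map.Entry<String, String> e : newClasses.entrySet()) {
            Set<String> candidates = oldBodies.get(fuzzyName(e.getKey()));
            // No identical body under the same fuzzy name: this is a real change.
            if (candidates == null || !candidates.contains(e.getValue())) {
                changed.add(e.getKey());
            }
        }
        return changed;
    }
}
```

With an old build containing Page$1 and Page$2 and a new build in which a fresh anonymous class was inserted first, only the fresh class is reported; the two renumbered classes are recognized as fake changes because their bodies are unchanged.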

In addition, when nested classes access each other's private fields/methods, javac generates bridge methods named like access$100 and access$200. The number after access$ (which depends on the order of appearance) also affects change identification (R8 eventually changes the modifier to public and deletes the bridge method). The solution is similar to the one above for identifying real inner class changes.

Another situation is worth noting. Larger Android projects are usually componentized, and each component takes part in building and packaging the app as an AAR. During the binary release of a component (source code -> AAR), R8 can be used for desugaring (for Android) to produce Java 7 bytecode. A typical example is the Lambda expression: desugaring generates classes such as {outer class}$$ExternalSyntheticLambda{number} (possibly with multiple numbers such as $2$1) and a static method in the outer class named lambda${method name}${number} (naming rules differ between desugarers). The patch generation tool handles these in a similar way.
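
One way to apply the same fuzzy treatment to compiler-generated names is to normalize the numeric parts before comparing. The sketch below is an assumption about how that could be done, not the tool's real code: SyntheticNames is a hypothetical helper, and the patterns only cover the naming shapes quoted in this section (other desugarers may need other patterns).

```java
import java.util.regex.Pattern;

class SyntheticNames {
    // Matches the class suffix $$ExternalSyntheticLambda followed by a number.
    private static final Pattern LAMBDA_CLASS =
            Pattern.compile("\\$\\$ExternalSyntheticLambda\\d+$");
    // Matches lambda$<method name>$<number>.
    private static final Pattern LAMBDA_METHOD =
            Pattern.compile("^lambda\\$(.+)\\$\\d+$");
    // Matches access$<number> bridge methods.
    private static final Pattern ACCESSOR =
            Pattern.compile("^access\\$\\d+$");

    /** Replaces the trailing number of a synthetic lambda class name with '#'. */
    static String normalizeClass(String name) {
        return LAMBDA_CLASS.matcher(name).replaceAll("\\$\\$ExternalSyntheticLambda#");
    }

    /** Replaces the trailing number of a lambda or accessor method name with '#'. */
    static String normalizeMethod(String name) {
        if (ACCESSOR.matcher(name).matches()) {
            return "access$#";
        }
        return LAMBDA_METHOD.matcher(name).replaceAll("lambda\\$$1\\$#");
    }
}
```

Names that share a normalized form become candidates for the bytecode comparison; names that match none of the patterns are returned unchanged and compared as usual.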

The code changes finally identified therefore include not only the methods whose source changed and any new methods/classes, but also the related bytecode generated by javac desugaring and the bytecode generated by R8 desugaring during the binary release of components.

3.2.3 Inline identification and processing

As described in chapter 2, the online code is optimized and obfuscated after being compiled by javac. Therefore, if the code changes identified by the bytecode comparison above (at the class/method level) call into online code, you also need to make sure that these field/method calls are "legal" to avoid runtime crashes.

Among the many optimization items, the main concern is whether a class/field/method exists and is accessible. If it does not exist in the online package (removed or inlined during the last build), it has to be added as a new class/method during patch generation. If it cannot be accessed from outside in the online package (public was changed to private during the last build), the direct call has to be changed to a reflective call during patch generation. If the method signature in the online package has changed (parameters were trimmed during the last build), the call has to be modified or the method added as a new one.
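
For the second case, the following sketch shows what replacing a direct call with a reflective call amounts to. OnlineOrder and OnlineOrderPatch are made-up names, and the patch tool would generate the equivalent bytecode rather than source like this.

```java
import java.lang.reflect.Method;

class OnlineOrder {
    // Assume the optimizer changed this method from public to private in the online package.
    private String describe(int id) {
        return "order-" + id;
    }
}

class OnlineOrderPatch {
    /** The patch cannot call describe() directly any more, so it goes through reflection. */
    static String callDescribe(OnlineOrder target, int id) {
        try {
            Method method = OnlineOrder.class.getDeclaredMethod("describe", int.class);
            method.setAccessible(true);
            return (String) method.invoke(target, id);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("online method not found or not callable", e);
        }
    }
}
```

In a real patch the method name passed to getDeclaredMethod would have to be the obfuscated name used in the online package.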

Dex files differ fundamentally from standard class files in structure (the Dex tool merges all class files into one or several Dex files so that classes can share data and the file layout is more compact), so the two cannot be compared directly. The detection works as follows: first analyze the external references of the patch classes with ASM, then parse the Dex files in the APK with the dexlib2 library and extract the structural class/field/method information (which requires de-obfuscation), and finally analyze and resolve compatibility.
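
The final compatibility step can be pictured as a small decision function. The sketch below works on plain data that stands in for what ASM and dexlib2 would provide; CallCompatibility, OnlineMethod, and the Action values are hypothetical names, and overloaded methods are ignored to keep the example short.

```java
import java.util.Map;

class CallCompatibility {
    enum Action { KEEP_DIRECT_CALL, ADD_AS_NEW_METHOD, USE_REFLECTION, REWRITE_CALL_SIGNATURE }

    /** Minimal view of a method found in the online APK after de-obfuscation. */
    static final class OnlineMethod {
        final String descriptor;
        final boolean accessible;

        OnlineMethod(String descriptor, boolean accessible) {
            this.descriptor = descriptor;
            this.accessible = accessible;
        }
    }

    /**
     * Decides how a call from the patch to "owner.name" with the given descriptor
     * should be handled, based on the methods present in the online package.
     */
    static Action decide(Map<String, OnlineMethod> online,
                         String owner, String name, String descriptor) {
        OnlineMethod method = online.get(owner + "." + name);
        if (method == null) {
            return Action.ADD_AS_NEW_METHOD;        // removed or inlined online
        }
        if (!method.descriptor.equals(descriptor)) {
            return Action.REWRITE_CALL_SIGNATURE;   // parameters were trimmed online
        }
        if (!method.accessible) {
            return Action.USE_REFLECTION;           // public became private online
        }
        return Action.KEEP_DIRECT_CALL;
    }
}
```

A production tool would key the lookup by name and descriptor together and apply the same checks to fields and classes.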

R8 outlining is an advanced optimization with very strict trigger conditions, and it needs to be used sensibly in the right circumstances. It extracts code that is identical across multiple methods into a new method to reduce code size, at the cost of one extra method call. If the code you want to fix is an outlined method, simply treat the outlined method as a new method.

3.2.4 Obfuscation problems and optimization

Unlike the earlier approach of applying the mapping to the whole project during the second build, only the few changed classes need ApplyMapping here, so the probability of inconsistent obfuscation is much lower. During Robust patch generation, only the changed classes are passed to ProGuard for a second obfuscation pass, and the mapping file of the online package is applied automatically:

-applymapping {mapping.txt of the online package}

However, in some special cases, such as deleting an old method while adding a new one, or a defect in ApplyMapping itself, the obfuscation in the patch can still be inconsistent with the online obfuscation. After the patch is generated, it therefore has to be compared against the online APK and verified. If wrong obfuscated names are found, the patch is decompiled into Smali and the names are replaced at the text level.
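
A simplified version of such a verification could compare class-level entries of two mapping files. The sketch below assumes the usual mapping.txt layout, in which a class line has the form "original.Name -> obfuscated.name:" and member lines are indented; MappingCheck is a hypothetical helper, and member-level checks are left out for brevity.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class MappingCheck {
    /** Parses the class-level lines of a mapping file into original name -> obfuscated name. */
    static Map<String, String> parseClassMappings(List<String> lines) {
        Map<String, String> result = new LinkedHashMap<>();
        for (String line : lines) {
            // Member lines are indented and comment lines start with '#'; skip both.
            if (line.isEmpty() || line.startsWith("#")
                    || Character.isWhitespace(line.charAt(0))) {
                continue;
            }
            int arrow = line.indexOf(" -> ");
            if (arrow < 0 || !line.endsWith(":")) {
                continue;
            }
            result.put(line.substring(0, arrow),
                    line.substring(arrow + 4, line.length() - 1));
        }
        return result;
    }

    /** Returns the original class names whose obfuscated name in the patch differs from online. */
    static List<String> mismatches(Map<String, String> online, Map<String, String> patch) {
        List<String> bad = new ArrayList<>();
        for (Map.Entry<String, String> e : patch.entrySet()) {
            String expected = online.get(e.getKey());
            if (expected != null && !expected.equals(e.getValue())) {
                bad.add(e.getKey());
            }
        }
        return bad;
    }
}
```

Classes that appear only in the patch are not reported, since new classes have no online name to conflict with.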

3.2.5 Other aspects of optimization

(1) super instruction

In Android development, the invoke-super instruction is often used when overriding a system method while keeping part of the logic of the parent method. Take the onCreate method of the Activity class as an example:

import android.app.Activity;
import android.os.Bundle;

public class MyActivity extends Activity {
    @Override
    protected void onCreate(Bundle savedInstanceState) {
        super.onCreate(savedInstanceState); // call the parent class's onCreate method
    }
}

Here super.onCreate(savedInstanceState) is a typical super call; after Dex compilation, the invoke-super instruction is visible at the Smali level. In a patch class, however, it is impossible to write something like myActivity.super.onCreate(savedInstanceState), because super can only be used inside the original class. Even if the call is forced in with bytecode tooling, it fails at runtime with java.lang.NoSuchMethodError, because the patch class is not a subclass of the class that declares the target method.

To emulate the JVM's invoke-super instruction, an auxiliary class that inherits from the parent of the repaired class has to be generated for each patch class (this solves the problem that a super call can only be made from the target subclass), and the call to patch.onCreate is forwarded to origin.super.onCreate of the original class. In the early days Robust did this at the Smali level: Dex had to be converted to Smali, processed, and converted back to Dex. Processing the class bytecode directly with ASM, without the detour through Smali, is more convenient. The key code of the ASM conversion for this auxiliary class is as follows:

import org.objectweb.asm.MethodVisitor;
import org.objectweb.asm.Opcodes;

public class SuperMethodVisitor extends MethodVisitor {
    ...
    @Override
    public void visitMethodInsn(int opcode, String owner, String name, String desc, boolean itf) {
        if (opcode == Opcodes.INVOKEVIRTUAL) {
            // Replace the INVOKEVIRTUAL instruction with INVOKESPECIAL
            super.visitMethodInsn(Opcodes.INVOKESPECIAL, owner, name, desc, itf);
        } else {
            super.visitMethodInsn(opcode, owner, name, desc, itf);
        }
    }

    @Override
    public void visitVarInsn(int opcode, int var) {
        if (opcode == Opcodes.ALOAD && var == 0) {
            // Load local variable 1 instead of this, so the super call is made on the original class
            mv.visitVarInsn(opcode, 1);
            return;
        }
        mv.visitVarInsn(opcode, var);
    }
    ...
}

The approach above relies on an auxiliary class; another, improved approach is described below.

At the JNI layer, the common CallObjectMethod function is suited to calling virtual methods, where the method actually invoked depends on the object's class hierarchy, similar to Java's invoke-virtual. Its counterpart, CallNonvirtualObjectMethod, is suited to non-virtual calls: the method invoked is the one of the specified class, regardless of whether it has been inherited or overridden. In other words, the parent class's implementation can be invoked through the CallNonvirtual<Type>Method family, of which CallNonvirtualObjectMethod is one member.

The effect of the invoke-super instruction can be achieved by combining a CallNonvirtual<Type>Method function with GetMethodID. The key code is as follows:

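// "()V" describes a method that returns void, so the Void variant of CallNonvirtual<Type>Method is used.
jmethodID methodID = env->GetMethodID(parentClass, "superMethodName", "()V");
// parentObj is the receiver; the implementation declared in parentClass runs even if a subclass overrides it.
env->CallNonvirtualVoidMethod(parentObj, parentClass, methodID);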

(2) Instrumentation and repair of the <init> function

The <init> function of some subclasses explicitly calls the parent constructor super(), and super() must be the first statement in the subclass's <init> function, otherwise compilation fails. For the <init> function, the Robust instrumentation therefore cannot go on the first line; it has to be inserted after the call to the parent constructor.

So how is an <init> function repaired? The modified <init> function of the original class would also be an <init> function in the patch class. It therefore has to be copied into an ordinary method, and the Robust instrumentation in the original class is linked to that ordinary method.

Copying a constructor and turning it into a method requires attention to a few points:

The name <init> has to be changed to an ordinary method name to avoid a conflict with the <init> function of the patch class.
If the original <init> function has parameters, the new method has to keep the same parameter list.
The return type of the new method in the patch class is void.
If the original <init> function calls this() or super() constructors, those calls have to be deleted in the new patch method.
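
The points above can be illustrated with a source-level sketch. All names here (Account, AccountPatch, initBody, the static patch field) are hypothetical, and passing the instance as an extra self parameter is just one simple way to give the copied method access to the object; the real tool works on bytecode and may wire this up differently.

```java
class Account {
    // Hypothetical switch standing in for the field inserted by instrumentation.
    static volatile AccountPatch patch;

    int balance;

    Account(int initial) {
        super(); // must stay first, so the inserted branch can only come after it
        AccountPatch installed = patch;
        if (installed != null) {
            installed.initBody(this, initial);
            return;
        }
        balance = -initial; // original (buggy) constructor body
    }
}

class AccountPatch {
    /**
     * Copy of the repaired constructor as an ordinary void method. The constructor
     * parameters are kept, and the super() call has been removed.
     */
    void initBody(Account self, int initial) {
        self.balance = initial;
    }
}
```

Because the parent constructor has already run by the time the branch is reached, the copied method only needs to reproduce the statements that followed super() in the repaired constructor.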

(3) Instrumentation and repair of the <clinit> function

The <clinit> function is a special static initializer generated by the compiler that initializes the static variables and complex static expressions of a class. If a class defines static variables or static code blocks, the compiler generates a <clinit> function for them. The <clinit> function is executed only once, and the virtual machine guarantees that only one thread executes it, which ensures thread-safe initialization of shared class-level variables.

Therefore, when instrumenting and repairing the <clinit> function, special attention should be paid to the execution timing of the <clinit> method:

When a class is instantiated, if the class's <clinit> method has not been executed, it will be executed to initialize the class's static variables and complex static expressions.
When obtaining a static member of the class through reflection, if the <clinit> method of the class has not been executed, this method will be executed to initialize the static variables and complex static expressions of the class.
If the class is inherited by a subclass, and the <clinit> method is also defined in the subclass, when creating a subclass instance, the <clinit> method of the parent class will be executed first, and then the <clinit> method of the subclass will be executed.
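
The ordering in the last point, and the fact that <clinit> runs only once, can be observed with a minimal sketch. The class names are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;

class InitLog {
    // Records the order in which static initializers run.
    static final List<String> EVENTS = new ArrayList<>();
}

class BaseWidget {
    static {
        InitLog.EVENTS.add("BaseWidget.<clinit>");
    }
}

class FancyWidget extends BaseWidget {
    static {
        InitLog.EVENTS.add("FancyWidget.<clinit>");
    }
}
```

Creating a FancyWidget for the first time records the parent entry followed by the subclass entry; creating further instances adds nothing, because each class is initialized only once.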

Given this execution timing, static member variables of the class cannot be accessed by the instrumentation (by the time a static variable is accessed, the <clinit> function has already run and can no longer be repaired). The conventional Robust instrumentation (inserting a static interface field into the class) therefore cannot be used, and an auxiliary class, ClintPatchProxy, is needed to implement the inserted logic.

/**
 * <clinit> instrumentation of the online MainActivity
 */
public class MainActivity {
    static {
        String classLongName = "com.app.MainActivity";
        if (ClintPatchProxy.isSupport(classLongName)) {
            ClintPatchProxy.accessDispatch(classLongName);
        } else {
            // original MainActivity <clinit> code
        }
    }
}

To repair the <clinit> function, the static code block of the patch entry class only needs to install the ClintPatchProxy dispatch implementation. The original MainActivity <clinit> code is then no longer executed, and the <clinit> code of MainActivityPatch (the new <clinit> code for MainActivity) runs instead.
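
The article does not show the proxy itself, so the following is only a guess at its general shape. The isSupport and accessDispatch names come from the listing above; the register method, the map-based registry, and the Settings class are assumptions made for this sketch.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch; the real ClintPatchProxy in Robust may look different.
class ClintPatchProxy {
    private static final Map<String, Runnable> PATCHED = new ConcurrentHashMap<>();

    /** Assumed entry point: called by the patch before the target class is first initialized. */
    static void register(String className, Runnable patchedClinit) {
        PATCHED.put(className, patchedClinit);
    }

    static boolean isSupport(String className) {
        return PATCHED.containsKey(className);
    }

    static void accessDispatch(String className) {
        Runnable patchedClinit = PATCHED.get(className);
        if (patchedClinit != null) {
            patchedClinit.run();
        }
    }
}

class Settings {
    static String endpoint;

    static {
        String classLongName = "com.app.Settings";
        if (ClintPatchProxy.isSupport(classLongName)) {
            ClintPatchProxy.accessDispatch(classLongName);
        } else {
            endpoint = "http://old.example"; // original <clinit> body
        }
    }
}
```

Because the proxy lives in a separate class, checking it does not touch any static member of the class being initialized, which is the constraint that rules out the conventional instrumentation here. The patch has to be registered before the target class is first used; once <clinit> has run, it cannot be replayed.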

(4) Repair new classes/new member variables/new methods

The method-instrumentation approach naturally supports new classes. New fields and methods fall into two cases: new static fields and methods can be wrapped in a new class, while new non-static fields can use an auxiliary class that maintains the mapping between the this object and the field. Code in the patch that originally used this.newFieldName can be converted by the bytecode tool into FieldHelper.get(this).getNewFieldName().
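
A minimal sketch of such an auxiliary class is shown below. It follows the FieldHelper.get(this) shape mentioned above, but the implementation details are assumptions: the weak-keyed map, the visitCount field standing in for the new field, and the Profile class are all made up for illustration.

```java
import java.util.Collections;
import java.util.Map;
import java.util.WeakHashMap;

class Profile {
    String name = "guest"; // field that already exists in the online class
}

// Hypothetical helper; one instance holds the new fields of one owner object.
class FieldHelper {
    // Weak keys allow a holder to be collected together with its owner object.
    // WeakHashMap compares keys with equals(), so this assumes identity-based equals().
    private static final Map<Object, FieldHelper> HOLDERS =
            Collections.synchronizedMap(new WeakHashMap<>());

    private int visitCount; // the "new field" added by the patch

    /** Returns the holder for the given owner, creating it on first use. */
    static FieldHelper get(Object owner) {
        return HOLDERS.computeIfAbsent(owner, key -> new FieldHelper());
    }

    int getVisitCount() {
        return visitCount;
    }

    void setVisitCount(int value) {
        visitCount = value;
    }
}
```

Patch code that would have written this.visitCount = 3 becomes FieldHelper.get(this).setVisitCount(3), and each owner object sees its own value.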

4 Summary

Looking back, the Robust hotfix production process is essentially a careful combination of the build and compilation pipeline with bytecode editing. Analyzing how an Android app is packaged and how Java code is compiled and optimized answers the various questions that come up during patch production, and a hotfix patch can then be generated by analyzing and processing the code with bytecode tools.

Of course, many details are involved, and a single article cannot cover them all; a fuller understanding comes from working with real projects.

5 Authors

Chang Qiang, engineer of Meituan Platform-App Technology Department.
