An In-Depth Look at Compose Compiler (1): Kotlin Compiler & KCP

Foreword

Compose's syntax is concise and makes development very efficient. This is largely thanks to the Compose Compiler, which performs a series of compile-time transformations and generates a great deal of boilerplate code on the developer's behalf. But this compile-time instrumentation also makes it harder to understand how Compose actually works. To really understand Compose, we first have to understand its compiler. This series of articles aims to demystify the Compose Compiler.

Compose is a Kotlin-only framework, so the Compose Compiler is essentially a KCP (Kotlin Compiler Plugin). Before studying the source code of the Compose Compiler, it helps to cover some basics of the Kotlin Compiler and KCP.

Kotlin compilation process

Kotlin is a cross-platform language, and the Kotlin Compiler can compile Kotlin source code into target code for multiple platforms: JVM bytecode, JS, and even native machine code via LLVM. Whatever the target, the compilation process can be divided into two stages:

  • Frontend (compiler front end): analyzes the source code to produce the AST (abstract syntax tree) and the symbol table, and performs static checks
  • Backend (compiler back end): generates platform-specific target code from the AST and the other front-end products

In short: the front end parses and checks the source code, and the back end generates the target code.

Taking Kotlin/JVM as an example:

  • In the Frontend, the Kotlin source file goes through lexical, syntactic and semantic analysis (Lexer & Parser) to produce PSI and the corresponding BindingContext.
  • In the Backend, JVM bytecode is first generated from PSI and BindingContext, and then written out as class files through ASM.

The Frontend processing is the same for all target platforms; only the Backend differs, generating different target code.
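
The split can be illustrated with a toy compiler for additions. This is only a sketch and has nothing to do with the real Kotlin compiler API: the names `Expr`, `frontend`, `stackBackend` and `jsBackend` are made up for illustration. A single front end parses and checks the source, and two interchangeable back ends generate different target code from the same tree.

```kotlin
// Toy AST produced by the front end (hypothetical types, for illustration only).
sealed interface Expr
data class Num(val value: Int) : Expr
data class Add(val left: Expr, val right: Expr) : Expr

// Front end: parsing plus a trivial static check, shared by every target.
fun frontend(source: String): Expr {
    val tokens = source.split("+").map { it.trim() }
    require(tokens.all { it.toIntOrNull() != null }) { "Syntax error in: $source" }
    return tokens.map<String, Expr> { Num(it.toInt()) }.reduce { acc, e -> Add(acc, e) }
}

// Back end 1: emits stack-machine instructions.
fun stackBackend(expr: Expr): List<String> = when (expr) {
    is Num -> listOf("PUSH ${expr.value}")
    is Add -> stackBackend(expr.left) + stackBackend(expr.right) + "ADD"
}

// Back end 2: emits a JavaScript-like expression.
fun jsBackend(expr: Expr): String = when (expr) {
    is Num -> expr.value.toString()
    is Add -> "(${jsBackend(expr.left)} + ${jsBackend(expr.right)})"
}

fun main() {
    val ast = frontend("1 + 2 + 3")   // the front end runs once
    println(stackBackend(ast))        // target code for platform A
    println(jsBackend(ast))           // target code for platform B
}
```

Adding a third target in this model only means writing another function over `Expr`; the parsing and checking code stays untouched.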

K1 Compiler: PSI & BindingContext

PSI stands for Program Structure Interface. It can be thought of as JetBrains' own AST, with some extensions on top of a standard AST. PSI is used for static syntax checking during compilation, and the IntelliJ family of IDEs uses it as well, which is why syntax errors can be flagged in real time while we write code. PSI thus lets the same static checking logic be reused at both compile time and edit time. We also get to work with PSI when developing IDE plugins or writing Detekt static analysis rules.

  • PSI: https://plugins.jetbrains.com/docs/intellij/psi-elements.html
  • Detekt: https://github.com/detekt/detekt

In the IDE, you can see the PSI corresponding to the source code in real time through the PsiViewer plug-in. Take the following code as an example:

fun main() {
    println("Hello, World!")
}

The picture above shows the output in PsiViewer, which reflects the following tree structure:

The nodes of the PSI tree are the syntax elements parsed from the source code, such as a special symbol or a string, and they are all PsiElements. A PsiElement on its own lacks context-based semantic information. For a KtFunction, for example, details such as its parameters and modifiers have to be obtained with the help of BindingContext.

BindingContext is effectively the symbol table that accompanies PSI. After semantic analysis, each PsiElement gets a corresponding Descriptor, which is recorded in BindingContext. BindingContext can then quickly look up the Descriptor for a given PSI node. The Descriptor contains the semantic information we need. For example, a FunctionDescriptor provides TypeParameters, isInline and so on.

The structure of BindingContext is similar to a Map<Type, Map<Key, Descriptor>>: the key of the outer Map represents the PSI node type, the key of the inner Map is the PsiElement instance, and the value is its corresponding Descriptor. With a KtFunction as the key we get the corresponding FunctionDescriptor; with a KtCallExpression we get the corresponding ResolvedCall, which includes the FunctionDescriptor of the called function and the arguments passed in.
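
The two-level lookup can be sketched with a small toy model. Everything below is hypothetical (`PsiNode`, `FunctionInfo`, `CallInfo`, `Slice`, `ToyBindingContext`); it only mimics the shape of the nested map described above and is not the compiler's real BindingContext class.

```kotlin
// Toy stand-ins for PSI nodes and descriptors (not the real compiler classes).
data class PsiNode(val text: String)
data class FunctionInfo(val name: String, val isInline: Boolean)
data class CallInfo(val callee: FunctionInfo, val arguments: List<String>)

// A slice identifies which kind of fact is stored: the key of the outer map.
class Slice<K, V>(val name: String)

object Slices {
    val FUNCTION = Slice<PsiNode, FunctionInfo>("FUNCTION")
    val RESOLVED_CALL = Slice<PsiNode, CallInfo>("RESOLVED_CALL")
}

class ToyBindingContext {
    private val table = mutableMapOf<Slice<*, *>, MutableMap<Any, Any>>()

    // Called by "semantic analysis" to record a result for a node.
    fun <K : Any, V : Any> record(slice: Slice<K, V>, key: K, value: V) {
        table.getOrPut(slice) { mutableMapOf() }[key] = value
    }

    // Looks up the fact of the given kind for a node, or null if none was recorded.
    @Suppress("UNCHECKED_CAST")
    operator fun <K : Any, V : Any> get(slice: Slice<K, V>, key: K): V? =
        table[slice]?.get(key) as V?
}

fun main() {
    val declaration = PsiNode("fun main() { ... }")
    val call = PsiNode("println(\"Hello, World!\")")
    val printlnInfo = FunctionInfo("println", isInline = true)

    val context = ToyBindingContext()
    context.record(Slices.FUNCTION, declaration, FunctionInfo("main", isInline = false))
    context.record(Slices.RESOLVED_CALL, call, CallInfo(printlnInfo, listOf("\"Hello, World!\"")))

    println(context[Slices.FUNCTION, declaration]?.name)
    println(context[Slices.RESOLVED_CALL, call]?.callee?.isInline)
}
```

The typed `Slice` key is a design choice of this sketch: it lets a lookup return a `FunctionInfo` or a `CallInfo` without a cast at the call site, even though both live in the same untyped table.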

K2 Compiler: FIR & IR

From the introduction above, we know that the products of the Kotlin Compiler's Frontend are PSI and BindingContext, and that the Backend generates target code directly from them. Because the old Backend is coupled to the logic of generating code for one specific target, it is hard to reuse compile-time processing and optimization logic across platforms. For example, suspend functions cause additional code to be generated during compilation, and ideally this codegen logic would be shared. For this reason, Kotlin has developed a new generation of compiler, named K2.

K2: https://blog.jetbrains.com/zh-hans/kotlin/2021/10/the-road-to-the-k2-compiler/

The most notable feature of the K2 compiler is the introduction of IR (Intermediate Representation). IR is an intermediate product that connects the front end and the back end, and it is platform independent. Compile-time processing such as that for suspend can be implemented on IR and reused across platforms.

In K2, a new IR-based Backend replaces the old Backend based on PSI and BindingContext. Starting from Kotlin 1.5, Kotlin/JVM enables the new IR Backend by default, and starting from Kotlin 1.6, the Kotlin/JS IR Backend becomes the standard configuration. The figure below shows the compilation process with the IR Backend.

IR is also a tree data structure, but it is a more "low-level" representation, closer to the CPU architecture. An IrElement carries a variety of semantic information, such as the visibility, modality and returnType of a FUN, so there is no need to query BindingContext for it as with a PsiElement.

For the previous Hello World example, the corresponding IR tree is printed as follows:

FUN name:main visibility:public modality:FINAL <> () returnType:kotlin.Unit
    BLOCK_BODY
        CALL 'public final fun println (message: kotlin.Any?): kotlin.Unit [inline] declared in kotlin.io.ConsoleKt' type=kotlin.Unit origin=null
            message: CONST String type=kotlin.String value="Hello, World!"
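
To make the idea of a self-describing tree concrete, here is a toy model that prints a dump loosely resembling the one above. The node classes (`IrNode`, `IrConst`, `IrCall`, `IrFunction`) and the `dump` function are invented for this sketch; the compiler's real IR classes and dump format are richer.

```kotlin
// Toy IR nodes: each one carries its own semantic information.
sealed interface IrNode
data class IrConst(val type: String, val value: String) : IrNode
data class IrCall(val callee: String, val type: String, val arguments: Map<String, IrNode>) : IrNode
data class IrFunction(
    val name: String,
    val visibility: String,
    val modality: String,
    val returnType: String,
    val body: List<IrNode>,
) : IrNode

// Renders the tree with one node per line, indented by depth.
fun dump(node: IrNode, indent: String = ""): String {
    val childIndent = "$indent    "
    return when (node) {
        is IrConst -> "${indent}CONST type=${node.type} value=${node.value}"
        is IrCall -> buildString {
            append("${indent}CALL '${node.callee}' type=${node.type}")
            for ((name, argument) in node.arguments) {
                // Arguments are children of the call, labelled with the parameter name.
                append("\n$childIndent$name: ").append(dump(argument, childIndent).trimStart())
            }
        }
        is IrFunction -> buildString {
            append("${indent}FUN name:${node.name} visibility:${node.visibility} ")
            append("modality:${node.modality} returnType:${node.returnType}")
            for (statement in node.body) append("\n").append(dump(statement, childIndent))
        }
    }
}

fun main() {
    val mainFunction = IrFunction(
        name = "main", visibility = "public", modality = "FINAL", returnType = "kotlin.Unit",
        body = listOf(
            IrCall(
                callee = "println", type = "kotlin.Unit",
                arguments = mapOf("message" to IrConst("kotlin.String", "\"Hello, World!\"")),
            )
        ),
    )
    println(dump(mainFunction))
    // Semantic facts are read straight off the node, with no side table involved.
    println(mainFunction.returnType)
}
```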

In addition to the new IR Backend, K2 also updates the Frontend. The main change is that FIR (Frontend IR) replaces PSI and BindingContext. The new K2 front end has been available since Kotlin 1.7.0.

To sum up, the main change of K2 relative to K1 is the introduction of the FIR Frontend and the IR Backend.

IR can be converted from FIR, and both are tree structures, so what is the difference between the two? They differ in the following three aspects:

  • Different goals
    • FIR: integrates the information of PSI and BindingContext so that descriptor information can be found more quickly. Its primary goal is to improve the performance of front-end static analysis and checking.
    • IR: performance is not its concern. Its data structure is designed not to speed up back-end compilation, but to let different back ends share compilation logic and to reduce the cost of supporting new language features on different platforms.
  • Different structure
    • FIR: still an AST, with some symbol information added to speed up static analysis.
    • IR: more than an AST, it also provides richer context-based semantic information. For example, it can tell whether a variable in a code block is a temporary variable or a member variable, which is hard to do with FIR.
  • Different capabilities
    • FIR: can handle some simple desugaring and code generation, but on the whole it serves the front end and cannot substantially modify the AST.
    • IR: has a rich Codegen API that can add, remove and update tree nodes flexibly, enough to meet any compile-time transformation need.
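
The point about sharing compilation logic between back ends can be sketched as follows. This is a toy model with invented names (`Ir`, `Const`, `Sub`, `Neg`, `lower`, `toStackCode`, `toJs`), not the compiler's real lowering infrastructure: one pass rewrites a piece of "sugar" on the tree, and every back end then only has to understand the simplified form.

```kotlin
// Toy IR with one piece of "sugar" (Neg) that back ends should not need to know about.
sealed interface Ir
data class Const(val value: Int) : Ir
data class Sub(val left: Ir, val right: Ir) : Ir
data class Neg(val operand: Ir) : Ir

// Shared lowering pass: rewrites -x into 0 - x everywhere in the tree.
fun lower(ir: Ir): Ir = when (ir) {
    is Const -> ir
    is Sub -> Sub(lower(ir.left), lower(ir.right))
    is Neg -> Sub(Const(0), lower(ir.operand))
}

// Two back ends that only understand the lowered form.
fun toStackCode(ir: Ir): List<String> = when (ir) {
    is Const -> listOf("PUSH ${ir.value}")
    is Sub -> toStackCode(ir.left) + toStackCode(ir.right) + "SUB"
    is Neg -> error("Neg must be lowered before code generation")
}

fun toJs(ir: Ir): String = when (ir) {
    is Const -> ir.value.toString()
    is Sub -> "(${toJs(ir.left)} - ${toJs(ir.right)})"
    is Neg -> error("Neg must be lowered before code generation")
}

fun main() {
    // -(5 - (-2)), written with the sugar node
    val lowered = lower(Neg(Sub(Const(5), Neg(Const(2)))))
    println(toStackCode(lowered))
    println(toJs(lowered))
}
```

Supporting a new piece of sugar in this model means extending `lower` once, rather than touching each back end separately.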

KCP(Kotlin Compiler Plugin)

KCP lets us implement all kinds of compile-time transformations by hooking into extension points in the Kotlin compilation process described above. Many of Kotlin's syntactic sugar features are implemented as KCPs, such as the well-known No-arg, All-open and kotlinx-serialization.

Like KAPT, KCP can perform annotation processing at compile time, but it has advantages over KAPT:

  1. KCP runs as part of the Kotlin compilation process, while KAPT adds an extra pre-compilation step before the actual compilation, so KCP performs better. KSP (Kotlin Symbol Processing) is also built on KCP, which is why KSP performs better than KAPT.

  2. KAPT is mainly used to generate new code and can hardly modify the logic of existing code. KCP can make arbitrary modifications to bytecode or IR, which makes it more powerful.

KCP development steps

Although KCP is powerful, it is hard to develop. A complete KCP consists of several parts:

  • Gradle Plugin:

    • Plugin: KCP is configured through Gradle, so you need to define a Gradle plugin and configure the compilation parameters required by the KCP in Gradle.
    • Subplugin: establishes the connection from the Gradle Plugin to the Kotlin Plugin and passes the parameters configured in Gradle on to the Kotlin Plugin.
  • Kotlin Plugin:

    • CommandLineProcessor: the entry point of the KCP. It defines the id of the KCP, parses command line parameters, and so on.
    • ComponentRegistrar: registers the KCP's Extensions at the corresponding extension points. Like CommandLineProcessor, it is loaded through SPI, so it has to be registered as a service, for example with the auto-service annotation.
    • XXExtension: this is where the KCP logic is implemented. Kotlin provides many types of Extension for us to implement. During each stage of the front end and back end, the compiler calls the Extensions of the corresponding type registered by the KCP. For example, ExpressionCodegenExtension can be used to modify the body of a class, and ClassBuilderInterceptorExtension can modify the definition of a class.
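
The relationship between the registrar, the extension points and the compiler can be sketched with a toy model. None of the names below (`GenerationExtension`, `ToyCompiler`, `PluginRegistrar`, `generatedFunctionPlugin`) come from the Kotlin compiler; they are simplified stand-ins that only show the control flow: the plugin registers an extension once, and the compiler calls it back during compilation.

```kotlin
// Toy extension point: called by the "compiler" after the front end has run.
fun interface GenerationExtension {
    fun generate(functions: MutableList<String>)
}

// Toy compiler that lets plugins hook into its pipeline.
class ToyCompiler {
    private val extensions = mutableListOf<GenerationExtension>()

    fun registerExtension(extension: GenerationExtension) {
        extensions += extension
    }

    fun compile(declaredFunctions: List<String>): List<String> {
        val functions = declaredFunctions.toMutableList()
        // Every registered extension gets a chance to change the result.
        extensions.forEach { it.generate(functions) }
        return functions
    }
}

// Toy registrar: the single place where a plugin registers its extensions.
fun interface PluginRegistrar {
    fun registerExtensions(compiler: ToyCompiler, options: Map<String, String>)
}

// A hypothetical plugin that adds one generated function to every compilation.
val generatedFunctionPlugin = PluginRegistrar { compiler, options ->
    // Option passed in from the build, standing in for a parsed command line parameter.
    val name = options["functionName"] ?: "toString"
    compiler.registerExtension { functions -> if (name !in functions) functions += name }
}

fun main() {
    val compiler = ToyCompiler()
    generatedFunctionPlugin.registerExtensions(compiler, mapOf("functionName" to "describe"))
    println(compiler.compile(listOf("main")))
}
```

In the real toolchain the options travel from the Gradle plugin through the command line processor before they reach the registrar; the `options` map here collapses that path into a single argument.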

With the upgrade of the Kotlin Compiler from K1 to K2, KCP also provides Extensions for K2.

Take No-arg as an example. No-arg automatically generates a parameterless constructor for classes marked with a given annotation. The No-arg source code contains two sets of Extensions, for K1 and K2, to stay compatible with different Kotlin versions:

  • No-arg: https://kotlinlang.org/docs/no-arg-plugin.html
  • Source: https://cs.android.com/android-studio/kotlin/+/master:plugins/noarg/
  • NoArg K1:

    • CliNoArgDeclarationChecker: NoArg cannot be applied to an inner class, so this checker uses PSI-based front-end checking logic to detect inner classes
    • CliNoArgExpressionCodegenExtension: inherits from ExpressionCodegenExtension and, based on PSI and the corresponding Descriptor, adds a parameterless constructor to the class body in the form of JVM bytecode
  • NoArg K2:

    • FirNoArgDeclarationChecker: the checker for the new K2 front end, which detects inner classes based on FIR
    • NoArgIrGenerationExtension: inherits from IrGenerationExtension and adds a parameterless constructor based on IR
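
What the plugin achieves can be approximated by hand. In the sketch below, the `NoArg` annotation and the `User` class are hypothetical, and the zero-argument constructor is written manually. The real plugin would generate its constructor at compile time as a synthetic member that is only reachable through reflection, and that constructor calls the super constructor directly instead of delegating to the primary one as this hand-written version does.

```kotlin
// Hypothetical marker annotation; with the real plugin it would be named in the build configuration.
annotation class NoArg

@NoArg
class User(val name: String, val age: Int) {
    // Written by hand for this sketch. The No-arg plugin would add a comparable
    // zero-argument constructor at compile time, so it would not appear in the source.
    constructor() : this("", 0)
}

fun main() {
    // Frameworks typically create instances reflectively through the zero-argument constructor.
    val user = User::class.java.getDeclaredConstructor().newInstance()
    println("name='${user.name}', age=${user.age}")
}
```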

Taking the Backend Extensions as an example, compare how the two are implemented:

  • Processing in CliNoArgExpressionCodegenExtension:
// 1. Get the class information from the descriptor
val superClassInternalName = typeMapper.mapClass(descriptor.getSuperClassOrAny()).internalName
val constructorDescriptor = createNoArgConstructorDescriptor(descriptor)
val superClass = descriptor.getSuperClassOrAny()

// 2. Generate the bytecode of the parameterless constructor directly through Codegen
functionCodegen.generateMethod(JvmDeclarationOrigin.NO_ORIGIN, constructorDescriptor, object : CodegenBased(state) {
    override fun doGenerateBody(codegen: ExpressionCodegen, signature: JvmMethodSignature) {
        // Load `this` onto the stack
        codegen.v.load(0, AsmTypes.OBJECT_TYPE)

        // Call the super constructor
        if (isParentASealedClassWithDefaultConstructor) {
            codegen.v.aconst(null)
            codegen.v.visitMethodInsn(
                Opcodes.INVOKESPECIAL, superClassInternalName, "<init>",
                "(Lkotlin/jvm/internal/DefaultConstructorMarker;)V", false
            )
        } else {
            codegen.v.visitMethodInsn(Opcodes.INVOKESPECIAL, superClassInternalName, "<init>", "()V", false)
        }

        if (invokeInitializers) {
            generateInitializers(codegen)
        }
        codegen.v.visitInsn(Opcodes.RETURN)
    }
})
  • Processing in NoArgIrGenerationExtension:
// 1. Get the class information from IrClass
val superClass =
    klass.superTypes.mapNotNull(IrType::getClass).singleOrNull { it.kind == ClassKind.CLASS }
        ?: context.irBuiltIns.anyClass.owner
val superConstructor =
    if (needsNoargConstructor(superClass))
        getOrGenerateNoArgConstructor(superClass)
    else superClass.constructors.singleOrNull { it.isZeroParameterConstructor() }
        ?: error("No noarg super constructor for ${klass.render()}:\n" + superClass.constructors.joinToString("\n") { it.render() })

// 2. Create the constructor through IR APIs such as irFactory
context.irFactory.buildConstructor {
    startOffset = SYNTHETIC_OFFSET
    endOffset = SYNTHETIC_OFFSET
    returnType = klass.defaultType
}.also { ctor ->
    ctor.parent = klass
    ctor.body = context.irFactory.createBlockBody(
        ctor.startOffset, ctor.endOffset,
        listOfNotNull(
            IrDelegatingConstructorCallImpl(
                ctor.startOffset, ctor.endOffset, context.irBuiltIns.unitType,
                superConstructor.symbol, 0, superConstructor.valueParameters.size
            ),
            IrInstanceInitializerCallImpl(
                ctor.startOffset, ctor.endOffset, klass.symbol, context.irBuiltIns.unitType
            ).takeIf { invokeInitializers }
        )
    )
}

NoArgIrGenerationExtension is an IrGenerationExtension, an extension point dedicated to modifying IR. As we can see, it does not touch bytecode at all and uses the various buildXXX APIs of IR instead.

The code generation of Compose Compiler is also implemented with IrGenerationExtension. This is why even the earliest version of Compose requires Kotlin 1.5.10 or later: its Compiler only supports IR Backend Extensions.

Compose Compiler

Compose Compiler is essentially a KCP. Having covered the basic structure of a KCP, we know that the core of Compose Compiler lies in its Extensions.

Compose Compiler: https://cs.android.com/androidx/platform/frameworks/support/+/androidx-main:compose/compiler/compiler-hosted/

Let's go straight to ComposeComponentRegistrar to see which Extensions it registers:

class ComposeComponentRegistrar : ComponentRegistrar {
    //...

    StorageComponentContainerContributor.registerExtension(
        project,
        ComposableCallChecker()
    )
    StorageComponentContainerContributor.registerExtension(
        project,
        ComposableDeclarationChecker()
    )
    StorageComponentContainerContributor.registerExtension(
        project,
        ComposableTargetChecker()
    )
    ComposeDiagnosticSuppressor.registerExtension(
        project,
        ComposeDiagnosticSuppressor()
    )
    @Suppress("OPT_IN_USAGE_ERROR")
    TypeResolutionInterceptor.registerExtension(
        project,
        @Suppress("IllegalExperimentalApiUsage")
        ComposeTypeResolutionInterceptorExtension()
    )
    IrGenerationExtension.registerExtension(
        project,
        ComposeIrGenerationExtension(
            configuration = configuration,
            liveLiteralsEnabled = liveLiteralsEnabled,
            liveLiteralsV2Enabled = liveLiteralsV2Enabled,
            generateFunctionKeyMetaClasses = generateFunc
            sourceInformationEnabled = sourceInformationE
            intrinsicRememberEnabled = intrinsicRememberE
            decoysEnabled = decoysEnabled,
            metricsDestination = metricsDestination,
            reportsDestination = reportsDestination,
        )
    )
    DescriptorSerializerPlugin.registerExtension(
        project,
        ClassStabilityFieldSerializationPlugin()
    )

    //...
}
  • ComposableCallChecker: checks whether a @Composable function may be called at a given call site
  • ComposableDeclarationChecker: checks whether @Composable is applied in a valid place
  • ComposeDiagnosticSuppressor: suppresses unnecessary compilation diagnostics
  • ComposeIrGenerationExtension: responsible for the code generation of Composable functions
  • ClassStabilityFieldSerializationPlugin: analyzes whether a class is stable and adds the stability information
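
To give a feel for what a call checker does, here is a toy version of the rule that a @Composable function may only be called from another @Composable function. It works on a made-up `FunctionDecl` model rather than on PSI and descriptors, and both `checkComposableCalls` and the wording of the reported message are assumptions of this sketch, not the Compose Compiler's actual implementation.

```kotlin
// Toy model of a function declaration: its name, whether it is marked composable,
// and the names of the functions it calls.
data class FunctionDecl(val name: String, val isComposable: Boolean, val calls: List<String>)

// Reports every call to a composable function made from a non-composable one.
fun checkComposableCalls(functions: List<FunctionDecl>): List<String> {
    val byName = functions.associateBy { it.name }
    return functions
        .filter { !it.isComposable }
        .flatMap { caller ->
            caller.calls
                // Calls to unknown functions (for example library functions) are ignored here.
                .filter { byName[it]?.isComposable == true }
                .map { callee ->
                    "${caller.name}: @Composable function '$callee' called from a non-composable context"
                }
        }
}

fun main() {
    val functions = listOf(
        FunctionDecl("Greeting", isComposable = true, calls = emptyList()),
        FunctionDecl("Screen", isComposable = true, calls = listOf("Greeting")),
        FunctionDecl("helper", isComposable = false, calls = listOf("Greeting", "println")),
    )
    checkComposableCalls(functions).forEach(::println)
}
```

A real checker has to resolve each call to its declaration first, which is exactly the information the front end provides through BindingContext and ResolvedCall.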

The Checkers here are Frontend Extensions and are still implemented on K1, while ComposeIrGenerationExtension, which sits in the Backend, targets K2. It is the core of Compose code generation and will be covered in later articles of this series.

Origin blog.csdn.net/vitaviva/article/details/130439482