1-4-3 Class file structure


Note: This article is part of a series of review articles for campus recruitment.
For the full list, see the overall catalog: School Recruitment Review Catalog

1 Overview

In Java, the code that the JVM can understand is called bytecode (that is, files with the .class extension). It does not target any particular processor, only the virtual machine. Through bytecode, the Java language solves, to a certain extent, the low execution efficiency of traditional interpreted languages while retaining their portability. As a result, Java programs run relatively efficiently, and because bytecode is not tied to a specific machine, a Java program can run on computers with many different operating systems without recompilation.

Languages such as Clojure (a dialect of Lisp), Groovy, and Scala all run on the Java Virtual Machine. The following figure shows different languages being compiled into .class files by their own compilers and finally running on the Java virtual machine. The binary content of a .class file can be viewed with WinHex.

[Figure: Java virtual machine]
The .class file is therefore an important bridge between the different languages that run on the Java virtual machine, and it is also an important reason why Java is cross-platform.

2 The overall structure of the Class file

According to the Java Virtual Machine specification, a class file consists of a single ClassFile structure:

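ClassFile {
    u4             magic; // marker that identifies a Class file
    u2             minor_version; // minor version number of the Class file
    u2             major_version; // major version number of the Class file
    u2             constant_pool_count; // constant pool counter
    cp_info        constant_pool[constant_pool_count-1]; // constant pool
    u2             access_flags; // access flags of the class
    u2             this_class; // current class
    u2             super_class; // parent class
    u2             interfaces_count; // number of interfaces
    u2             interfaces[interfaces_count]; // a class can implement multiple interfaces
    u2             fields_count; // number of fields in the Class file
    field_info     fields[fields_count]; // a class can have multiple fields
    u2             methods_count; // number of methods in the Class file
    method_info    methods[methods_count]; // a class can have multiple methods
    u2             attributes_count; // number of attributes in this class's attribute table
    attribute_info attributes[attributes_count]; // attribute table collection
}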

The following describes in detail some of the components involved in the Class file structure.

Schematic diagram of the Class file bytecode structure (saved from the Internet some time ago; it is very good, but the original source is unknown):

[Figure: schematic diagram of the Class file structure]

2.1 Magic number

    u4             magic; // marker that identifies a Class file

The first four bytes of every Class file are called the magic number. Its only purpose is to determine whether the file is a Class file that the virtual machine can accept.

Programmers often like to use some special numbers to indicate fixed file types or other special meanings.
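For Class files the magic number is 0xCAFEBABE. The check described above can be sketched as follows; MagicCheck and looksLikeClassFile are hypothetical names chosen for this illustration, not part of the JDK.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical helper for illustration; not a JDK API.
class MagicCheck {
    static final int CLASS_MAGIC = 0xCAFEBABE;

    // Reads the first four bytes as a big-endian u4 and compares them with the magic number.
    // Throws EOFException (an IOException) if the stream holds fewer than four bytes.
    static boolean looksLikeClassFile(InputStream in) throws IOException {
        DataInputStream data = new DataInputStream(in);
        return data.readInt() == CLASS_MAGIC;
    }

    public static void main(String[] args) throws IOException {
        byte[] header = {(byte) 0xCA, (byte) 0xFE, (byte) 0xBA, (byte) 0xBE};
        System.out.println(looksLikeClassFile(new ByteArrayInputStream(header)));
    }
}
```

In practice the stream would come from a real .class file, for example via a FileInputStream.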

2.2 Class file version

    u2             minor_version; // minor version number of the Class file
    u2             major_version; // major version number of the Class file

The four bytes following the magic number store the version number of the Class file: the fifth and sixth bytes are the minor version number, and the seventh and eighth bytes are the major version number.

A newer Java virtual machine can execute Class files generated by an older compiler, but an older Java virtual machine cannot execute Class files generated by a newer compiler. In practice, therefore, we must make sure that the JDK version used for development is consistent with the JDK version of the production environment.
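As a rough sketch of how the version bytes can be read, the snippet below parses the first eight bytes of a Class file. ClassVersion is a hypothetical name; the mapping "release = major - 44" relies on major version 49 corresponding to Java 5, 52 to Java 8, and so on, and is only meant for Java 5 and later.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical helper for illustration; not a JDK API.
class ClassVersion {
    final int minor;
    final int major;

    ClassVersion(int minor, int major) {
        this.minor = minor;
        this.major = major;
    }

    // Reads magic (u4), minor_version (u2) and major_version (u2), in that order.
    static ClassVersion read(InputStream in) throws IOException {
        DataInputStream data = new DataInputStream(in);
        if (data.readInt() != 0xCAFEBABE) {
            throw new IOException("Not a Class file");
        }
        int minor = data.readUnsignedShort();
        int major = data.readUnsignedShort();
        return new ClassVersion(minor, major);
    }

    // Valid for major >= 49 (Java 5 and later): 49 -> 5, 52 -> 8, 55 -> 11.
    int javaRelease() {
        return major - 44;
    }

    public static void main(String[] args) throws IOException {
        // Hand-written header: magic, minor 0, major 0x34 (52).
        byte[] header = {(byte) 0xCA, (byte) 0xFE, (byte) 0xBA, (byte) 0xBE, 0, 0, 0, 0x34};
        ClassVersion v = read(new ByteArrayInputStream(header));
        System.out.println("major=" + v.major + " minor=" + v.minor + " release=" + v.javaRelease());
    }
}
```

A virtual machine that only supports major versions up to 52 would reject a file whose major version is 55, which is the incompatibility described above.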

2.3 Constant pool

    u2             constant_pool_count; // constant pool counter
    cp_info        constant_pool[constant_pool_count-1]; // constant pool

Immediately after the major and minor version numbers comes the constant pool. It holds constant_pool_count-1 entries: the constant pool counter starts from 1, and index 0 is deliberately left empty so that an index value of 0 can mean "does not refer to any constant pool item".

The constant pool mainly stores two kinds of constants: literals and symbolic references. Literals are close to the notion of constants in the Java language, such as text strings and constant values declared final. Symbolic references are a concept from compiler theory and include the following three kinds of constants:

  • Fully qualified names of classes and interfaces
  • Field name and descriptor
  • Method name and descriptor

Each constant in the constant pool is a table. These 14 kinds of tables share one feature: each begins with a u1 tag that identifies which type of constant it is.

Type                              Tag  Description
CONSTANT_Utf8_info                1    UTF-8 encoded string
CONSTANT_Integer_info             3    Integer literal
CONSTANT_Float_info               4    Floating-point literal
CONSTANT_Long_info                5    Long literal
CONSTANT_Double_info              6    Double-precision floating-point literal
CONSTANT_Class_info               7    Symbolic reference to a class or interface
CONSTANT_String_info              8    String literal
CONSTANT_Fieldref_info            9    Symbolic reference to a field
CONSTANT_Methodref_info           10   Symbolic reference to a method in a class
CONSTANT_InterfaceMethodref_info  11   Symbolic reference to a method in an interface
CONSTANT_NameAndType_info         12   Name and descriptor of a field or method
CONSTANT_MethodHandle_info        15   Method handle
CONSTANT_MethodType_info          16   Method type
CONSTANT_InvokeDynamic_info       18   Dynamic method invocation point

The constant pool of a .class file can be inspected with the javap -v ClassName command (javap -v ClassName > temp.txt writes the result to the file temp.txt).
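To make the tag-driven layout concrete, here is a minimal sketch that walks a constant pool and records the tag at each index. ConstantPoolWalker and its methods are hypothetical names; the entry sizes follow the Java Virtual Machine specification, and only the 14 tags from the table are handled, so any other tag is reported as an error. Note that a Long or Double entry occupies two indexes.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.util.Arrays;

// Hypothetical helper for illustration; handles only the 14 tags in the table.
class ConstantPoolWalker {

    // Reads constant_pool_count, then walks the entries and records each tag.
    // Index 0 and the slot after a Long or Double stay 0 (unused).
    static int[] readTags(DataInputStream in) throws IOException {
        int count = in.readUnsignedShort();
        int[] tags = new int[count];
        for (int i = 1; i < count; i++) {
            int tag = in.readUnsignedByte();
            tags[i] = tag;
            switch (tag) {
                case 1: // Utf8: u2 length followed by that many bytes
                    skip(in, in.readUnsignedShort());
                    break;
                case 3: case 4: // Integer, Float: 4 bytes
                    skip(in, 4);
                    break;
                case 5: case 6: // Long, Double: 8 bytes, occupying two indexes
                    skip(in, 8);
                    i++;
                    break;
                case 7: case 8: case 16: // Class, String, MethodType: one u2 index
                    skip(in, 2);
                    break;
                case 9: case 10: case 11: case 12: case 18: // two u2 indexes
                    skip(in, 4);
                    break;
                case 15: // MethodHandle: u1 kind plus u2 index
                    skip(in, 3);
                    break;
                default:
                    throw new IOException("Unexpected constant pool tag: " + tag);
            }
        }
        return tags;
    }

    private static void skip(DataInputStream in, int n) throws IOException {
        in.readFully(new byte[n]);
    }

    // Builds a tiny pool by hand: index 1 is a Utf8, indexes 2 and 3 hold one Long.
    static byte[] samplePool() throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(bytes);
        out.writeShort(4);   // constant_pool_count: indexes 1..3 are in use
        out.writeByte(1);    // CONSTANT_Utf8_info
        out.writeUTF("Hi");  // writeUTF emits a u2 length followed by the bytes
        out.writeByte(5);    // CONSTANT_Long_info
        out.writeLong(42L);
        return bytes.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        DataInputStream in = new DataInputStream(new ByteArrayInputStream(samplePool()));
        System.out.println(Arrays.toString(readTags(in)));
    }
}
```

On a real Class file, readTags would be called after the magic number and the two version fields have been consumed.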

2.4 Access flags

After the constant pool, the next two bytes are the access flags. They identify class-level or interface-level access information, including whether this Class is a class or an interface, whether it is public, whether it is abstract, and, if it is a class, whether it is declared final.

Class access and attribute modifiers:

[Figure: table of class access and attribute modifiers]

We define an Employee class

package top.snailclimb.bean;
public class Employee {
   ...
}

Use the javap -v ClassName command to view the access flags of the class.

[Figure: javap -v output showing the class access flags]
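The two-byte access_flags value is a bit mask, so it can be decoded with a few bitwise tests. The sketch below uses the class-level flag values from the Java Virtual Machine specification; ClassAccessFlags and decode are hypothetical names. A public class such as Employee is typically compiled with the value 0x0021.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration; not a JDK API.
class ClassAccessFlags {
    // Class-level flag masks and their names, in ascending bit order.
    private static final int[] MASKS = {
        0x0001, 0x0010, 0x0020, 0x0200, 0x0400, 0x1000, 0x2000, 0x4000
    };
    private static final String[] NAMES = {
        "ACC_PUBLIC", "ACC_FINAL", "ACC_SUPER", "ACC_INTERFACE",
        "ACC_ABSTRACT", "ACC_SYNTHETIC", "ACC_ANNOTATION", "ACC_ENUM"
    };

    // Returns the names of all flags whose bit is set in accessFlags.
    static List<String> decode(int accessFlags) {
        List<String> names = new ArrayList<>();
        for (int i = 0; i < MASKS.length; i++) {
            if ((accessFlags & MASKS[i]) != 0) {
                names.add(NAMES[i]);
            }
        }
        return names;
    }

    public static void main(String[] args) {
        System.out.println(decode(0x0021)); // prints [ACC_PUBLIC, ACC_SUPER]
    }
}
```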

2.5 Current class index, parent class index and interface index collection

    u2             this_class; // current class
    u2             super_class; // parent class
    u2             interfaces_count; // number of interfaces
    u2             interfaces[interfaces_count]; // a class can implement multiple interfaces

The class index determines the fully qualified name of this class, and the parent class index determines the fully qualified name of its parent class. Because the Java language uses single inheritance, there is only one parent class index. Every Java class except java.lang.Object has a parent class, so the parent class index is nonzero for every class except java.lang.Object.

The interface index set describes the interfaces that this class implements. The implemented interfaces are stored in the interface index set from left to right, in the order in which they appear after implements (or after extends, if the class itself is an interface).
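These three items are what reflection reports at run time, so their effect can be observed without parsing any bytes. The following sketch is only a run-time view, not a Class file parser; HierarchyDemo, Payable, and the nested Employee class are hypothetical stand-ins defined just for this example.

```java
import java.io.Serializable;

// Hypothetical demo types; the nested Employee is a stand-in, not the article's class.
class HierarchyDemo {
    interface Payable {}

    static class Employee implements Serializable, Payable {}

    public static void main(String[] args) {
        // java.lang.Object has no parent class, so getSuperclass() returns null.
        System.out.println(Object.class.getSuperclass());

        // Every other class has exactly one parent class.
        System.out.println(Employee.class.getSuperclass().getName());

        // Interfaces are reported in the order of the implements clause.
        for (Class<?> iface : Employee.class.getInterfaces()) {
            System.out.println(iface.getName());
        }
    }
}
```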

2.6 Field table

    u2             fields_count; // number of fields in the Class file
    field_info     fields[fields_count]; // a class can have multiple fields

The field table (field_info) describes the variables declared in an interface or class. Fields include class-level variables and instance variables, but not local variables declared inside methods.

The structure of field_info (the field table):

[Figure: structure of the field table]

  • access_flags: the scope of the field (public, private, protected modifiers), whether it is an instance variable or a class variable (static modifier), whether it can be serialized (transient modifier), mutability (final modifier), and visibility (volatile modifier, that is, whether reads and writes are forced to go through main memory).
  • name_index: a reference into the constant pool, giving the name of the field;
  • descriptor_index: a reference into the constant pool, giving the descriptor of the field or method;
  • attributes_count: a field may also carry some additional attributes; attributes_count stores the number of attributes;
  • attributes[attributes_count]: stores the content of each attribute.

In the information above, each modifier is a Boolean value: the field either has the modifier or it does not, which makes flag bits a natural fit. The name and data type of a field, by contrast, are not fixed in size, so they can only be described by referring to constants in the constant pool.

The values of the field's access_flags:

[Figure: table of field access_flags values]
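The modifier bits that reflection returns for a field use the same values as the field's access_flags, so they offer a quick way to see the flag bits in action. This is a small sketch under that assumption; FieldFlagsDemo and its fields are hypothetical names invented for the example.

```java
import java.lang.reflect.Modifier;

// Hypothetical demo class; the fields exist only to carry different modifiers.
class FieldFlagsDemo {
    private static final int MAX = 10;
    protected transient String cache;
    public volatile boolean ready;

    // Returns the modifier bits of the named field declared in this class.
    static int flagsOf(String fieldName) throws NoSuchFieldException {
        return FieldFlagsDemo.class.getDeclaredField(fieldName).getModifiers();
    }

    public static void main(String[] args) throws NoSuchFieldException {
        for (String name : new String[] {"MAX", "cache", "ready"}) {
            int flags = flagsOf(name);
            // Print the raw bits in hex next to their keyword form.
            System.out.printf("%-5s 0x%04X %s%n", name, flags, Modifier.toString(flags));
        }
    }
}
```

For example, private (0x0002), static (0x0008), and final (0x0010) combine to 0x001A for the MAX field.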

2.7 Method table collection

    u2             methods_count; // number of methods in the Class file
    method_info    methods[methods_count]; // a class can have multiple methods

methods_count is the number of methods, and method_info is the method table.

The Class file format describes methods in almost exactly the same way as fields. The method table has the same structure as the field table: access flags, name index, descriptor index, and attribute table collection, in that order.

The structure of method_info (the method table):

[Figure: structure of the method table]

The access_flags values of the method table:

[Figure: table of method access_flags values]

Note: Because the volatile and transient modifiers cannot be applied to methods, the method table's access flags have no corresponding bits for them. On the other hand, methods can be modified by keywords such as synchronized, native, and abstract, so the method table has additional flags for those keywords.
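The JDK exposes the sets of source-level modifiers allowed on methods and on fields through java.lang.reflect.Modifier, which gives a compact way to confirm the note above. The wrapper class MethodVsFieldModifiers and its two methods are hypothetical names for this sketch.

```java
import java.lang.reflect.Modifier;

// Hypothetical helper for illustration; it only wraps java.lang.reflect.Modifier.
class MethodVsFieldModifiers {

    // True if the given modifier bit may be applied to a method.
    static boolean allowedOnMethod(int modifier) {
        return (Modifier.methodModifiers() & modifier) != 0;
    }

    // True if the given modifier bit may be applied to a field.
    static boolean allowedOnField(int modifier) {
        return (Modifier.fieldModifiers() & modifier) != 0;
    }

    public static void main(String[] args) {
        System.out.println("volatile on method:     " + allowedOnMethod(Modifier.VOLATILE));
        System.out.println("transient on method:    " + allowedOnMethod(Modifier.TRANSIENT));
        System.out.println("synchronized on method: " + allowedOnMethod(Modifier.SYNCHRONIZED));
        System.out.println("native on method:       " + allowedOnMethod(Modifier.NATIVE));
        System.out.println("synchronized on field:  " + allowedOnField(Modifier.SYNCHRONIZED));
    }
}
```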

2.8 Attribute collection

    u2             attributes_count; // number of attributes in this class's attribute table
    attribute_info attributes[attributes_count]; // attribute table collection

Class files, field tables, and method tables can each carry their own attribute table collection to describe information specific to certain scenarios. Unlike the other data items in a Class file, which have strict requirements on order, length, and content, the attribute table collection is somewhat looser. The attribute tables need not appear in any strict order, and as long as it does not reuse an existing attribute name, any compiler implementation can write attribute information of its own definition into the attribute table. The Java virtual machine ignores attributes it does not recognize at run time.
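Ignoring unknown attributes is possible because, per the Java Virtual Machine specification, every attribute_info starts with a u2 attribute_name_index and a u4 attribute_length, followed by that many bytes of content. A reader can therefore skip any attribute without understanding it. The sketch below illustrates this; AttributeSkipper and its methods are hypothetical names.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.util.Arrays;

// Hypothetical helper for illustration; not a JDK API.
class AttributeSkipper {

    // Reads attributes_count, then skips each attribute body using its declared length.
    // Returns the attribute_name_index of every attribute that was skipped.
    static int[] skipAll(DataInputStream in) throws IOException {
        int count = in.readUnsignedShort();
        int[] nameIndexes = new int[count];
        for (int i = 0; i < count; i++) {
            nameIndexes[i] = in.readUnsignedShort(); // attribute_name_index (u2)
            int length = in.readInt();               // attribute_length (u4)
            if (length < 0) {
                throw new IOException("Attribute too large for this sketch");
            }
            in.readFully(new byte[length]);          // content is not interpreted
        }
        return nameIndexes;
    }

    // Builds two made-up attributes: one with a 2-byte body, one with an empty body.
    static byte[] sampleAttributes() throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(bytes);
        out.writeShort(2);  // attributes_count
        out.writeShort(7);  // name index of the first attribute
        out.writeInt(2);    // its length
        out.writeShort(13); // two bytes of opaque content
        out.writeShort(9);  // name index of the second attribute
        out.writeInt(0);    // empty body
        return bytes.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        DataInputStream in = new DataInputStream(new ByteArrayInputStream(sampleAttributes()));
        System.out.println(Arrays.toString(skipAll(in)));
    }
}
```

A real reader would look up each name index in the constant pool and parse only the attributes whose names it recognizes.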


Origin blog.csdn.net/Xjheroin/article/details/105702301