Why Kryo is faster than Hessian

Serialization Talk

Dubbo RPC is the high-performance, high-throughput remote call mechanism at the core of the dubbo system. I like to describe it as multiplexed calls over long-lived TCP connections. Put simply:

Long-lived connection: avoids creating a new TCP connection for every call, which improves the response time of each call
Multiplexing: a single TCP connection can carry many interleaved request and response messages, which reduces the time the connection sits idle. For the same level of concurrency, this means fewer network connections and higher system throughput.
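
The multiplexing idea can be sketched roughly as follows. This is only an illustration, not dubbo's actual implementation: the class MultiplexedChannel and its methods are hypothetical names, and the network is replaced by direct method calls. The point is that every request carries an ID, so responses may come back in any order on the shared connection and are still matched to the right caller.

```java
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical sketch: matches responses to requests on one shared connection.
final class MultiplexedChannel {

    // One in-flight call: its ID, its payload, and a future for the response.
    static final class Call {
        final long id;
        final String payload;
        final CompletableFuture<String> response = new CompletableFuture<>();

        Call(long id, String payload) {
            this.id = id;
            this.payload = payload;
        }
    }

    private final AtomicLong nextId = new AtomicLong();
    private final Map<Long, Call> pending = new ConcurrentHashMap<>();

    // Records the call as pending. A real client would also write
    // (call.id, call.payload) to the shared TCP connection at this point.
    Call send(String payload) {
        Call call = new Call(nextId.incrementAndGet(), payload);
        pending.put(call.id, call);
        return call;
    }

    // Invoked by the reading side whenever a response frame arrives.
    void onResponse(long id, String body) {
        Call call = pending.remove(id);
        if (call != null) {
            call.response.complete(body);
        }
    }

    // Number of requests still waiting for a response.
    int inFlight() {
        return pending.size();
    }
}
```

A caller would invoke send(...) and wait on call.response, while a single reader thread feeds incoming frames to onResponse(...). Several callers can therefore share one connection without waiting for each other's responses.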

Dubbo RPC is mainly used for remote calls between two dubbo systems, and it is especially well suited to Internet scenarios with high concurrency and small payloads.

Serialization also plays a crucial role in the response speed, throughput, and network bandwidth consumption of remote calls, and is one of the most critical factors for us to improve the performance of distributed systems.

In dubbo RPC, multiple serialization methods are supported at the same time, such as:

dubbo serialization: an efficient Java serialization implementation developed by Alibaba that is not yet mature; Alibaba does not recommend using it in production
hessian2 serialization: hessian is an efficient cross-language binary serialization method. What dubbo uses, however, is not the native hessian2 serialization but hessian lite as modified by Alibaba; it is the serialization method enabled by default in dubbo RPC
JSON serialization: there are currently two implementations, one using Alibaba's fastjson library and the other using a simple JSON library implemented in dubbo itself. Neither is particularly mature, and as a text format JSON generally performs worse than the two binary serializations above.
Java serialization: implemented mainly with the serialization built into the JDK; its performance is not ideal.

In general, the performance of these four main serialization methods decreases from top to bottom. For a high-performance remote call mechanism such as dubbo RPC, only the first two, the efficient binary ones, are really suitable, and since dubbo serialization is still immature, only hessian2 is actually usable. That is why dubbo RPC uses hessian2 serialization by default.

But hessian is a relatively old serialization implementation, and because it is cross-language it is not optimized specifically for Java. Dubbo RPC calls are in practice Java-to-Java, so cross-language serialization is not really needed (although it is certainly not ruled out).

In recent years, various new efficient serialization methods have emerged one after another, constantly refreshing the upper limit of serialization performance. The most typical ones include:

Specifically for the Java language: Kryo, FST, etc.
Cross-language: Protostuff, ProtoBuf, Thrift, Avro, MsgPack and more

Most of these serialization methods perform significantly better than hessian2 (and even better than the still-immature dubbo serialization).

In view of this, we introduce two efficient Java serialization implementations, Kryo and FST, for dubbo to gradually replace hessian2.

Of the two, Kryo is a very mature serialization implementation that is widely used at Twitter, Groupon, and Yahoo and in many well-known open source projects (such as Hive and Storm). FST is a newer implementation that still lacks enough mature use cases, but I think it is very promising.

For production-oriented applications, I would recommend Kryo as the preferred option for now.

Enable Kryo and FST

Using Kryo or FST is as simple as adding one attribute to the dubbo RPC XML configuration:

<dubbo:protocol name="dubbo" serialization="kryo"/>

<dubbo:protocol name="dubbo" serialization="fst"/>

Register the serialized class

To get the full performance of Kryo and FST, it is best to register the classes that need to be serialized with the dubbo system. For example, we can implement the following callback interface:

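import com.alibaba.dubbo.common.serialize.support.SerializationOptimizer;
import java.util.Collection;
import java.util.LinkedList;
import java.util.List;

public class SerializationOptimizerImpl implements SerializationOptimizer {
    // Returns the application classes that dubbo should register with Kryo/FST.
    public Collection<Class> getSerializableClasses() {
        List<Class> classes = new LinkedList<Class>();
        classes.add(BidRequest.class);
        classes.add(BidResponse.class);
        classes.add(Device.class);
        classes.add(Geo.class);
        classes.add(Impression.class);
        classes.add(SeatBid.class);
        return classes;
    }
}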

Then in the XML configuration add:

<dubbo:protocol name="dubbo" serialization="kryo" optimizer="com.alibaba.dubbo.demo.SerializationOptimizerImpl"/>

After registering these classes, serialization performance can improve considerably, especially when only a small number of nested objects is being serialized.
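
One reason registration helps is that a registered class can be identified on the wire by a small numeric ID instead of its full class name. The sketch below illustrates only that idea. It is not the actual wire format of Kryo or FST, and ClassTagWriter is a hypothetical name introduced for this example.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: how many bytes are needed to say which class an object has.
final class ClassTagWriter {
    private final Map<Class<?>, Integer> ids = new HashMap<>();

    // Assigns the next free ID to the class unless it is already registered.
    void register(Class<?> type) {
        ids.putIfAbsent(type, ids.size());
    }

    // Returns the bytes that identify the class of one serialized object.
    byte[] writeTag(Class<?> type) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(bytes);
            Integer id = ids.get(type);
            if (id != null) {
                out.writeBoolean(true);        // marker: registered class
                out.writeShort(id);            // compact numeric ID
            } else {
                out.writeBoolean(false);       // marker: unregistered class
                out.writeUTF(type.getName());  // full class name
            }
            out.flush();
            return bytes.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

In this toy format a registered class costs three bytes per object, while an unregistered one costs its whole name. For small objects that overhead can easily outweigh the field data itself, which is consistent with the gain being most visible on small payloads.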

Of course, serializing a class may also pull in references to many other classes, such as the Java collection classes. To cover this case, the common JDK classes are already registered automatically, so you do not need to register them again (registering them again does no harm). They include:

GregorianCalendar
InvocationHandler
BigDecimal
BigInteger
Pattern
BitSet
URI
UUID
HashMap
ArrayList
LinkedList
HashSet
TreeSet
Hashtable
Date
Calendar
ConcurrentHashMap
SimpleDateFormat
Vector
StringBuffer
StringBuilder
Object
Object[]
String[]
byte[]
char[]
int[]
float[]
double[]

Since registering the serialized classes is only for performance optimization purposes, it doesn't matter if you forget to register some classes. In fact, Kryo and FST generally outperform hessian and dubbo serialization even without registering any classes.

Of course, one might ask why these classes are not registered through a configuration file. There are three reasons. There are often many classes to register, which makes the configuration file long. Without good IDE support, configuration files are much more troublesome to write and refactor than Java classes. Finally, the registered classes generally do not need to be changed dynamically after the project has been compiled and packaged.

In addition, some people find manually registering serialized classes tedious and ask whether the classes could be marked with an annotation so that the system discovers and registers them automatically. The limitation of annotations here is that they can only mark classes you can modify, while many classes referenced during serialization are likely to be ones you cannot modify (third-party libraries, JDK system classes, or classes from other projects). Annotations also slightly "pollute" the code, making the application code a little more dependent on the framework.

Besides annotations, we can also consider other ways to register serialized classes automatically, such as scanning the classpath, discovering the classes that implement the Serializable interface (or even Externalizable), and registering them. Because the classpath may contain a great many Serializable classes, the scan can be limited to a certain scope with a package prefix.
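
The filtering step of such a scan might look roughly like this. The sketch assumes the candidate classes have already been collected by some classpath scanner, which is not shown, and SerializableClassFilter is a hypothetical helper, not part of dubbo.

```java
import java.io.Serializable;
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;

// Hypothetical helper: keeps Serializable classes under a given package prefix.
final class SerializableClassFilter {

    static List<Class<?>> select(Collection<Class<?>> candidates, String packagePrefix) {
        List<Class<?>> selected = new ArrayList<>();
        for (Class<?> candidate : candidates) {
            // Externalizable extends Serializable, so this test covers both.
            boolean serializable = Serializable.class.isAssignableFrom(candidate);
            if (serializable && candidate.getName().startsWith(packagePrefix)) {
                selected.add(candidate);
            }
        }
        return selected;
    }
}
```

A prefix such as "com.example.api." (a made-up package) would restrict registration to the application's own transfer objects and leave the rest of the classpath alone.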

Of course, any automatic registration mechanism must take particular care that the service provider and the consumer register classes in the same order (or with the same IDs) to avoid misalignment, since the set of classes discovered and registered at the two ends may differ.
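
One possible way to make the IDs independent of discovery order is to sort the class names before assigning IDs, and to compare a fingerprint of the result between the two ends. This is a sketch of that idea only. DeterministicRegistration, the starting ID, and the fingerprint exchange are all assumptions, not something dubbo provides.

```java
import java.util.Collection;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.TreeSet;

// Hypothetical sketch: the same set of classes always yields the same IDs.
final class DeterministicRegistration {

    // Assigns consecutive IDs in sorted class-name order, starting at firstId.
    static Map<String, Integer> assignIds(Collection<Class<?>> discovered, int firstId) {
        TreeSet<String> names = new TreeSet<>();
        for (Class<?> type : discovered) {
            names.add(type.getName());
        }
        Map<String, Integer> ids = new LinkedHashMap<>();
        int next = firstId;
        for (String name : names) {
            ids.put(name, next++);
        }
        return ids;
    }

    // A value both ends could compare at connect time. Equal class sets always
    // give equal fingerprints; a mismatch means the registrations differ.
    static int fingerprint(Map<String, Integer> ids) {
        return ids.hashCode();
    }
}
```

Sorting only removes the dependence on discovery order. If one end discovers a class the other end lacks, the IDs still shift, so a mismatching fingerprint should be treated as a reason to fall back to unregistered (name-based) serialization or to fail fast.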

No-argument constructor and Serializable interface

If a serialized class does not have a no-argument constructor, the performance of Kryo serialization drops sharply, because in that case we transparently fall back from Kryo serialization to Java serialization under the hood. It is therefore a best practice to give every serialized class a no-argument constructor wherever possible (a Java class that declares no constructor of its own gets a default no-argument constructor anyway).
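
A quick way to spot classes that would trigger this fallback is a reflective check such as the following. ConstructorCheck is a hypothetical helper written for this article. It accepts a no-argument constructor of any visibility, and whether a non-public one is enough for dubbo's Kryo integration is an assumption to verify against your version.

```java
// Hypothetical helper: reports whether a class declares a no-argument constructor.
final class ConstructorCheck {

    static boolean hasNoArgConstructor(Class<?> type) {
        try {
            type.getDeclaredConstructor(); // looks up the constructor with no parameters
            return true;
        } catch (NoSuchMethodException e) {
            return false;
        }
    }
}
```

For example, java.util.ArrayList passes the check, while java.math.BigDecimal does not because all of its constructors take arguments. Running such a check over the registered classes in a unit test is one way to catch a missing constructor before it costs performance in production.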

In addition, Kryo and FST do not require serialized classes to implement the Serializable interface, but we still recommend that every serialized class implement it, because this maintains compatibility with Java serialization and dubbo serialization. It also makes it possible to adopt some of the automatic registration mechanisms described above in the future.

Serialization performance analysis and testing

This article is mainly about serialization, but for performance analysis and testing we do not measure each serialization method in isolation. Instead we compare them inside dubbo RPC, because that is closer to real use.

Test environment

Roughly as follows:

Two independent servers
4-core Intel(R) Xeon(R) CPU E5-2603 0 @ 1.80GHz
8 GB memory
The network between the virtual machines goes through a 100 Mbps switch
CentOS 5
JDK 7
Tomcat 7
JVM parameters -server -Xms1g -Xmx1g -XX:PermSize=64M -XX:+UseConcMarkSweepGC

Of course, this test environment is fairly limited, so the current results may not be fully authoritative or representative.

Test script

Staying close to dubbo's own benchmarks:

10 concurrent clients keep making requests:

Pass in a nested complex object (with only a small amount of data per call), do nothing, and return it unchanged
Pass in a 50K string, do nothing, and return it unchanged (TODO: results not yet listed)

Run the performance test for 5 minutes. (To quote the reasoning behind dubbo's own tests: "The test mainly examines the performance of serialization and network IO, so the server side has no business logic. A concurrency of 10 was chosen because, under high concurrency, the high CPU usage of the http protocol could become the bottleneck.")
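
The shape of such a test can be sketched as below. This is not the script that produced the results in this article. EchoBenchmark is a hypothetical harness, and the echo function stands in for a remote dubbo call that returns its argument unchanged.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.UnaryOperator;

// Hypothetical harness: N clients call an echo service in a loop for a fixed time.
final class EchoBenchmark {

    static final class Result {
        final long calls;
        final double avgResponseMillis;
        final double tps;

        Result(long calls, double avgResponseMillis, double tps) {
            this.calls = calls;
            this.avgResponseMillis = avgResponseMillis;
            this.tps = tps;
        }
    }

    static Result run(int clients, long durationMillis, UnaryOperator<Object> echo, Object payload) {
        AtomicLong calls = new AtomicLong();
        AtomicLong totalNanos = new AtomicLong();
        ExecutorService pool = Executors.newFixedThreadPool(clients);
        long start = System.nanoTime();
        long deadline = start + TimeUnit.MILLISECONDS.toNanos(durationMillis);
        for (int i = 0; i < clients; i++) {
            pool.execute(() -> {
                // Each client issues calls back to back until the deadline passes.
                while (System.nanoTime() - deadline < 0) {
                    long begin = System.nanoTime();
                    echo.apply(payload);
                    totalNanos.addAndGet(System.nanoTime() - begin);
                    calls.incrementAndGet();
                }
            });
        }
        pool.shutdown();
        try {
            pool.awaitTermination(durationMillis + 5_000, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        double elapsedSeconds = (System.nanoTime() - start) / 1e9;
        long completed = calls.get();
        double avgMillis = completed == 0 ? 0 : totalNanos.get() / 1e6 / completed;
        return new Result(completed, avgMillis, completed / elapsedSeconds);
    }
}
```

With 10 clients and a duration of 300,000 ms this mirrors the setup described above. In a real run the echo argument would be a call on a dubbo service proxy, and a warm-up phase before measuring would be advisable.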

Comparison of byte sizes generated by different serialization in Dubbo RPC

The number of bytes produced by serialization is a relatively deterministic metric, and it determines the network transmission time and bandwidth usage of remote calls.

The results for complex objects are as follows (smaller values are better):

Serialization implementation	Request bytes	Response bytes
Kryo	272	90
FST	288	96
Dubbo Serialization	430	186
Hessian	546	329
FastJson	461	218
Json	657	409
Java Serialization	963	630
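
To get a feel for where such differences come from, the following sketch measures the size of one small object under JDK serialization and under a hand-written field-by-field encoding. It uses only the JDK, so it does not reproduce the Kryo or FST numbers above. The Point class is a made-up payload, not one of the classes used in the benchmark.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

// Hypothetical sketch: compares serialized sizes of the same object.
final class SizeComparison {

    // Made-up payload used only for this measurement.
    static final class Point implements Serializable {
        private static final long serialVersionUID = 1L;
        final int x;
        final int y;
        final String label;

        Point(int x, int y, String label) {
            this.x = x;
            this.y = y;
            this.label = label;
        }
    }

    // Size under built-in JDK serialization, which also writes class metadata.
    static int javaSerializedSize(Serializable value) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(value);
            }
            return bytes.size();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Size when only the field values are written, with no class metadata.
    static int handWrittenSize(Point point) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (DataOutputStream out = new DataOutputStream(bytes)) {
                out.writeInt(point.x);
                out.writeInt(point.y);
                out.writeUTF(point.label);
            }
            return bytes.size();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

JDK serialization writes the class name and field descriptors along with the values, so its output is several times larger for an object this small. Compact binary serializers narrow that gap by keeping metadata to a minimum.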

Comparison of response time and throughput for different serializations in Dubbo RPC

Remote call method	Average response time	Average TPS (transactions per second)
REST: Jetty + JSON	7.806	1280
REST: Jetty + JSON + GZIP	TODO	TODO
REST: Jetty + XML	TODO	TODO
REST: Jetty + XML + GZIP	TODO	TODO
REST: Tomcat + JSON	2.082	4796
REST: Netty + JSON	2.182	4576
Dubbo: FST	1.211	8244
Dubbo: kryo	1.182	8444
Dubbo: dubbo serialization	1.43	6982
Dubbo: hessian2	1.49	6701
Dubbo: fastjson	1.572	6352
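
The two columns can be cross-checked against each other. With a fixed number of clients that each wait for a response before sending the next request, throughput is roughly the number of clients divided by the average response time. The source does not state the unit of the response time. The sketch below assumes milliseconds, and under that assumption the figures agree closely. ThroughputCheck is a hypothetical helper.

```java
// Hypothetical helper: cross-checks response time against throughput.
final class ThroughputCheck {

    // Closed-loop estimate: each client completes 1000 / avgResponseMillis calls per second.
    static double estimatedTps(int concurrentClients, double avgResponseMillis) {
        return concurrentClients * 1000.0 / avgResponseMillis;
    }

    // Relative difference between the estimate and the measured value.
    static double relativeError(double estimate, double measured) {
        return Math.abs(estimate - measured) / measured;
    }
}
```

For the kryo row, 10 clients at 1.182 ms give an estimate of about 8460 TPS against the reported 8444. For the Jetty + JSON row, 10 clients at 7.806 ms give about 1281 against the reported 1280.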

Test summary

As far as the current results go, Kryo and FST improve very significantly on the original serialization methods in Dubbo RPC in all three respects: generated byte size, average response time, and average TPS.

Future

In the future, once Kryo or FST is mature enough within dubbo, we are likely to change the default serialization of dubbo RPC from hessian2 to one of them.

If you repost this article, please credit the source: Xueshi.com » Efficient Java Serialization (Kryo and FST)