Learn Back-End Technology in Five Minutes: The RPC That Every Java Engineer Must Know

Statement

This article is reposted from developer.51cto.com/art/201906/...

What is RPC

RPC (Remote Procedure Call) is a way of requesting a service from a program on a remote computer over the network, without needing to understand the underlying network technology.

RPC is a technical idea rather than a specification or a protocol. Common RPC technologies and frameworks include:

  • Application-level service frameworks: Alibaba's Dubbo / Dubbox, Google's gRPC, Spring Boot / Spring Cloud.
  • Remote communication protocols: RMI, Socket, SOAP (HTTP + XML), REST (HTTP + JSON).
  • Communication frameworks: MINA and Netty.

There are many popular open-source RPC frameworks, including Alibaba's Dubbo, Facebook's Thrift, Google's gRPC, and Twitter's Finagle.

Common RPC frameworks

  • gRPC: open-source software released by Google. It is based on the HTTP/2 protocol and supports many common programming languages. The framework is built on the HTTP protocol, and its underlying layer uses the Netty framework (in the Java implementation).
  • Thrift: Facebook's open-source RPC framework, primarily a cross-language service development framework. Users only need to build their own code on top of it; the underlying RPC communication is transparent to the application. However, users have to learn a domain-specific language (the Thrift IDL), which carries a certain cost.
  • Dubbo: a well-known open-source RPC framework from Alibaba Group, widely used by many Internet companies and in enterprise applications. Its pluggable protocols and serialization frameworks are its most distinctive features.

A complete RPC framework

A typical RPC scenario involves components such as service discovery, load balancing, fault tolerance, network transport, and serialization. Among these, the "RPC protocol" specifies how a program serializes data and transmits it over the network.

Figure 1: A complete RPC architecture

The following is Dubbo's architecture diagram. Its layering is clear, and its functionality is complex:

Figure 2: Dubbo architecture

RPC core functionality

The core functionality of RPC is the most important functional module in an RPC implementation, namely the "RPC protocol" part of the figure above:

Figure 3: RPC core functionality

The core of an RPC consists of five main parts: the client, the client stub, the network transport module, the server stub, and the server.

Figure 4: RPC core functionality diagram

The core components of an RPC framework are:

  • Client: the caller of the service.
  • Client stub: stores the server's address information, packs the client's request parameters into a network message, and sends it to the server over the network.
  • Server stub: receives the request message sent by the client, unpacks it, and then calls the local service to process it.
  • Server: the actual provider of the service.
  • Network service: the underlying transport, which can be TCP or HTTP.

A demo using Python's built-in RPC

Server.py:

from xmlrpc.server import SimpleXMLRPCServer  # Python 3; in Python 2 the module was SimpleXMLRPCServer

def fun_add(a, b):
    total = a + b
    return total

if __name__ == '__main__':
    s = SimpleXMLRPCServer(('0.0.0.0', 8080))  # start the XML-RPC server
    s.register_function(fun_add)  # register the function fun_add
    print("server is online...")
    s.serve_forever()  # enter the loop and wait for requests

Client.py:

from xmlrpc.client import ServerProxy  # Python 3; in Python 2 the module was xmlrpclib

s = ServerProxy("http://172.171.5.205:8080")  # define the XML-RPC client
print(s.fun_add(2, 3))  # call the function on the server

Start the server first, and then run the client; the client prints the result returned by the server.

Wireshark packet capture analysis

The client sends a request to the server:

  • Client IP: 172.171.4.176
  • Server IP: 172.171.5.95

The communication uses the HTTP protocol, and the data is transferred in XML format. The transmitted fields are the method name (methodName) and the two parameters, 2 and 3.

Figure 5: Request capture

The server returns the result in the value field; the result is 5:

Figure 6: Response capture

Because HTTP is used as the transport, a TCP three-way handshake establishes the connection before the HTTP exchange, and a TCP four-way teardown closes it afterwards; the HTTP request and response take place in between.

Figure 7: Connection process of RPC based on the HTTP protocol
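The XML bodies seen in such a capture can be reproduced without a network, using the marshalling helpers in Python's standard xmlrpc.client module. This is only a sketch of the payload format for the fun_add(2, 3) call from the demo; the HTTP headers that surround it on the wire are not shown.

```python
import xmlrpc.client

# Build the XML body a client would POST for fun_add(2, 3).
request_xml = xmlrpc.client.dumps((2, 3), methodname="fun_add")
print(request_xml)

# Build the XML body a server would send back for the result 5.
response_xml = xmlrpc.client.dumps((5,), methodresponse=True)
print(response_xml)

# Parsing reverses the process: loads() returns (params, method_name).
params, method = xmlrpc.client.loads(request_xml)
result, _ = xmlrpc.client.loads(response_xml)
```

The request contains a methodName element and one value per parameter, and the response contains a single value, which matches the fields described for the capture.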

The call procedure in detail

The flow and division of roles in the Python RPC demo above can be represented by the following diagram:

Figure 8: Detailed RPC call flow

An RPC call proceeds as follows:

  • The service consumer (client) calls the service as if it were a local call.
  • The client stub receives the call and serializes (assembles) the method name, parameters, and other information into a message body that can be transmitted over the network.
  • The client stub finds the remote service address and sends the message to the server over the network.
  • The server stub receives the message and decodes (deserializes) it.
  • The server stub calls the relevant local service according to the decoded result.
  • The local service on the server performs the business processing.
  • The processing result is returned to the server stub.
  • The server stub serializes the result.
  • The server stub sends the result to the consumer over the network.
  • The client stub receives the message and decodes (deserializes) it.
  • The service consumer obtains the final result.
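The steps above can be sketched in a few lines of Python. This is a simplified illustration, not how any particular framework does it: JSON is assumed as the message format, and the "network" is replaced by a direct function call so that the roles of the two stubs stay visible. The names client_stub, server_stub, and SERVICES are made up for the example.

```python
import json

def fun_add(a, b):
    """The real service implementation that lives on the server."""
    return a + b

SERVICES = {"fun_add": fun_add}  # what the server stub can dispatch to

def server_stub(request_bytes):
    """Decode the request, call the local service, and encode the result."""
    request = json.loads(request_bytes.decode("utf-8"))
    result = SERVICES[request["method"]](*request["params"])
    return json.dumps({"result": result}).encode("utf-8")

def client_stub(method, *params, transport=server_stub):
    """Encode the call, hand it to the transport, and decode the reply."""
    request_bytes = json.dumps({"method": method, "params": params}).encode("utf-8")
    response_bytes = transport(request_bytes)  # stands in for the network
    return json.loads(response_bytes.decode("utf-8"))["result"]

# The consumer calls as if the function were local and gets the final result.
print(client_stub("fun_add", 2, 3))
```

Swapping the transport argument for a function that sends bytes over a socket would turn this into a real remote call without changing the caller.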

Implementing the RPC core functionality

The RPC core consists of the five modules above. If you want to implement an RPC yourself, the simplest approach needs to address three technical points:

  • Addressing
  • Serialization and deserialization of the data stream
  • Network transmission

Addressing

Addressing is done by Call ID mapping. In a local call, the function body is located directly through a function pointer. In a remote call a function pointer does not work, because the two processes have completely different address spaces.

So in RPC, every function must have its own ID, and this ID must be unique across all processes.

When making a remote procedure call, the client must include this ID. The client and the server each also maintain a table that maps functions to Call IDs.

When the client needs to make a remote call, it looks up the table to find the corresponding Call ID and passes it to the server. The server also looks up its table to determine which function the client wants to call, and then executes the corresponding function code.
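A minimal sketch of the two lookup tables, assuming integer Call IDs and plain dictionaries (the function names and ID values are invented for illustration):

```python
def add(a, b):
    return a + b

def greet(name):
    return "hello, " + name

# Server side: Call ID -> function.
SERVER_TABLE = {1: add, 2: greet}

# Client side: function name -> Call ID (must agree with the server's table).
CLIENT_TABLE = {"add": 1, "greet": 2}

def client_prepare(name, *args):
    """Look up the Call ID for a function name; this pair is what goes on the wire."""
    return CLIENT_TABLE[name], args

def server_dispatch(call_id, args):
    """Look up the function for a Call ID and execute it."""
    return SERVER_TABLE[call_id](*args)

call_id, args = client_prepare("add", 2, 3)
print(call_id, server_dispatch(call_id, args))
```

Only the Call ID and the arguments cross the process boundary; the function object itself never leaves the server.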

Implementation: the service registry.

To call a service, you first need to query a service registry to find out which service instances the other party has. Dubbo's service registry is configurable; ZooKeeper is officially recommended.

Implementation example: RMI (Remote Method Invocation) is itself an implementation of RPC.

Figure 9: RMI architecture

Registry (service discovery): RMI services are published and called by means of JNDI. In effect, JNDI is a registry: the server puts the service object into the registry, and the client obtains the service object from the registry.

After an RMI service is implemented, the server needs to register it with the RMI server. The client then looks up the service at the specified RMI address and calls the methods of the service object to complete the remote method invocation.

The registry is a very important feature. When the server has finished developing a service and wants to expose it, the client cannot call the service if it is not registered, even though the service exists on the server.
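The register-then-lookup idea can be illustrated with a toy in-memory registry. This is not the API of ZooKeeper, JNDI, or Dubbo; the class, the service names, and the address below are all hypothetical.

```python
class ServiceRegistry:
    """A toy in-memory registry: service name -> list of provider addresses."""

    def __init__(self):
        self._providers = {}

    def register(self, service, address):
        self._providers.setdefault(service, []).append(address)

    def lookup(self, service):
        addresses = self._providers.get(service)
        if not addresses:
            raise LookupError("service not registered: " + service)
        return list(addresses)

registry = ServiceRegistry()
registry.register("order-service", "10.0.0.1:20880")  # hypothetical name and address

print(registry.lookup("order-service"))
try:
    registry.lookup("payment-service")  # never registered, so it cannot be called
except LookupError as exc:
    print(exc)
```

The failed lookup mirrors the point in the text: a service that exists but was never registered is invisible to clients.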

Serialization and deserialization

How does the client pass parameter values to a remote function? In a local call, we only need to push the arguments onto the stack and let the function read them from the stack.

In a remote procedure call, however, the client and the server are different processes, so parameters cannot be passed through memory.

The client therefore has to convert the parameters into a byte stream first and send it to the server, which then converts the byte stream back into a format it can read.

  • Converting an object into a binary stream is called serialization.
  • Converting a binary stream into an object is called deserialization.

Together this is called serialization and deserialization. Likewise, the value returned from the server also goes through serialization and deserialization.
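As a small illustration, here is one possible round trip using JSON from the standard library. The message layout (a "method" and a "params" field) is an assumption made for the example, not a standard.

```python
import json

def serialize(method, params):
    """Turn a call into bytes that can be sent over the network."""
    return json.dumps({"method": method, "params": params}).encode("utf-8")

def deserialize(data):
    """Turn received bytes back into a method name and a parameter list."""
    message = json.loads(data.decode("utf-8"))
    return message["method"], message["params"]

wire_bytes = serialize("fun_add", [2, 3])
print(type(wire_bytes).__name__, wire_bytes)
method, params = deserialize(wire_bytes)
print(method, params)
```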

Comparison of the advantages and disadvantages of mainstream serialization protocols

JSON

Advantages:

  • Simple and easy to use, with a low development cost.
  • Cross-language.
  • Lightweight data exchange.
  • Not verbose (simple closing brackets, compared with XML tags).

Disadvantages:

  • Large size, which hurts under high concurrency.
  • No version checking; you have to handle compatibility yourself.
  • Creating and validating fragments is more complicated than with ordinary XML.
  • The lack of namespaces leads to information being mixed together.

Summary: the simplest and most common application protocol. It is widely used and efficient to develop with, but its performance is relatively low and its maintenance cost is relatively high.

protobuf

Protobuf is an efficient and extensible format for encoding structured data.

Advantages:

  • Cross-language, with customizable data structures.
  • Fields are numbered, so newly added fields do not affect the old structure. This solves the backward-compatibility problem.
  • Code is generated automatically, which makes it simple to use.
  • Binary messages, with high efficiency and high performance.
  • Frameworks such as Netty integrate the protocol and provide an encoder and decoder, which improves development efficiency.

Disadvantages:

  • Binary format with poor readability (captured or dumped data is hard to understand).
  • Redundant objects: with many fields, the generated classes are large and take up space.
  • No dynamic features by default (these can be supported by dynamically defining generated message types or by dynamic compilation).

Summary: simple and quick to get started with, efficient, and highly compatible, but with a relatively high maintenance cost.

Thrift (Facebook)

Advantages:

  • A one-stop solution for both serialization and RPC, which is more convenient than protobuf.
  • Cross-language: an IDL (interface definition language) from which files for multiple languages are generated automatically.
  • Saves traffic, thanks to its smaller size.
  • Contains a complete client/server stack, so RPC can be implemented quickly.
  • Provides the server with several working modes, such as a thread-pool model and a non-blocking model.

Disadvantages:

  • Early versions had significant problems; versions before 0.7 have compatibility issues.
  • No support for bidirectional channels.
  • RPC methods are not thread-safe, so the server can easily hang; calls need to be made serially.
  • No dynamic features by default (these can be supported by dynamically defining generated message types or by dynamic compilation).
  • The development environment and compilation are troublesome.

Summary: cross-language and simple to implement, but more troublesome at first use; its limitations and unsuitable scenarios need to be avoided.
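The size and readability trade-off between text and binary formats can be seen with a small experiment. Note the assumption: the standard struct module is used here as a stand-in for a binary encoding. It is not Protobuf or Thrift, and the sample record is made up.

```python
import json
import struct

record = {"id": 1001, "score": 98.5, "active": True}  # hypothetical sample record

json_bytes = json.dumps(record).encode("utf-8")

# Fixed binary layout: unsigned 32-bit int, 64-bit float, bool (network byte order).
LAYOUT = struct.Struct("!Id?")
binary_bytes = LAYOUT.pack(record["id"], record["score"], record["active"])

print(len(json_bytes), "bytes as JSON:", json_bytes)
print(len(binary_bytes), "bytes as binary:", binary_bytes.hex())

# The binary form only makes sense to a reader that knows the layout.
unpacked = LAYOUT.unpack(binary_bytes)
print(unpacked)
```

The JSON form carries the field names in every message and is readable in a packet dump; the binary form is smaller but meaningless without the schema, which is the same trade-off listed above.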

Network transmission

Remote calls usually take place over a network: the client and the server are connected through it.

All data has to be transmitted over the network, so a network transport layer is required. The transport layer sends the Call ID and the serialized parameter bytes to the server, and then sends the serialized call result back to the client.

Anything that can do both can serve as the transport layer. The protocol it uses is therefore not restricted, as long as the transmission gets done.

Most RPC frameworks use TCP, but UDP also works, and gRPC simply uses HTTP/2.

TCP connections are the most common, so here is a brief analysis of TCP-based connections:

A TCP connection can be established on demand (set up when a call is needed and closed as soon as the call ends), or it can be a long-lived connection (the client and server keep the connection open after establishing it, whether or not packets are being sent, and a periodic heartbeat can check that the connection is still alive), in which case multiple remote procedure calls share the same connection.

Therefore, implementing an RPC framework basically comes down to the following three points:

  • Call ID mapping: function name strings can be used directly, or integer IDs. The mapping table is generally a hash table.
  • Serialization and deserialization: you can write your own, or use Protobuf, FlatBuffers, and the like.
  • Network transport library: you can write your own on top of sockets, or use Asio, ZeroMQ, Netty, and the like.
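Putting the three points together, a very small RPC can be sketched with only the standard library. The choices here are assumptions for illustration: an integer Call ID table, JSON for serialization, and a 4-byte length prefix for framing. socket.socketpair() provides an already connected pair of sockets that stands in for a TCP connection between client and server.

```python
import json
import socket
import struct
import threading

def fun_add(a, b):
    return a + b

CALL_TABLE = {1: fun_add}  # Call ID -> function (server side)

def recv_exact(sock, n):
    """Read exactly n bytes, or raise if the peer closed the connection."""
    data = b""
    while len(data) < n:
        chunk = sock.recv(n - len(data))
        if not chunk:
            raise ConnectionError("connection closed")
        data += chunk
    return data

def send_msg(sock, obj):
    """Serialize obj as JSON and send it with a 4-byte length prefix."""
    body = json.dumps(obj).encode("utf-8")
    sock.sendall(struct.pack("!I", len(body)) + body)

def recv_msg(sock):
    """Read one length-prefixed message and deserialize it."""
    (length,) = struct.unpack("!I", recv_exact(sock, 4))
    return json.loads(recv_exact(sock, length).decode("utf-8"))

def serve(sock):
    """Handle calls on one long-lived connection until the client closes it."""
    try:
        while True:
            request = recv_msg(sock)
            result = CALL_TABLE[request["call_id"]](*request["params"])
            send_msg(sock, {"result": result})
    except ConnectionError:
        sock.close()

def remote_call(sock, call_id, *params):
    send_msg(sock, {"call_id": call_id, "params": params})
    return recv_msg(sock)["result"]

client_sock, server_sock = socket.socketpair()  # connected pair, standing in for TCP
server_thread = threading.Thread(target=serve, args=(server_sock,), daemon=True)
server_thread.start()

# Two calls share the same connection, as with a long-lived TCP connection.
results = [remote_call(client_sock, 1, 2, 3), remote_call(client_sock, 1, 10, 20)]
print(results)

client_sock.close()
server_thread.join(timeout=1)
```

The length prefix is needed because a stream socket has no message boundaries; without it the receiver could not tell where one call ends and the next begins.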

The network transport protocol at the core of RPC

As described in the third point above, implementing an RPC requires choosing a network transport.

Figure 10: Network transport

RPC can use a variety of network transports: TCP, UDP, or HTTP.

Each protocol affects overall performance and efficiency differently. How do you choose the right one? First, we need to understand how the various transport protocols work in RPC.

RPC calls based on the TCP protocol

The service caller establishes a Socket connection with the service provider. Through the Socket, the caller serializes the name of the interface to be called, the method name, and the parameters, and sends them to the provider. The provider deserializes them and then uses reflection to call the relevant method.

Finally, the result is returned to the caller. This is roughly how an entire TCP-based RPC call works.

In real applications, however, a series of wrappers is added; RMI, for example, transmits serializable Java objects over TCP.
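The "deserialize, then call by reflection" step on the provider side can be sketched in Python with getattr. The service class, the request fields, and the JSON encoding are assumptions for illustration; the socket transport is left out to keep the focus on the dispatch.

```python
import json

class CalculatorService:
    """Hypothetical service implementation living on the provider side."""

    def add(self, a, b):
        return a + b

SERVICES = {"CalculatorService": CalculatorService()}  # interface name -> instance

def handle_request(raw):
    """Deserialize a request and invoke the named method via reflection."""
    request = json.loads(raw)
    service = SERVICES[request["interface"]]
    method = getattr(service, request["method"])  # look the method up by name
    return json.dumps({"result": method(*request["params"])})

raw_request = json.dumps(
    {"interface": "CalculatorService", "method": "add", "params": [2, 3]}
)
print(handle_request(raw_request))
```

A real provider would restrict which method names may be looked up, since calling arbitrary attributes named by a remote peer is unsafe.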

RPC calls based on the HTTP protocol

This approach is much like accessing a web page, except that the returned result is simpler and more uniform.

The general process is as follows: the service caller sends a request to the service provider. The request may use GET, POST, PUT, DELETE, and so on; the provider may handle the request differently depending on the request method, or allow only certain request methods for a given method.

The specific method to call is determined by the URL. The parameters the method needs may be XML or JSON data sent by the caller, which the provider parses; finally, the result is returned as JSON or XML.

Because there are many open-source web servers, such as Tomcat, this approach is easier to implement; it is just like building a web project.
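A minimal sketch of this style, using Python's built-in http.server instead of Tomcat. The URL /rpc/fun_add, the JSON list of parameters, and the response shape are assumptions chosen for the example; the server runs in a background thread on a port picked by the operating system.

```python
import json
import threading
from http.client import HTTPConnection
from http.server import BaseHTTPRequestHandler, HTTPServer

def fun_add(a, b):
    return a + b

ROUTES = {"/rpc/fun_add": fun_add}  # hypothetical URL -> method mapping

class RpcHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        func = ROUTES.get(self.path)  # the URL decides which method is called
        if func is None:
            self.send_error(404, "unknown method")
            return
        length = int(self.headers.get("Content-Length", 0))
        params = json.loads(self.rfile.read(length))
        body = json.dumps({"result": func(*params)}).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the example output quiet

server = HTTPServer(("127.0.0.1", 0), RpcHandler)  # port 0: let the OS pick a free port
threading.Thread(target=server.serve_forever, daemon=True).start()

def http_call(path, params):
    """POST the parameters as JSON and return (status, raw response body)."""
    conn = HTTPConnection("127.0.0.1", server.server_port, timeout=5)
    conn.request("POST", path, body=json.dumps(params),
                 headers={"Content-Type": "application/json"})
    response = conn.getresponse()
    payload = response.read()
    conn.close()
    return response.status, payload

status, payload = http_call("/rpc/fun_add", [2, 3])
print(status, json.loads(payload))

server.shutdown()
server.server_close()
```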

Comparison of the two approaches

With RPC based on TCP, because TCP sits lower in the protocol stack, the protocol fields can be customized more flexibly, which reduces network overhead, improves performance, and achieves higher throughput and concurrency.

However, it requires attention to many more low-level details, so it is more complex and more costly to implement. In addition, different platforms, such as Android and iOS, each need their own toolkit for sending requests and parsing responses; this is a lot of work, and it is hard to respond quickly to user needs.

RPC based on HTTP can use JSON or XML for the request and response data.

JSON and XML are universal standard formats, and open-source parsing tools for them are quite mature, so building on top of them is convenient and simple. (HTTP also requires serialization and deserialization, but this is not something the protocol concerns itself with; mature web frameworks already take care of it.)

However, because HTTP is an upper-layer protocol, sending the same content over HTTP takes more bytes than sending it over TCP.

Therefore, in the same network and for the same content, transmission over HTTP is less efficient than over TCP and takes longer. Compressing the data can, of course, narrow the gap.
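The byte overhead can be estimated with a rough calculation. The assumptions are a custom TCP protocol that adds only a 4-byte length prefix, and an HTTP request with a deliberately minimal set of headers; the path /rpc and the host example.com are placeholders, and real HTTP clients usually send more headers than this.

```python
import json
import struct

payload = json.dumps({"method": "fun_add", "params": [2, 3]}).encode("utf-8")

# Over raw TCP, a custom protocol might add only a 4-byte length prefix.
tcp_message = struct.pack("!I", len(payload)) + payload

# Over HTTP, the same payload travels behind a request line and headers.
http_headers = (
    "POST /rpc HTTP/1.1\r\n"
    "Host: example.com\r\n"
    "Content-Type: application/json\r\n"
    "Content-Length: {}\r\n"
    "\r\n"
).format(len(payload)).encode("ascii")
http_message = http_headers + payload

print("payload:", len(payload), "bytes")
print("TCP framing overhead:", len(tcp_message) - len(payload), "bytes")
print("HTTP framing overhead:", len(http_message) - len(payload), "bytes")
```

For small payloads like this one the HTTP headers are larger than the payload itself, which is why the gap matters most for frequent, small internal calls.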

Using RabbitMQ in an RPC architecture

OpenStack uses RESTful API calls between services, and RPC calls between the functional modules inside a service.

Because RPC decouples the functional modules within a service, OpenStack services are scalable and loosely coupled.

OpenStack's RPC architecture also adds the RabbitMQ message queue. The purpose is to ensure the safety and stability of RPC message delivery.

The following analyzes how OpenStack uses RabbitMQ to implement RPC calls.

Introduction to RabbitMQ

The following is an excerpt from Zhihu:

For beginners, here is a restaurant analogy to explain what these three things are. It is not one hundred percent accurate, but it should be enough to explain the difference between them.

RPC: suppose you are a waiter in a restaurant and a customer orders from you, but you cannot cook. So you take the order and tell the kitchen what the customer ordered. This is RPC (remote procedure call), because the chef in the kitchen is another person relative to the waiter (in the computer world, a remote machine or another process). The dish the chef makes is the RPC return value.

Task queue and message queue: both are essentially queues, so only the task queue is illustrated here. Suppose the restaurant has many customers at peak time but only a few chefs. The waiters have to put the order slips on the kitchen counter in order, and the chefs cook them one by one. This stack of order slips is the task queue: each time a chef finishes a dish, he takes the next slip from the counter and continues cooking.

The division of roles is shown in the following figure:

Figure 11: The role of RabbitMQ in RPC

The benefits of using RabbitMQ:

  • Turning synchronous into asynchronous: a thread pool can turn synchronous calls into asynchronous ones, but the drawback is that you have to implement the thread pool yourself and it is tightly coupled. With a message queue, a synchronous request can easily become an asynchronous one.
  • Low coupling and high cohesion: decoupling reduces strong dependencies.
  • Peak clipping: a maximum number of requests is set on the message queue, and requests beyond the threshold are discarded or redirected to an error page.
  • Better network communication performance: creating and destroying TCP connections is expensive (a three-way handshake to create, a four-way teardown to destroy). At peak times, thousands of connections cause a huge waste of resources, and the number of TCP connections the operating system can handle per second is limited, which creates a performance bottleneck. RabbitMQ communicates over channels rather than opening a TCP connection for each exchange. There is one channel per thread, and the channels of multiple threads share one TCP connection. A single TCP connection can carry a very large number of channels without becoming a performance bottleneck.
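The task-queue analogy and the peak-clipping benefit can be illustrated with Python's standard queue module. This is an in-process stand-in, not RabbitMQ; the capacity of 3 and the order names are made up.

```python
import queue
import threading

task_queue = queue.Queue(maxsize=3)  # hypothetical capacity limit
results = []

def submit(order):
    """Producer: enqueue without waiting; reject when the queue is full."""
    try:
        task_queue.put_nowait(order)
        return True
    except queue.Full:
        return False

def chef():
    """Consumer: take orders one at a time until the None sentinel arrives."""
    while True:
        order = task_queue.get()
        if order is None:
            break
        results.append("cooked " + order)

# No consumer is running yet, so a burst of 5 orders exceeds the capacity of 3.
accepted = [submit("dish-%d" % i) for i in range(5)]
print(accepted)

worker = threading.Thread(target=chef)
worker.start()
task_queue.put(None)  # blocks until there is room, then tells the chef to stop
worker.join()
print(results)
```

The waiter (producer) returns immediately after handing over the slip, which is the synchronous-to-asynchronous change, and the orders beyond the threshold are rejected instead of overloading the kitchen.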

The three types of RabbitMQ exchange

RabbitMQ uses exchanges and queues to implement the message queue.

There are three commonly used types of exchange in RabbitMQ, and each type has very distinct characteristics.

Based on these three types of exchange, OpenStack implements RPC calls in two ways. First, a brief description of the three exchanges.

Figure 12: RabbitMQ architecture

① Fanout exchange (broadcast)

This type of exchange does not analyze the Routing Key of a received message; it forwards the message to all queues bound to the exchange.

Figure 13: Fanout exchange

② Direct exchange

This type of exchange requires an exact match between the Routing Key and the Binding Key. For example, a message with Routing Key = Cloud is forwarded only to queues bound with Binding Key = Cloud.

Figure 14: Direct exchange

③ Topic exchange

This type of exchange matches the Routing Key of a message against the pattern in the Binding Key and forwards the message to all queues whose binding matches.

The Binding Key supports wildcards: "*" matches exactly one word, and "#" matches zero or more words.

Figure 15: Topic exchange

Note: the four figures above are from cnblogs; in case of infringement, please contact the author: www.cnblogs.com/dwlsxj/p/Ra...

When a producer sends a message with Routing Key = FCE, only Queue1 matches, so the message is routed to Queue1.

If Routing Key = ACE, the message is routed to both Queue1 and Queue2; if Routing Key = AFB, the message is sent only to Queue2.
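The wildcard rules of the topic exchange can be written as a small matching function. This is a sketch of the matching rule only, assuming dot-separated words; the queue names and binding keys below are hypothetical and are not the ones in the figure.

```python
def topic_matches(binding_key, routing_key):
    """Return True if a dot-separated routing key matches a binding pattern."""
    def match(pattern, words):
        if not pattern:
            return not words
        head, rest = pattern[0], pattern[1:]
        if head == "#":
            # "#" may absorb zero or more words: try every possible split.
            return any(match(rest, words[i:]) for i in range(len(words) + 1))
        if not words:
            return False
        if head == "*" or head == words[0]:
            return match(rest, words[1:])
        return False

    return match(binding_key.split("."), routing_key.split("."))

def route(bindings, routing_key):
    """Return the queues whose binding key matches the routing key."""
    return [q for q, key in bindings.items() if topic_matches(key, routing_key)]

# Hypothetical bindings, chosen only for illustration.
bindings = {"queue_a": "*.cloud.*", "queue_b": "log.#"}
print(route(bindings, "eu.cloud.vm"))
print(route(bindings, "log"))
print(route(bindings, "log.cloud.error"))
```

A direct exchange is the special case in which the binding key contains no wildcards, and a fanout exchange skips the matching altogether.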

Nova implements two kinds of RabbitMQ-based RPC call:

  • RPC.CALL (call)
  • RPC.CAST (notification)

RPC.CALL is based on request and response, while RPC.CAST provides only a one-way request. Each of the two has its typical usage scenarios in Nova.

RPC.CALL

RPC.CALL is a two-way communication flow: RabbitMQ receives the request message generated by the message producer, and after the message consumer has processed it, the result is fed back to the calling program.

Figure 16: RPC.CALL schematic

A user creates a virtual machine from the Dashboard; the request is encapsulated as a message and passed to Nova-API through its interface.

Nova-API, acting as the message producer, forwards the message to the message queue via the Topic exchange in RPC.CALL mode.

Nova-Compute, acting as the message consumer, receives the message and starts the virtual machine through the underlying virtualization software.

After the virtual machine has started successfully, Nova-Compute acts as the message producer and sends a success response back to Nova-API through the Direct exchange and the response message queue.

Nova-API then receives the message as the message consumer and notifies the user that the virtual machine has started successfully.

The following figure shows how RPC.CALL works:

Figure 17: RPC.CALL implementation in detail

Work process:

  • When creating a message, the client specifies the name of the reply queue in reply_to and marks the caller with correlation_id.
  • The server receives the message through the queue, calls the processing function, and returns the result.
  • The result is returned to the queue specified by reply_to, carrying the correlation_id.
  • When the return message reaches the client, the client uses the correlation_id to determine which call the result belongs to.
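The work process above can be simulated in-process. This sketch uses Python's queue.Queue objects in place of RabbitMQ queues and plain dictionaries in place of AMQP messages; the reply queue name and the "sum the numbers" handler are invented for the example.

```python
import queue
import threading
import uuid

request_queue = queue.Queue()                     # the server's request queue
reply_queues = {"reply.client-1": queue.Queue()}  # hypothetical reply queue name

def server_loop():
    """Consume one request, process it, and publish the reply to reply_to."""
    message = request_queue.get()
    result = sum(message["body"])
    reply_queues[message["reply_to"]].put(
        {"correlation_id": message["correlation_id"], "body": result}
    )

def rpc_call(body):
    """Send a request carrying reply_to and correlation_id, then wait for the reply."""
    correlation_id = str(uuid.uuid4())
    request_queue.put(
        {"reply_to": "reply.client-1", "correlation_id": correlation_id, "body": body}
    )
    reply = reply_queues["reply.client-1"].get(timeout=5)
    if reply["correlation_id"] != correlation_id:
        raise RuntimeError("reply does not belong to this call")
    return reply["body"]

threading.Thread(target=server_loop, daemon=True).start()
call_result = rpc_call([2, 3])
print(call_result)
```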

If multiple threads make remote method calls at the same time, many messages travel in both directions over the Socket connection established between client and server, and their order may be arbitrary.

After the server finishes processing, it sends the result message to the client. The client receives many messages, so how does it know which message is the result of which thread's call?

Before calling the remote interface through the Socket, each client thread generates a unique ID, the Request ID (it must be unique within the Socket connection). An AtomicLong counting up from 0 is commonly used to generate the unique ID.
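One way to match responses to callers is a table of pending requests keyed by Request ID. The sketch below assumes Python: itertools.count guarded by a lock plays the role of AtomicLong, and concurrent.futures.Future is used as the per-request placeholder for the result. The function names are illustrative.

```python
import itertools
import threading
from concurrent.futures import Future

_request_ids = itertools.count(0)  # plays the role of AtomicLong, starting at 0
_id_lock = threading.Lock()
pending = {}                       # Request ID -> Future awaiting the response

def new_request():
    """Allocate a unique Request ID and register a Future for its response."""
    with _id_lock:
        request_id = next(_request_ids)
    future = Future()
    pending[request_id] = future
    return request_id, future

def on_response(request_id, result):
    """Called by the receiving side: hand the result to whoever issued the request."""
    pending.pop(request_id).set_result(result)

id_a, future_a = new_request()
id_b, future_b = new_request()

# Responses arrive out of order, yet each lands in the right Future.
on_response(id_b, "result for B")
on_response(id_a, "result for A")
print(id_a, future_a.result(), "|", id_b, future_b.result())
```

Each calling thread would block on its own future.result() while a single reader thread calls on_response for every message that arrives on the shared connection.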

RPC.CAST

An RPC.CAST remote call is similar to RPC.CALL, but without the response step.

A Topic message producer sends the request message to the Topic exchange, and the Topic exchange forwards the message to the shared message queue according to the message's Routing Key.

All Topic consumers bound to the shared message queue receive the request message and pass it to the server for processing.

The call flow is shown in the figure:

Figure 18: RPC.CAST schematic

Connection design

The general design for RPC over RabbitMQ is that consumers use long-lived connections and senders use short-lived connections, although whether a connection is long or short can be controlled freely.

Consumers generally use long-lived connections so that they are always ready to receive and process messages; and because auto-deletion of RabbitMQ queues and exchanges is involved, short connections are not needed unless there is a special requirement. Senders can use short connections so that they do not occupy port numbers for long periods, which saves port resources.

The RPC code design in Nova:

A simple comparison of RESTful API and RPC

RESTful API architecture

The main features of REST are resources, a uniform interface, URIs, and statelessness.

① Resources

A "resource" is an entity on the network, or a specific piece of information on the network. It can be a piece of text, a picture, a song, or a service; in short, it is something concrete.

② Uniform interface

The RESTful architectural style prescribes that operations on data, i.e. CRUD (Create, Read, Update, and Delete), correspond to HTTP methods: GET retrieves a resource, POST creates a resource (and may also be used to update it), PUT updates a resource, and DELETE deletes a resource. This unifies the interface for data operations: all creating, reading, updating, and deleting of data can be done through HTTP methods alone.

③ URI

Each resource can be pointed to by a URI (Uniform Resource Identifier); that is, each URI corresponds to a particular resource.

To obtain the resource, you access its URI, so the URI becomes the address or identifier of each resource.

④ Stateless

Stateless means that every resource can be located through its URI, and that locating it is independent of other resources and is not affected by changes to them. A simple example illustrates the difference between stateful and stateless.

Take querying an employee's salary. If you have to log in to the system, go to the salary query page, and carry out the relevant steps before obtaining the salary, the case is stateful.

This is because each step of the query depends on the previous one; if a preceding step fails, the subsequent steps cannot be carried out.

If entering a URI directly returns the salary of the specified employee, the case is stateless, because obtaining the salary does not depend on other resources or state.

In this case the salary is a resource with a one-to-one corresponding URI, and it can be obtained with the HTTP GET method. This is typical RESTful style.
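The mapping from HTTP methods to CRUD operations on a resource can be sketched as a single dispatch function. Everything specific here is hypothetical: the URI layout /employees/<id>/salary, the in-memory store, and the choice of status codes are illustrative, and no real HTTP server is involved.

```python
salaries = {"1001": 5000}  # hypothetical in-memory store: employee ID -> salary

def handle(method, uri, body=None):
    """Map an HTTP method on /employees/<id>/salary to a CRUD operation."""
    parts = uri.strip("/").split("/")
    if len(parts) != 3 or parts[0] != "employees" or parts[2] != "salary":
        return 404, None
    employee_id = parts[1]
    if method == "POST":        # create
        salaries[employee_id] = body
        return 201, body
    if employee_id not in salaries:
        return 404, None
    if method == "GET":         # read
        return 200, salaries[employee_id]
    if method == "PUT":         # update
        salaries[employee_id] = body
        return 200, body
    if method == "DELETE":      # delete
        del salaries[employee_id]
        return 204, None
    return 405, None

print(handle("GET", "/employees/1001/salary"))
print(handle("POST", "/employees/1002/salary", 6000))
print(handle("PUT", "/employees/1002/salary", 6500))
print(handle("DELETE", "/employees/1002/salary"))
print(handle("GET", "/employees/1002/salary"))
```

Each request names the resource completely in its URI and does not depend on an earlier request having been made, which is the stateless property described above.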

Comparison of RPC and RESTful API

Different orientation:

  • RPC focuses on actions.
  • REST focuses on resources.

RESTful architecture is resource-oriented by design, but many things in a system cannot easily be abstracted as resources, such as logging in or changing a password, whereas RPC can express them as actions. RPC therefore covers a wider range of operations than RESTful.

Transmission efficiency:

  • RPC is more efficient. With RPC, a custom protocol over TCP makes request packets smaller, and using HTTP/2 can also reduce packet size and improve transmission efficiency.

Complexity:

  • RPC is complex to implement, and the process is tedious.
  • REST is very convenient to call and test.

Implementing RPC (see the first section) requires implementing encoding, serialization, network transmission, and so on. RESTful does not need to deal with these, so it is easier to implement.

Flexibility:

  • HTTP is more standardized and more general; every language supports the HTTP protocol.
  • RPC can achieve cross-language calls, but overall it is not as flexible as RESTful.

Summary

RPC is mainly used for service calls inside a company: low performance overhead and high transmission efficiency, but complex to implement.

HTTP is mainly used for external, heterogeneous environments: browser calls, app calls, third-party interface calls, and so on.

Scenarios where RPC fits (large sites with many internal subsystems and a great many interfaces):

  • Long-lived connections. Unlike HTTP, not every communication requires a three-way handshake, which reduces network overhead.
  • Registration and publishing mechanism. RPC frameworks generally have a registry with rich monitoring and management; publishing interfaces, taking them offline, and dynamic scaling are transparent to the caller and managed uniformly.
  • Security: resource operations are not exposed.
  • Microservice support. Service-oriented architecture and service governance have become popular recently, and RPC frameworks provide strong support for them.

Reference article

developer.51cto.com/art/201906/…

www.mamicode.com/info-detail…

Origin juejin.im/post/5e81fb5ef265da47b554dc7e