What is Service Mesh?

1 What is Service Mesh?

Let us first talk about the term Service Mesh, which is still a very, very new term. As the quick show of hands just now suggested, most of you have never heard of it.

This term was coined by Buoyant, the company that developed Linkerd, and was at first used only internally. It was used publicly for the first time on September 29, 2016, and with the promotion of Linkerd in 2017, Service Mesh entered the field of vision of the Chinese technical community. It was first translated into Chinese as "service meshing layer", which was rather confusing; after a few months of use, the translation settled on "service grid". Later I will explain why it is called a grid.

First, look at the definition of Service Mesh, given by William Morgan, CEO of Buoyant. Linkerd is the industry's first Service Mesh, and Buoyant created the term, so this definition can be considered official and authoritative.

Let's walk through this definition piece by piece. First of all, a service mesh is an infrastructure layer whose function is to handle communication between services, and whose responsibility is the reliable delivery of requests. In practice, a service mesh is typically implemented as lightweight network proxies, deployed alongside the application but transparent to it.

If you read this definition cold, it may feel rather abstract and hard to pin down. Let's look at something concrete.

Start with the deployment model of Service Mesh for a single request. For a simple call, the client application instance, as the initiator, first sends the request in the simplest possible way to the local Service Mesh instance. These are two independent processes, and the call between them is a remote call.

The Service Mesh then carries out the complete inter-service invocation process, such as service discovery and load balancing, and finally sends the request to the target service. This deployment form is the Sidecar pattern.
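To make this concrete, here is a minimal sketch of what the client side can look like, assuming a hypothetical sidecar that listens as an HTTP proxy on localhost:15001 and a logical service name user-service (both names are illustrative, not part of any standard):

```java
import java.net.InetSocketAddress;
import java.net.ProxySelector;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class SidecarClientSketch {
    public static void main(String[] args) throws Exception {
        // All outbound requests are handed to the local sidecar process;
        // the application itself does no service discovery or load balancing.
        HttpClient client = HttpClient.newBuilder()
                .proxy(ProxySelector.of(new InetSocketAddress("localhost", 15001)))
                .build();

        // The URI names a logical service, not a physical host; resolving it
        // to a healthy instance is the sidecar's job.
        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create("http://user-service/users/42"))
                .build();

        HttpResponse<String> response =
                client.send(request, HttpResponse.BodyHandlers.ofString());
        System.out.println(response.statusCode());
    }
}
```

The point is not the proxy mechanics (real meshes intercept traffic in different ways, often transparently) but that the application code contains no discovery, load-balancing, or retry logic at all.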

The word "sidecar" originally refers to the passenger car mounted on the side of a motorcycle, which is also where its Chinese translations come from. The sidecar pattern itself has been around for a long time: it inserts a proxy between the original client and server.

In the case of multiple service invocations, the figure shows the Service Mesh sitting underneath all of the services. This layer is called the dedicated infrastructure layer for inter-service communication. The Service Mesh takes over the entire network and forwards all requests between services. Notice that the services themselves are no longer responsible for the logic of delivering requests; they only carry out business processing. The inter-service communication link has been separated out of the application, appearing as an abstraction layer.

Once there are a large number of services, the grid emerges. In the figure, the green squares on the left are applications, the blue squares on the right are Service Mesh proxies, and the lines between them represent the invocation relationships between services. The connections between sidecars form a network, which is the origin of the name service mesh. At this point the proxies are no longer the isolated sidecars of before; together they form a mesh.

Let's revisit the definition given earlier and look back at its four key phrases. First, the service mesh is an abstraction: an infrastructure layer abstracted out of the application. Second, its function is the reliable delivery of requests. Third, it is deployed as lightweight network proxies. The last key phrase: it is transparent to the application.

Please pay attention to this. In the figure above, the network may not be particularly obvious. But if you remove the applications on the left and show only the Service Mesh instances and the calls between them, the relationship becomes especially clear: it is a complete network. This is a crucial point in the definition of Service Mesh, and what distinguishes it from Sidecar: the proxy is no longer treated as an isolated component; what is emphasized is the network formed by connecting these proxies.

With that, the definition of Service Mesh has been covered, and you should now have a rough idea of what a Service Mesh is.

2 Evolution of Service Mesh

The second part traces the evolution of Service Mesh. Note that although the term did not appear until September 2016, what it describes emerged much earlier.

First look at the "Ancient Era", the first generation of network computer systems. At the earliest, developers needed to deal with the details of network communication in their own code, such as data packet sequence, flow control, etc., resulting in a mixture of network logic and business logic. Together, this won't work. Then came the TCP/IP technology, which solved the flow control problem. As you can see from the picture on the right, the function has not changed: all functions are there, and the code still needs to be written. However, the most important thing, process control, has been extracted from the application. Comparing the left and right diagrams, they are extracted as part of the operating system network layer, which is TCP/IP, so the application structure is simple.

When you write an application now, you no longer have to think about how to get data onto the network card; after TCP/IP, that is completely unnecessary. All of this is ancient history, roughly fifty years old.
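As a small reminder of how thoroughly TCP/IP hid those details, here is an ordinary socket write in Java (host and request are illustrative); packet sequencing, acknowledgement, retransmission, and flow control all happen in the kernel:

```java
import java.io.OutputStream;
import java.net.Socket;
import java.nio.charset.StandardCharsets;

public class TcpWriteSketch {
    public static void main(String[] args) throws Exception {
        // Open a TCP connection; the OS performs the handshake.
        try (Socket socket = new Socket("example.com", 80)) {
            OutputStream out = socket.getOutputStream();
            // The application just writes bytes. Sequencing, retransmission
            // and flow control are all done by the kernel's TCP stack.
            out.write("GET / HTTP/1.0\r\nHost: example.com\r\n\r\n"
                    .getBytes(StandardCharsets.US_ASCII));
            out.flush();
        }
    }
}
```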

The era of microservices faces something very similar. When we build microservices, we have to handle a series of basics: service registration, service discovery, load balancing once we have the server instances, circuit breaking and retries to protect the server, and so on. No microservice can escape these functions, so what do you do? You write code: all of these features go into the application. We found that the earliest microservices were just like the early networked systems, with a lot of non-functional code piled into the application. To simplify development, we turned to class libraries, the Netflix OSS suite being the typical example. With those in place, the developer's coding problem is solved: write a small amount of code and all of these capabilities are available. This is why in recent years Spring Cloud has spread so quickly through the Java community that it has almost become synonymous with microservices.
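As an illustration of the class-library approach, here is a minimal Spring Cloud sketch, assuming a Spring Boot application with a Eureka discovery-client starter on the classpath and a registry running (all assumptions for illustration):

```java
import org.springframework.boot.SpringApplication;
import org.springframework.boot.autoconfigure.SpringBootApplication;
import org.springframework.cloud.client.loadbalancer.LoadBalanced;
import org.springframework.context.annotation.Bean;
import org.springframework.web.client.RestTemplate;

@SpringBootApplication
public class OrderServiceApplication {

    public static void main(String[] args) {
        SpringApplication.run(OrderServiceApplication.class, args);
    }

    // @LoadBalanced makes this RestTemplate resolve logical service names
    // through service discovery, with client-side load balancing.
    @Bean
    @LoadBalanced
    RestTemplate restTemplate() {
        return new RestTemplate();
    }

    // Elsewhere in business code, the URL names a service, not a host:
    // restTemplate.getForObject("http://user-service/users/42", String.class);
}
```

A few annotations buy discovery and load balancing, but note the price this section is about to describe: all of it lives inside the application process, within one language's ecosystem.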

At this point, is everything perfect? Of course not; if it were really perfect, I wouldn't be standing here today :)

Let's look at the things we call pain points. The first: there is a lot to learn, and the threshold is high. Ask yourself: if you start learning Spring Cloud, how long will it take you to master it and apply it in a product, solving the problems that arise along the way? Is one week enough? For most people it is not; most need three to six months, because you hit all kinds of issues when you actually put it into production, and it takes a long time if you have to solve them yourself. Here are the common Spring Cloud sub-projects, and only the most common ones are listed; under spring-cloud-netflix alone sits a great deal of the Netflix OSS suite. To truly digest Spring Cloud you need to digest all of it; otherwise you will be very uncomfortable when problems arise.

That is a lot of material. Everyone here presumably has strong learning ability and might get through it in a month, but the question is how long your development team, especially the business development team, will take. That is the scary part: business teams often have many junior colleagues.

And things are not that simple. As the saying goes, it gets worse: there is a lot of reality we have to face.

For example, what is our business development team's greatest strength? Is it technology? No. Generally speaking, a business development team's greatest strength is its understanding of the business and its familiarity with the entire business system.

Second, what is the core value of a business application? We have worked so hard to write all these microservices; is the goal to have microservices? Microservices are just a means. What we ultimately need to deliver is the business; that is the real goal.

Third, within microservices there are challenges harder than learning the framework. When you actually put microservices into production you gain a deeper appreciation of this: how to split services, how to design good APIs that stay stable and extend easily, and, if data consistency across multiple services is involved, most teams get a headache. Finally there is Conway's law; everyone who does services eventually runs into this ultimate problem, and in most cases they want to cry but have no tears.

And it is still not over. More painful than writing a new microservice system is having to retrofit an old system into microservices.

Add all of this together and it is still not the whole picture. One more thing makes it worse: business development teams are usually under heavy business pressure, and there is never enough time or manpower. If the launch is next month, it ships next month; a Double Eleven promotion will not be postponed to Double Twelve. The boss does not care whether you have time to learn Spring Cloud, or whether your business team can handle every aspect of microservices. The business only looks at results.

The second pain point is insufficient functionality. Here is a list of common service governance features. Spring Cloud's governance capabilities are not powerful enough: go down this list item by item, and what Spring Cloud provides out of the box falls far short. Many features have to be built yourself on top of Spring Cloud.

The question is how much time and manpower you plan to invest in this. Some people say: fine, I will simply drop some features, such as gray release, and just go live. But that omission is quite costly.

The third pain point is cross-language support. When microservices first appeared, they came with an attractive promise: each microservice can be written in whichever programming language its team knows best, likes most, and finds most suitable. That promise is only half true; the other half does not hold. Why? Because implementations are usually based on a class library or framework, and once you start coding in a specific language, things no longer line up. On the left I have listed the mainstream programming languages from a language popularity ranking; the first few are familiar to everyone. Dozens more are not shown, and the emerging languages in the middle are somewhat more niche.

The question becomes: for how many languages do we need to provide libraries and frameworks?

This problem is very acute, and there are usually only two ways to solve it:

One is to unify the programming language: the whole company uses a single language.

The other is to write as many copies of the library as there are programming languages.

If any of you work on infrastructure, I am sure you have run into this problem.

But the problem still is not over. Suppose the framework is written, perhaps even with one copy per language. Then comes the fourth pain point: version upgrades.

Your framework cannot be perfect from day one, with every feature complete, no bugs, and no changes ever needed after release. That ideal state does not exist. Inevitably it goes 1.0, 2.0, 3.0, with features added and bugs fixed along the way. But once it has been distributed to users, will they upgrade immediately? In practice, no.

So what happens? Client and server versions drift apart, you have to be extremely careful about maintaining compatibility, and you keep nudging your users: everything on my side is 3.0, please stop using 1.0 and upgrade. And if they do not upgrade, you keep enduring and keep patching up version compatibility.

How complicated does version compatibility get? With hundreds of servers and thousands of clients, each potentially on a different version, it is a Cartesian product. And do not forget the language problem mentioned earlier: you have to multiply by the number of languages as well!

Imagine what happens when the framework's Java 1.0 client calls a Node.js 3.0 server, or when a C++ 2.0 client calls a Golang 1.0 server. Do you run the full set of compatibility tests? How many cases would you need to write? It is practically impossible.
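A quick back-of-the-envelope calculation shows how fast this blows up; the numbers below (5 languages, 3 live versions on each side) are purely illustrative:

```java
public class CompatibilityMatrixSketch {
    public static void main(String[] args) {
        int languages = 5;   // assumed number of supported client/server languages
        int versions = 3;    // assumed framework versions still live in production
        int variants = languages * versions;   // 15 distinct framework builds
        // Every client variant may talk to every server variant: a Cartesian product.
        int pairs = variants * variants;       // 225 client-server combinations
        System.out.println(pairs + " combinations to cover in compatibility tests");
    }
}
```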


So what do we do? How to solve these problems is a real question we always have to face.

Let's think about it:

First, where is the root of these problems? We have done so much painful work and faced so many hard challenges; do any of them have anything to do with the service itself? Take a user service that performs CRUD operations on users: does it have anything to do with the things just listed? No. None of these problems concern the service itself; they concern communication between services. That is the problem we need to solve.

Second, what is our goal? All our previous effort is really about one thing: making sure the business request sent by the client reaches the right place. What is the right place? For example, if versions differ, should the request go to 2.0 or 1.0? What kind of load balancing is needed? Should gray release apply? In the end, all of these considerations exist to route the request to the right place.

Third, the essence of the matter: throughout the entire process, the request itself never changes. The user service mentioned earlier does CRUD on users no matter which path the request takes; the business semantics do not change. That is the essence of the thing, the part that stays constant.

And the problem is highly universal: across all languages, all frameworks, all organizations, these problems are the same for every microservice.

Having said all this, you should have a feeling of déjà vu: which earlier problem does this resemble?

Our predecessors fifty years ago: what was their problem? Why did TCP appear, what problem did it solve, and how?

The problem TCP solves is remarkably similar: deliver the request to the right place. And for any network communication, as long as it uses TCP, the same four points above hold.

What happened once we had TCP? We built our applications on top of it. Does an application need to care how the link layer beneath TCP is implemented? No. Likewise, when we develop applications on HTTP, does the application need to care about the TCP layer?

Then why, when developing microservice applications, do we care about the service communication layer? We learn it, we build it, we do everything at that layer ourselves. Why so much work?

From here another idea naturally arises: since we were able to push the network stack down into TCP, can we also push the microservices stack down?

The ideal state would be a microservices layer added to the network protocol stack itself. But because of standardization issues, that has not happened, and for now it is probably not realistic. Of course, a microservices network layer may well appear in the future.

Some pioneers tried proxy solutions: the familiar nginx, HAProxy, Apache, and so on. Such proxies have little to do with microservices as such, but they provided an idea: insert something between the client and the server to do the work, avoiding direct communication between the two. Of course, a plain proxy's functionality is far too simple. Developers saw at a glance that the idea was good but the features were lacking. So what next?

In this situation the first generation of Sidecars appeared. A Sidecar plays a role similar to a proxy, but with far more complete functionality: essentially, whatever the original microservice framework implemented on the client side, the Sidecar implements correspondingly.

The first-generation sidecars came mainly from the companies listed here, the most famous being Netflix.

One point worth noting here: the first three on the list are all foreign companies, but sidecars were not only a foreigners' game; domestic vendors were doing similar things. Take Vipshop. When I worked in Vipshop's infrastructure department, in the first half of 2015, our OSP service framework went through a major structural adjustment and gained a Sidecar called local proxy. Note the date: the first half of 2015, roughly in step with what was happening abroad. I am sure other Chinese companies had similar products too; they were just not known to the outside world.

Sidecars of this period had limitations. They were designed for a specific infrastructure, usually tied directly to the framework and infrastructure of the company that built them and layered onto the existing system, which imposed many restrictions. The biggest trouble was that they could not be used universally: there was no way to unbundle them for others to use. For example, Airbnb's required ZooKeeper, Netflix's required Eureka, and Vipshop's local proxy was tied to the OSP framework and other internal infrastructure.

The main reason for this binding lies in the motivation behind these sidecars. Netflix built its sidecar so that applications in non-JVM languages could connect to Netflix OSS; SoundCloud built one so that legacy Ruby applications could use the JVM-based infrastructure. For Vipshop's OSP framework, local proxy existed to solve access from non-Java languages, as well as the aforementioned problem of business teams unwilling to upgrade. These problems were troublesome but had to be solved; the sidecar was forced into existence.

Because of this special background and these requirements, the first-generation Sidecar could not be used universally: it was built on top of an existing system. Although it could not be taken out on its own, it worked well inside the original system, so there was no motivation to extract it. As a result, even though many companies had sidecars, they never spread widely; even when one came out, nobody else would use it.

One thing worth mentioning: in mid-2015 we had an idea to separate local proxy from OSP and turn it into a universal Sidecar. The plan was to support HTTP/1.1 and operate only on the HTTP headers, treating the body as opaque, which would have made universality easy to achieve. Unfortunately, for reasons of priority it was never implemented, mainly because there was a great deal of other work such as various business migrations, and it did not seem necessary enough.

It is a pity we did not realize that idea at the time. This was 2015, very early; had we built it then, we might well have produced the industry's first service mesh. Looking back now, it stings a little.

But we were not the only ones with this idea. Others had similar thoughts and, fortunately, the opportunity to build them. That is the first generation of Service Mesh: the universal sidecar.

On the emergence of the general-purpose Service Mesh: Linkerd, on the left, is the industry's first Service Mesh, the project that coined the term. Timeline: version 0.0.7 was released on January 15, 2016, the earliest version visible on GitHub; that is actually very close to the time we had our idea. Version 1.0 followed in April 2017, only half a year ago. So Service Mesh really is a very new term, and it is normal never to have heard of it.

Next is Envoy, whose version 1.0 was released in 2016.

It should be emphasized that both Linkerd and Envoy have joined the CNCF: Linkerd in January of this year, Envoy in September, just one month ago. Everyone here understands the status of CNCF in the Cloud Native field, right? One could say that CNCF's position in Cloud Native resembles the United Nations' position in the postwar international order.

After that came the third Service Mesh, nginmesh, from the familiar nginx, with its first version released in September 2017. Because it is so new and just getting started, there is nothing in particular to introduce.

Now let's look at the differences between Service Mesh and Sidecar. The first two points have already been mentioned:

First, a Service Mesh is no longer viewed as a standalone component; what is emphasized is the network formed by the connections.

Second, a Service Mesh is a universal component.

The third point deserves emphasis: a Sidecar is optional, and direct connection is allowed. Typically, within a development framework, clients written in the framework's native language connect directly, while clients in other languages use the sidecar. For example, with a framework written in Java, Java clients connect directly and PHP clients go through the sidecar. You can also choose to put everything behind the sidecar: Vipshop's OSP, for instance, used local proxy for all languages, yet it was still optional. A Service Mesh, by contrast, demands complete control of traffic: every request must pass through the mesh.

Next, let me introduce Istio. In my view this thing has the bearing of a king. It comes from Google, IBM, and Lyft, and it is the grand synthesis of Service Mesh.

Look at its icon: a sailboat. Istio is Greek, meaning "sail". See the connection between the name and the icon? Google's other phenomenon-level product of the cloud era, Kubernetes, also takes its name from Greek, meaning helmsman or pilot, and its icon is a ship's wheel.

The Istio name and icon are clearly in the same lineage as k8s. Istio released 0.1 in May 2017, and 0.2 on October 4th, just two weeks ago. Anyone familiar with software development knows what stage 0.1/0.2 represents in an iteration: 0.1 is roughly a newborn baby, and 0.2 has not yet been weaned. Yet even at such an early version, my evaluation of it is already "masterpiece" and "kingly bearing". Why?

Why is Istio kingly? Most importantly, it brings unprecedented control to Service Mesh. A Service Mesh deployed in the Sidecar pattern carries all traffic between services; as long as the Service Mesh controls all traffic, it can control every request in the system. On top of this, Istio adds a centralized control plane through which that control is actually exercised.

On the left is the single-request view, with a control plane added above the sidecar to govern it. That picture is not especially striking; look at the one on the right, with a large number of services, where the idea is much clearer. In the entire network, all traffic is under the control of the Service Mesh, and the whole Service Mesh is under the control of the control plane. The entire system can be steered through the control plane. That is the biggest innovation Istio brings.

Istio is developed by three companies, the first two of which are intimidating: Google and IBM, both cloud platform providers; Google's GCP in particular needs no introduction. This is presumably what people mean by a prestigious pedigree.

Istio's strength is remarkable, and I will heap some praise on it here: the design concept is novel and avant-garde, creative, ambitious, and well structured. The strength of the Istio team is also striking; if you have time, look up the list of Istio committee members to get a feel for it. Istio is also Google's new heavyweight product, and it may well become the next phenomenon-level one. What is Google's current phenomenon-level product? K8s. Istio is likely to become the next product at the K8s level.

As for being born at the right time: what era are we in today? An era in which Internet technology is spreading massively, microservices and containers are in full swing, and Cloud Native has arrived. It is also an era in which traditional enterprises are attempting Internet transformation. Business users everywhere want to transform; the trend is obvious, everyone is switching or preparing to switch, yet they are congenitally under-equipped. Under-equipped how? No genes for it, no capability, no experience, no talent, and all the pain points we listed earlier await them. Istio's timing is therefore excellent. And do not forget what stands behind Istio: the CNCF, and k8s, which is on its way to dominating its field.

After Istio was released, the community responded enthusiastically and rallied around it. Envoy, one of the few service meshes on the market, was willing to serve as Istio's underlying layer, while the other two implementations, Linkerd and nginmesh, gave up confrontation outright and chose cooperation, actively integrating with Istio. The big names in the community, like those listed here, responded immediately, either integrating with Istio or building their own products on top of it. Why immediately? The moment Istio released 0.1, they declared their positions and lined up behind it.

Istio's architecture splits into two major parts. The data plane below corresponds to the traditional service mesh; currently it is Envoy, though as mentioned earlier, Linkerd and nginmesh are integrating with Istio, which means standing in for Envoy as the data plane.

The other major part is the control plane above, which is what Istio really brings. It is divided into three components (Pilot, Mixer, and Istio-Auth); in the figure I list their respective responsibilities and the functions they enable.

Because time is limited, I will not expand on this today.

Here is a link for everyone: an online sharing session I gave earlier, a detailed introduction to Istio with plenty of content. Do take a closer look.

A long-form interpretation: Service Mesh, the new generation of microservices

We have also organized a Service Mesh technical community and are translating the Istio documentation:

Chinese translation of Istio official documents

http://istio.doczh.cn/

To sum up, the service mesh arrived step by step: from the original proxy, to the sidecar with its many restrictions, to the general-purpose service mesh, and then to Istio with its enhanced management capabilities, on its way to becoming the next generation of microservices.

Note that it has been only a year since the term service mesh appeared.

3 Why choose Service Mesh?

With Service Mesh, the first three pain points simply stop being problems. What about the upgrade pain point? The Service Mesh is an independent process, so it can be upgraded on its own without touching the application.

With a service mesh, the client accesses services through a plain remote call: as long as it can send a request at all, it just sends it to the local service mesh. The client is radically simplified, and for a typical REST request, virtually every language has solid support. The server only needs to do one thing: service registration. This makes multi-language support very comfortable, and you can finally choose your programming language freely.

There is a small miracle here, having the fish and the bear's paw at once: the entry threshold drops while the functionality grows. Those of you who believe in conservation laws may find this unscientific. The reason both improvements happen at the same time is that the work is pushed down into the service mesh, and the service mesh is universal: built once, reused everywhere.

The changes a service mesh brings to the business development team: it lowers the entry barrier, provides a stable base, and helps the team through its technology transition. The ultimate goal is to free the business development team from the technical details of implementing microservices and return it to the business.

The second change is that it strengthens the operations team. If any of you do operations, think about it carefully: with a service mesh, how much control over the system do you gain? Note that the implementation of many functions no longer lives in the application; it is moving into the service mesh, and the service mesh is usually in the hands of operations.

A service mesh is also a great boon for emerging niche languages. For a new language, what is most painful when competing with established mainstream languages? The ecosystem: all the libraries and frameworks. In the microservices field it is very hard for an emerging niche language to compete with Java and the like; that is pitting your weakness against their strength. With a Service Mesh, niche languages get a chance to sidestep this shortcoming: they no longer have to fight Java over ecosystem, and can instead play to their language strengths and do what they do best.
