Service Discovery in Microservices


Why use service discovery

Suppose you are writing code that invokes a service that has a REST API or a Thrift API. To send a request, your code needs to know the network location (IP address and port) of a service instance. In a traditional application running on physical hardware, the network locations of service instances are relatively static. For example, your code can read them from a configuration file that is updated occasionally.

However, in modern cloud-based microservice applications, this is a more difficult problem to solve, as shown in Figure 4-1.

Service instances have dynamically assigned network locations. In addition, the set of service instances changes dynamically because of autoscaling, failures, and upgrades. Therefore, your client code needs to use a more elaborate service discovery mechanism.

Figure 4-1. The client or API gateway needs help finding services

There are two main service discovery patterns: client-side discovery and server-side discovery. Let's look at client-side discovery first.

Client-side discovery pattern

When using the client-side discovery pattern, the client is responsible for determining the network locations of available service instances and for load balancing requests across them. The client queries the service registry, which is a database of available service instances. The client then uses a load-balancing algorithm to select one of the available instances and makes a request.
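The two steps above (query the registry, then pick an instance) can be sketched in a few lines of Python. This is a minimal illustration, not a real client library: the `REGISTRY` dictionary stands in for a service registry, the addresses are made up, and round-robin is just one possible load-balancing algorithm.

```python
import itertools
from typing import Dict, Iterator, List, Tuple

Instance = Tuple[str, int]  # (host, port)

# Hypothetical registry contents: service name -> currently available instances.
REGISTRY: Dict[str, List[Instance]] = {
    "orders": [("10.0.0.1", 8080), ("10.0.0.2", 8080), ("10.0.0.3", 8080)],
}


class DiscoveryClient:
    """Looks up instances in the registry and round-robins across them."""

    def __init__(self, registry: Dict[str, List[Instance]]) -> None:
        self._registry = registry
        self._counters: Dict[str, Iterator[int]] = {}

    def choose(self, service: str) -> Instance:
        # Step 1: query the registry for the available instances.
        instances = self._registry.get(service, [])
        if not instances:
            raise LookupError(f"no available instances for {service!r}")
        # Step 2: apply the load-balancing algorithm (round-robin here).
        counter = self._counters.setdefault(service, itertools.count())
        return instances[next(counter) % len(instances)]


client = DiscoveryClient(REGISTRY)
picks = [client.choose("orders") for _ in range(4)]
# The fourth pick wraps around to the first instance again.
```

Because the selection logic lives in the client, it can be swapped for any other strategy without touching the registry.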

Figure 4-2 shows the structure of this pattern:

Figure 4-2. The client can take on the task of discovering services

The network location of a service instance is registered with the service registry when the instance starts up, and it is removed from the registry when the instance terminates. The registration is typically refreshed periodically using a heartbeat mechanism.
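One way to picture this lifecycle is a registry that grants each registration a lease and drops it when heartbeats stop. The sketch below is an assumption-laden toy: the class name `LeaseRegistry`, the TTL value, and the injectable clock are all illustrative and do not correspond to any particular registry product.

```python
import time
from typing import Callable, Dict, List, Tuple


class LeaseRegistry:
    """Keeps a registration only while its lease is renewed by heartbeats."""

    def __init__(self, ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so the expiry logic can be tested
        self._expiry: Dict[Tuple[str, str], float] = {}

    def register(self, service: str, address: str) -> None:
        """Called when an instance starts up."""
        self._expiry[(service, address)] = self._clock() + self._ttl

    def heartbeat(self, service: str, address: str) -> bool:
        """Renew a live lease; return False if it is unknown or has expired."""
        key = (service, address)
        expiry = self._expiry.get(key)
        if expiry is None or expiry <= self._clock():
            self._expiry.pop(key, None)
            return False
        self._expiry[key] = self._clock() + self._ttl
        return True

    def deregister(self, service: str, address: str) -> None:
        """Called when an instance terminates cleanly."""
        self._expiry.pop((service, address), None)

    def instances(self, service: str) -> List[str]:
        """Return the addresses whose lease has not expired."""
        now = self._clock()
        return sorted(addr for (svc, addr), expiry in self._expiry.items()
                      if svc == service and expiry > now)
```

An instance that crashes never calls `deregister`, but it also stops sending heartbeats, so its lease runs out and it disappears from `instances`.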

Netflix OSS provides a good example of the client-side discovery pattern. Netflix Eureka is a service registry that provides a REST API for managing service instance registration and for querying available instances. Netflix Ribbon is an IPC client that works with Eureka to load balance requests across the available service instances.

The client-side discovery pattern has various advantages and disadvantages. The pattern is relatively simple: apart from the service registry, there are no other moving parts. In addition, because the client knows which service instances are available, it can make intelligent, application-specific load-balancing decisions, such as using consistent hashing. A significant disadvantage of this pattern is that it couples the client to the service registry. You must implement client-side service discovery logic for each programming language and framework used by your service clients.
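As an illustration of the kind of application-specific decision mentioned above, here is a small consistent-hashing sketch. The `HashRing` class, the number of virtual nodes, and the choice of SHA-256 are assumptions made for the example; real clients may use different hash functions and ring layouts.

```python
import bisect
import hashlib
from typing import List, Tuple


def _hash(key: str) -> int:
    """Map a string to a 64-bit position on the ring."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


class HashRing:
    """Consistent hashing: each instance owns several points on a ring."""

    def __init__(self, instances: List[str], replicas: int = 100) -> None:
        # Place `replicas` virtual nodes per instance to even out the load.
        self._ring: List[Tuple[int, str]] = sorted(
            (_hash(f"{inst}#{i}"), inst)
            for inst in instances
            for i in range(replicas)
        )
        self._points = [point for point, _ in self._ring]

    def choose(self, request_key: str) -> str:
        """Return the instance owning the first point after the key's hash."""
        if not self._ring:
            raise LookupError("no instances on the ring")
        idx = bisect.bisect_right(self._points, _hash(request_key))
        return self._ring[idx % len(self._ring)][1]
```

The useful property is stability: when an instance is removed, only the keys that were mapped to that instance move, so requests for a given key (for example a customer ID) keep landing on the same instance as long as it stays registered.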

Now that we have learned about client-side discovery, let's look at server-side discovery next.

Server-side discovery pattern

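The other approach to service discovery is the server-side discovery pattern. The client makes a request through a router such as a load balancer, which queries the service registry and forwards the request to an available instance. Figure 4-3 shows the structure of this pattern: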

Figure 4-3. Service discovery can also be handled on the server side

AWS Elastic Load Balancer (ELB) is an example of a server-side discovery router. ELB is commonly used to load balance external traffic from the Internet, but you can also use it to load balance traffic that is internal to a virtual private cloud (VPC). A client sends requests (HTTP or TCP) through the ELB using its DNS name. The ELB load balances the traffic among a set of registered Elastic Compute Cloud (EC2) instances or EC2 Container Service (ECS) containers. There is no separately visible service registry. Instead, EC2 instances and ECS containers are registered with the ELB itself.

HTTP servers and load balancers such as NGINX can also be used as server-side discovery load balancers. For example, you can use Consul Template to dynamically reconfigure an NGINX reverse proxy. Consul Template is a tool that periodically regenerates arbitrary configuration files from the configuration data stored in the Consul service registry, and it runs an arbitrary shell command whenever the files change. In one such setup, Consul Template generates an nginx.conf file that configures the reverse proxy, and then runs a command that tells NGINX to reload the configuration. More sophisticated implementations can reconfigure NGINX Plus dynamically through its HTTP API or DNS.
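The regenerate-then-reload loop can be imitated in Python to show the idea. This is a stand-in for what a templating tool does, not Consul Template's own template syntax: the function names are hypothetical, and the only NGINX specifics used are the standard `upstream` and `server` directives.

```python
from typing import List, Optional, Tuple


def render_upstream(name: str, instances: List[Tuple[str, int]]) -> str:
    """Render an NGINX upstream block listing the registered instances."""
    servers = "\n".join(
        f"    server {host}:{port};" for host, port in sorted(instances)
    )
    return f"upstream {name} {{\n{servers}\n}}\n"


def regenerate(previous: Optional[str], name: str,
               instances: List[Tuple[str, int]]) -> Tuple[str, bool]:
    """Return the new config and whether it differs from the previous one.

    A caller would write the file and run a reload command (for example
    `nginx -s reload`) only when the second value is True.
    """
    config = render_upstream(name, instances)
    return config, config != previous
```

Sorting the instances keeps the output stable, so a registry that returns the same instances in a different order does not trigger a needless reload.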

Some deployment environments, such as Kubernetes and Marathon, run a proxy on each host in the cluster. The proxy acts as a server-side discovery load balancer. To make a request to a service, the client routes the request through the proxy using the host's IP address and the service's assigned port. The proxy then transparently forwards the request to an available service instance running somewhere in the cluster.

The server-side discovery pattern has several advantages and disadvantages. One great advantage is that the details of discovery are abstracted away from the client. The client simply makes requests to the load balancer. This eliminates the need to implement discovery logic for every programming language and framework used by your service clients. In addition, as mentioned above, some deployment environments provide this functionality for free. The pattern also has a disadvantage, however: unless the load balancer is provided by the deployment environment, it is yet another highly available system component that you need to set up and manage.

Service Registry

The service registry is a key part of service discovery. It is a database that contains the network locations of service instances. The service registry must be highly available and up to date. Although clients can cache the network locations obtained from the service registry, that information eventually becomes out of date, and clients can then no longer discover service instances. For this reason, a service registry consists of a cluster of servers that use a replication protocol to maintain consistency.

As mentioned earlier, Netflix Eureka is a good example of a service registry. It provides a REST API for registering and querying service instances. A service instance registers its network location using a POST request. It must then refresh its registration every 30 seconds using a PUT request. A registration is removed either by an HTTP DELETE request or by the instance registration timing out. As you might expect, a client can retrieve the registered service instances using an HTTP GET request.
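The sequence of calls an instance makes over its lifetime can be modelled as plain data. In the sketch below only the HTTP verbs and the 30-second renewal interval come from the text; the `/apps/...` path layout and all function names are placeholders chosen for the illustration and should be checked against the Eureka documentation before use.

```python
from typing import List, NamedTuple

RENEW_INTERVAL_SECONDS = 30  # renewal interval stated in the text


class RegistryRequest(NamedTuple):
    method: str
    path: str  # hypothetical path layout, for illustration only


def register(app: str) -> RegistryRequest:
    return RegistryRequest("POST", f"/apps/{app}")


def renew(app: str, instance_id: str) -> RegistryRequest:
    return RegistryRequest("PUT", f"/apps/{app}/{instance_id}")


def cancel(app: str, instance_id: str) -> RegistryRequest:
    return RegistryRequest("DELETE", f"/apps/{app}/{instance_id}")


def query(app: str) -> RegistryRequest:
    return RegistryRequest("GET", f"/apps/{app}")


def lifecycle(app: str, instance_id: str,
              uptime_seconds: int) -> List[RegistryRequest]:
    """Requests sent by an instance that runs for `uptime_seconds`."""
    renewals = uptime_seconds // RENEW_INTERVAL_SECONDS
    return (
        [register(app)]
        + [renew(app, instance_id)] * renewals
        + [cancel(app, instance_id)]
    )
```

An instance that crashes would simply stop after its last PUT, and the registry would rely on the registration timeout instead of the DELETE.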

Netflix achieves high availability by running one or more Eureka servers in each Amazon EC2 Availability Zone. Each Eureka server runs on an EC2 instance that has an Elastic IP address. DNS TEXT records are used to store the Eureka cluster configuration, which is a map from Availability Zones to lists of the network locations of Eureka servers. When a Eureka server starts up, it queries DNS to retrieve the Eureka cluster configuration, locates its peers, and assigns itself an unused Elastic IP address.

Eureka clients (both services and service clients) query DNS to find the network locations of the Eureka servers. A client prefers to use a Eureka server in its own Availability Zone, and if none is available, it uses a Eureka server in another Availability Zone.
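The same-zone preference with fallback can be sketched as an ordering step followed by a first-match search. The zone names, server addresses, and helper names below are invented for the example, and the health check is passed in as a function rather than performed over the network.

```python
from typing import Callable, Dict, List


def order_servers(zone_map: Dict[str, List[str]], client_zone: str) -> List[str]:
    """List registry servers with the client's own zone first."""
    local = list(zone_map.get(client_zone, []))
    remote = [
        server
        for zone in sorted(zone_map)
        if zone != client_zone
        for server in zone_map[zone]
    ]
    return local + remote


def first_available(candidates: List[str],
                    is_up: Callable[[str], bool]) -> str:
    """Return the first candidate that passes the supplied health check."""
    for server in candidates:
        if is_up(server):
            return server
    raise LookupError("no registry server is available")


# Hypothetical cluster configuration: availability zone -> registry servers.
ZONES = {
    "zone-a": ["eureka-a1.example.internal"],
    "zone-b": ["eureka-b1.example.internal", "eureka-b2.example.internal"],
}
```

With this ordering a client in zone-a only reaches out to zone-b when its local server fails the health check.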

Other examples of service registries include the following:

  • etcd is a highly available, distributed and consistent key-value store for shared configuration and service discovery. Two well-known projects that use etcd are Kubernetes and Cloud Foundry.
  • Consul is a tool for discovering and configuring services. It provides an API that allows clients to register and discover services. Consul can perform health checks on services to determine their availability.
  • Apache ZooKeeper is a high-performance coordination service widely used in distributed applications. Apache ZooKeeper was originally a Hadoop sub-project, but has now become an independent top-level project.

In addition, as mentioned earlier, some systems, such as Kubernetes, Marathon, and AWS, do not have an explicit service registry. Instead, the service registry is simply a built-in part of the infrastructure.

Now that we have understood the concept of a service registry, let us see how a service instance is registered to the service registry.

Service registration method

As mentioned earlier, service instances must be registered with and deregistered from the service registry. There are a couple of different ways to handle this. One option is for service instances to register themselves, which is the self-registration pattern. The other is for some other system component to manage the registration of service instances, which is the third-party registration pattern. Let's look at the self-registration pattern first.

Self-registration pattern

When using the self-registration pattern, a service instance is responsible for registering and deregistering itself with the service registry. In addition, where required, the service instance sends heartbeat requests to prevent its registration from expiring.
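In code, self-registration often amounts to wrapping the service's lifetime in register and deregister calls. The sketch below uses a Python context manager and an in-memory stand-in for the registry; both `InMemoryRegistry` and `self_registered` are hypothetical names, and a real service would send the heartbeats from a timer rather than by hand.

```python
from contextlib import contextmanager
from typing import Callable, Dict, Iterator


class InMemoryRegistry:
    """Toy registry that counts the heartbeats received per instance."""

    def __init__(self) -> None:
        self.heartbeats: Dict[str, int] = {}

    def register(self, instance_id: str) -> None:
        self.heartbeats[instance_id] = 0

    def heartbeat(self, instance_id: str) -> None:
        self.heartbeats[instance_id] += 1

    def deregister(self, instance_id: str) -> None:
        self.heartbeats.pop(instance_id, None)


@contextmanager
def self_registered(registry: InMemoryRegistry,
                    instance_id: str) -> Iterator[Callable[[], None]]:
    """Register on entry, hand back a heartbeat callable, deregister on exit."""
    registry.register(instance_id)
    try:
        yield lambda: registry.heartbeat(instance_id)
    finally:
        # Runs on normal shutdown and when the service body raises.
        registry.deregister(instance_id)
```

A service would run its main loop inside `with self_registered(registry, "orders@10.0.0.1:8080") as beat:` and call `beat()` on a schedule.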

Figure 4-4 shows the structure of this mode.

Figure 4-4. Services can handle their own registration

A good example of this approach is the Netflix OSS Eureka client. The Eureka client handles all aspects of service instance registration and deregistration. The Spring Cloud project, which implements various patterns including service discovery, makes it easy to automatically register a service instance with Eureka. You simply annotate your Java Configuration class with the @EnableEurekaClient annotation.

The self-registration pattern has benefits and drawbacks. One benefit is that it is relatively simple and does not require any other system components. However, a major drawback is that it couples the service instances to the service registry. You must implement the registration code in each programming language and framework used by your services.

The alternative approach, which decouples services from the service registry, is the third-party registration pattern.

Third-party registration pattern

When using the third-party registration pattern, service instances are not responsible for registering themselves with the service registry. Instead, another system component, known as the service registrar, handles the registration. The service registrar tracks changes to the set of running instances by either polling the deployment environment or subscribing to events. When it notices a newly available service instance, it registers the instance with the service registry. The service registrar also deregisters terminated service instances.
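The polling variant of a registrar boils down to comparing two sets on every pass. The following sketch assumes a callable that lists the running instances and a plain set acting as the registry; the `Registrar` class and its method names are made up for illustration.

```python
from typing import Callable, Iterable, List, Set, Tuple


class Registrar:
    """Keeps a registry in sync with the instances the environment reports."""

    def __init__(self, list_running: Callable[[], Iterable[str]],
                 registry: Set[str]) -> None:
        self._list_running = list_running  # polls the deployment environment
        self._registry = registry          # stand-in for a service registry

    def poll_once(self) -> Tuple[List[str], List[str]]:
        """Reconcile once; return the (registered, deregistered) instances."""
        running = set(self._list_running())
        added = running - self._registry
        removed = self._registry - running
        self._registry |= added    # register newly available instances
        self._registry -= removed  # deregister terminated instances
        return sorted(added), sorted(removed)
```

An event-driven registrar would apply the same two set updates, but in response to start and stop notifications instead of a timer.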

Figure 4-5 shows the structure of this pattern:

Figure 4-5. A separate service registrar can be responsible for registering other services

The open source Registrator project is a good example of a service registrar. It automatically registers and deregisters service instances that are deployed as Docker containers. Registrator supports several service registries, including etcd and Consul.

Another example of a service registrar is NetflixOSS Prana. It is intended primarily for services written in non-JVM languages. It is a sidecar application that runs side by side with a service instance. Prana registers and deregisters the service instance with Netflix Eureka.

The service registrar is a built-in component in some deployment environments. EC2 instances created by an Auto Scaling group can be automatically registered with an ELB. Kubernetes services are automatically registered and made available for discovery.

The third-party registration pattern also has benefits and drawbacks. A major benefit is that services are decoupled from the service registry. You do not need to implement service registration logic for every programming language and framework used by your developers. Instead, service instance registration is handled centrally within a dedicated service.

One drawback of this pattern is that, unless the registrar is built into the deployment environment, it is yet another highly available system component that you need to set up and manage.

Summary

In a microservice application, the set of running service instances changes dynamically. The instance has a dynamically allocated network location. Therefore, in order for the client to make a request to the service, it must use the service discovery mechanism.

A key part of service discovery is the service registry. The service registry is a database of available service instances. It provides a management API and a query API. Service instances are registered with and deregistered from the service registry using the management API. System components use the query API to discover available service instances.

There are two main service discovery patterns: client-side discovery and server-side discovery. In a system that uses client-side discovery, the client queries the service registry, selects an available instance, and makes a request. In a system that uses server-side discovery, the client makes a request through a router, which queries the service registry and forwards the request to an available instance.

There are two main ways for service instances to be registered with and deregistered from the service registry. One is for service instances to register themselves with the service registry, which is the self-registration pattern. The other is for some other system component to handle registration and deregistration on behalf of the service, which is the third-party registration pattern.

In some deployment environments, you need to set up your own service discovery infrastructure using a service registry such as Netflix Eureka or Apache ZooKeeper. In other deployment environments, such as Kubernetes and Marathon, service discovery is built in, and the environment handles the registration and deregistration of service instances. These environments also run a proxy on each cluster host that plays the role of a server-side discovery router.

An HTTP reverse proxy and load balancer such as NGINX can also be used as a server-side discovery load balancer. The service registry can push routing information to NGINX and trigger a graceful configuration update; for example, you can use Consul Template. NGINX Plus supports an additional dynamic reconfiguration mechanism: it can pull information about service instances from the registry using DNS, and it provides an API for remote reconfiguration.


Origin blog.51cto.com/2096101/2677883