KubeEdge from Beginner to Proficient: KubeEdge Principles Illustrated (repost)

Original article: http://bjbsair.com/2020-04-03/tech-info/29914.html

On top of Kubernetes, KubeEdge builds various asynchronous communication channels to keep edge workloads running after the edge goes offline; it also defines a set of device abstractions for managing edge devices. Its v1.0 release is moving toward edge service mesh and function computing (serverless).

Official document: https://docs.kubeedge.io/en/latest/

Architecture

[Figure: KubeEdge overall architecture]

The overall architecture diagram is fairly clear. Leaving EdgeSite aside, the architecture splits into a cloud side and an edge side. You can roughly read them as the Kubernetes control plane and a kubelet node (corresponding to the edge side), respectively. Keep in mind, though, that the target scenario is edge computing, where the network between cloud and edge cannot be relied upon.

Cloud communication

This is why CloudHub on the cloud side and EdgeHub on the edge side exist. The two modules communicate over WebSocket or QUIC, effectively establishing an underlying tunnel through which Kubernetes traffic and other application traffic flow. Which protocol carries the tunnel is not the key point; the key point is how to keep the business unaffected when the link between the two cannot be guaranteed. That is the problem MetaManager is designed to solve.

  • CloudHub, as mentioned above, is the server side of the tunnel on the cloud side, accepting connections from a large number of edge nodes over WebSocket or QUIC; in effect it is the matchmaker brokering every exchange between cloud and edge.
  • EdgeHub runs on the edge side and is the client side of the tunnel. It forwards messages received from the cloud to the relevant edge modules for processing, and sends messages from each edge module to the cloud through the tunnel; a minimal sketch of such a reconnecting tunnel client follows this list.
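To make the tunnel concrete, here is a minimal sketch of a reconnecting WebSocket client in the spirit of EdgeHub. It uses the gorilla/websocket library; the URL, handler, and fixed backoff are illustrative assumptions, not KubeEdge's actual code.

```go
package main

import (
	"log"
	"time"

	"github.com/gorilla/websocket"
)

// runTunnel keeps a WebSocket connection to the cloud alive,
// reconnecting with a fixed backoff whenever the link drops.
func runTunnel(url string, handle func([]byte)) {
	for {
		conn, _, err := websocket.DefaultDialer.Dial(url, nil)
		if err != nil {
			log.Printf("dial cloud failed: %v, retrying", err)
			time.Sleep(5 * time.Second)
			continue
		}
		// Read until the connection breaks, then reconnect.
		for {
			_, msg, err := conn.ReadMessage()
			if err != nil {
				log.Printf("tunnel broken: %v", err)
				conn.Close()
				break
			}
			handle(msg) // dispatch to edge modules
		}
	}
}

func main() {
	// Hypothetical cloud endpoint, for illustration only.
	runTunnel("ws://cloudhub.example:10000/events", func(b []byte) {
		log.Printf("got: %s", b)
	})
}
```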

Edge

  • MetaManager is backed by a local database (SQLite). Any data that other modules need to exchange with the cloud is also persisted in this local DB. When a module queries data, the query is served from the local DB if the data exists there, which avoids frequent network round trips to the cloud. Likewise, when the network is interrupted, the locally cached data keeps the edge running stably (say, when your smart car drives into a tunnel with no wireless signal), and once connectivity is restored the data is resynchronized.
  • Edged: the kubelet of Kubernetes, mentioned earlier, is essentially the core of a node. Edged is that kubelet, trimmed down with unused functionality removed. Its job is to make sure the pods delivered from the cloud, together with their configuration and storage (function computing support is planned), run stably on the edge, and to detect and recover from failures automatically. Given how the Kubernetes runtime landscape has evolved, supporting the various CRIs from this module should be relatively straightforward.
  • EventBus/ServiceBus/Mapper: the modules above are all directly or indirectly related to Kubernetes; now for the device-management side (the actual IoT business). External device access currently supports MQTT and REST APIs, handled by EventBus and ServiceBus respectively. EventBus is an MQTT broker client; its main job is to bridge messages from the edge modules and the events that device mappers report over MQTT. ServiceBus is the corresponding bridge for external REST access. This is a good place to mention the MQTT broker itself: most backend developers have used message middleware such as RabbitMQ or ActiveMQ, and these in fact also support the MQTT protocol (which can be thought of as a slimmed-down AMQP). Some IoT devices speak MQTT directly, while others only support Bluetooth or other near-field protocols; that is fine, because a Mapper can translate such protocols into MQTT publish/subscribe and thereby talk to the edge. ServiceBus, in turn, serves devices and services that speak HTTP.
  • DeviceTwin is the last module on the edge. To understand the term you need the concept of a digital twin, so allow a moment of science fiction: suppose humanity wants to relocate you to Mars. One proposal: scan all of your biological information on Earth, produce a data packet carrying your complete biological characteristics, destroy the original you on Earth, transmit the packet to Mars at the speed of light, and let a machine on Mars reconstruct you from the received characteristics. Feasible or not, the digital twin in this story is exactly that data packet describing all of your characteristics; here, of course, it describes the connected device. DeviceTwin therefore stores this device information in the local DB, applies operations from the cloud that modify device attributes (that is, operates the device), and synchronizes the status devices report via EventBus to both the local DB and the cloud. It is the middleman; a sketch of a twin-style data structure follows this list.
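As an illustration of the desired/reported split that a device twin typically records, here is a minimal Go sketch; the type and field names are assumptions for illustration, not KubeEdge's actual schema.

```go
package main

import "fmt"

// TwinProperty holds one property of a device twin: what the cloud
// wants (Desired) versus what the device last reported (Reported).
type TwinProperty struct {
	Desired  string
	Reported string
}

// DeviceTwin is the local record describing one connected device.
type DeviceTwin struct {
	DeviceID   string
	Attributes map[string]string       // static metadata
	Twins      map[string]TwinProperty // mutable, synchronized state
}

func main() {
	t := DeviceTwin{
		DeviceID:   "lamp-01",
		Attributes: map[string]string{"vendor": "acme"},
		Twins: map[string]TwinProperty{
			"power": {Desired: "on", Reported: "off"},
		},
	}
	// The edge drives the device toward Desired and updates Reported.
	fmt.Printf("%s power: want %s, have %s\n",
		t.DeviceID, t.Twins["power"].Desired, t.Twins["power"].Reported)
}
```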

Cloud

Controller

Now for the controller. It actually consists of two parts: edgeController, which synchronizes information between the edge and the API server, and deviceController, which synchronizes device CRD information between DeviceTwin and the API server. Both modules are relatively simple and are covered in detail later.

Implementation of each module

Edge

Entry point and beehive

The beehive module plays a very important role throughout KubeEdge. It implements a set of module-management interfaces: module startup, running, and inter-module communication are all wrapped and managed uniformly. The figure below shows the main startup flow of KubeEdge's edge code; the modules involved are all provided by beehive.

[Figure: edge entry point and beehive module startup flow]

As you can see, during initialization the init function of each edge-side module is loaded in turn, registering the module with the beehive framework; core.Run then iterates over the registered modules and starts them (StartModules).

It is also worth mentioning that the functions modules use to talk to each other and to send messages to a group or module communicate through Go channels inside beehive, which is exactly the goroutine-communication style that Go recommends. A simplified sketch of this pattern follows.
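As a rough illustration of the pattern (a simplified sketch, not beehive's actual API): each module registers a named, buffered channel, and a dispatcher routes messages to the target module's channel.

```go
package main

import (
	"fmt"
	"sync"
)

// Message is a minimal stand-in for beehive's message type.
type Message struct {
	Source, Target string
	Content        string
}

var (
	mu      sync.RWMutex
	modules = map[string]chan Message{}
)

// register creates a buffered inbox channel for a named module.
func register(name string) chan Message {
	mu.Lock()
	defer mu.Unlock()
	ch := make(chan Message, 10)
	modules[name] = ch
	return ch
}

// send routes a message to the target module's channel, if registered.
func send(msg Message) {
	mu.RLock()
	ch, ok := modules[msg.Target]
	mu.RUnlock()
	if ok {
		ch <- msg
	}
}

func main() {
	inbox := register("metaManager")
	go send(Message{Source: "edgeHub", Target: "metaManager", Content: "pod update"})
	m := <-inbox
	fmt.Printf("%s received %q from %s\n", m.Target, m.Content, m.Source)
}
```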

EdgeHub

[Figure: EdgeHub message flow]

The key point is that two goroutines are started to receive and distribute messages in the two directions. go ehc.routeToEdge receives messages sent from the cloud at the tunnel endpoint and calls ehc.dispatch to work out each message's target module and forward it over beehive's inter-module messaging.

Symmetrically, go ehc.routeToCloud forwards edge messages through the tunnel to the cloud's cloudHub module for processing. This module also implements wait-with-timeout logic for the responses to synchronous messages: if the response arrives within the timeout, it is forwarded back to the module that sent the request. More drastically, as soon as a send to the cloud fails, the goroutine exits, all modules are notified that the edge is currently disconnected from the cloud, and the connection is re-established. A sketch of the wait-with-timeout pattern follows.
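A minimal Go sketch of that wait-until-timeout pattern (illustrative only, not EdgeHub's actual code): each synchronous request gets a response channel, and a select races the response against a timer.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// sendSync sends a request and waits for its response on a per-request
// channel, giving up after the timeout.
func sendSync(req string, respCh <-chan string, timeout time.Duration) (string, error) {
	_ = req // sending req over the tunnel omitted in this sketch
	select {
	case resp := <-respCh:
		return resp, nil // forward to the original sender module
	case <-time.After(timeout):
		return "", errors.New("timed out waiting for cloud response")
	}
}

func main() {
	respCh := make(chan string, 1)
	go func() {
		time.Sleep(100 * time.Millisecond)
		respCh <- "ok"
	}()
	resp, err := sendSync("query pod", respCh, time.Second)
	fmt.Println(resp, err)
}
```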

When metaManager is disconnected from the cloud, it will use the data in the local DB and will not initiate queries to the cloud.

Edged

[Figure: Edged startup flow]

This part mostly reuses kubelet code, and what is implemented here is largely the startup flow. In addition, the original kubelet client is replaced with a fake interface backed by metaClient, which stores data in metaManager; operations that previously went straight to the api-server are thus proxied through metaManager. For background on this area, see the article introducing the kubelet source-code architecture.

The genuinely different code lives in e.syncPod, which operates on local pods by consuming the pod task list coming from metaManager and EdgeController; the configmaps and secrets associated with those pods are handled alongside the pods themselves. Pod operations are organized as one queue per operation type; for example, e.podAddWorkerRun starts a goroutine that consumes the queue of pods to be added. The outer wrapping is much the same for every operation, while the inner handling simply calls the kubelet's native packages. A sketch of the per-operation worker pattern follows.
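Here is a sketch of the one-queue-per-operation pattern: a worker goroutine drains a queue dedicated to a single operation type ("add", in this case). Names and handling are illustrative, not Edged's actual code.

```go
package main

import (
	"fmt"
	"sync"
)

// podAddWorkerRun starts a goroutine that consumes the queue of pods
// to be added; other operation types would get their own queues.
func podAddWorkerRun(queue <-chan string, wg *sync.WaitGroup) {
	go func() {
		defer wg.Done()
		for pod := range queue {
			// here Edged would call into kubelet-native handling
			fmt.Println("adding pod:", pod)
		}
	}()
}

func main() {
	var wg sync.WaitGroup
	wg.Add(1)
	addQueue := make(chan string, 16)
	podAddWorkerRun(addQueue, &wg)
	addQueue <- "nginx-edge-0"
	close(addQueue)
	wg.Wait()
}
```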

MetaManager

[Figure: MetaManager code structure]

Judging from the code structure, this module is relatively simple. On the outer layer, it periodically sends a message to itself to trigger the timed synchronization of pod state to the cloud; in addition, mainLoop starts an independent goroutine that receives external messages and runs the processing logic. A sketch of the periodic self-message follows.
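A minimal sketch of that periodic self-message: a ticker fires on a fixed period and enqueues a "sync pod status" message into the module's own inbox. The period and message text are illustrative assumptions.

```go
package main

import (
	"fmt"
	"time"
)

func main() {
	inbox := make(chan string, 1)
	ticker := time.NewTicker(200 * time.Millisecond) // e.g. tens of seconds in practice
	defer ticker.Stop()

	// Outer layer: send a message to ourselves on every tick.
	go func() {
		for range ticker.C {
			inbox <- "sync-pod-status"
		}
	}()

	// mainLoop: receive messages and run the processing logic.
	for i := 0; i < 3; i++ {
		fmt.Println("handling:", <-inbox)
	}
}
```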

The processing logic is classified based on the message type, including:

  1. Create, delete, query, and update operations initiated by the cloud
  2. Query requests initiated by edge modules (as noted earlier, no remote query is issued while the state is disconnected)
  3. Query responses returned by the cloud
  4. Messages from edgeHub updating the connection state with cloudHub
  5. The message MetaManager sends itself to periodically sync edge pod status to the cloud
  6. Messages related to function computing

The key part is the create/delete/query/update handling; take create as an example. When a resource to be created is received, it is parsed and organized into a (key, type, value) triple, then stored in the local SQLite database in a NoSQL-like fashion. Storing it this way keeps retrieval, insertion, and deletion fast. After saving, a response message is sent back to the module that originated the request. A sketch of such a triple store follows.
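A minimal sketch of such a (key, type, value) store on SQLite, assuming the mattn/go-sqlite3 driver; the table and column names are illustrative, not MetaManager's actual schema.

```go
package main

import (
	"database/sql"
	"fmt"
	"log"

	_ "github.com/mattn/go-sqlite3" // SQLite driver
)

func main() {
	db, err := sql.Open("sqlite3", "meta.db")
	if err != nil {
		log.Fatal(err)
	}
	defer db.Close()

	// One flat table simulating a key/value NoSQL store.
	if _, err := db.Exec(`CREATE TABLE IF NOT EXISTS meta (
		key   TEXT PRIMARY KEY,
		type  TEXT,
		value TEXT)`); err != nil {
		log.Fatal(err)
	}

	// Insert (or update) one resource as a key/type/value triple.
	if _, err := db.Exec(`INSERT OR REPLACE INTO meta VALUES (?, ?, ?)`,
		"default/pod/nginx-edge-0", "pod", `{"spec":{}}`); err != nil {
		log.Fatal(err)
	}

	// Query it back by key.
	var value string
	if err := db.QueryRow(`SELECT value FROM meta WHERE key = ?`,
		"default/pod/nginx-edge-0").Scan(&value); err != nil {
		log.Fatal(err)
	}
	fmt.Println("value:", value)
}
```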

EventBus and ServiceBus

[Figure: EventBus and ServiceBus structure]

  • EventBus

eventBus connects the MQTT broker with beehive. The MQTT broker can be started in several modes:

  1. Using an embedded MQTT broker
  2. Using an external MQTT broker

In embedded mode, eventBus starts gomqtt, a broker package implemented in Go, to serve external MQTT devices; see its GitHub project page for usage. In both modes eventBus performs the same common operations, including:

  1. Subscribing to the topics it cares about on the broker, as follows:

```go
SubTopics = []string{
	"$hw/events/upload/#",
	"$hw/events/device/+/state/update",
	"$hw/events/device/+/twin/+",
	"$hw/events/node/+/membership/get",
	"SYS/dis/upload_records",
}
```

  2. When a matching event arrives, triggering the callback function onSubscribe

  3. In the callback, roughly classifying events and sending them to their destinations (DeviceTwin or EventHub)

All events on the $hw/events/device/+/twin/+ and $hw/events/node/+/membership/get topics are sent to DeviceTwin; everything else goes straight to EventHub and is then synchronized to the cloud.

Naturally, this part also includes the interfaces for creating a client and publishing events to the MQTT broker, which are not expanded on here.
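For flavor, here is a minimal MQTT subscriber in Go using the eclipse/paho.mqtt.golang client; the broker address and the routing in the callback are illustrative assumptions, not eventBus's actual code.

```go
package main

import (
	"fmt"
	"log"

	mqtt "github.com/eclipse/paho.mqtt.golang"
)

func main() {
	opts := mqtt.NewClientOptions().AddBroker("tcp://127.0.0.1:1883")
	client := mqtt.NewClient(opts)
	if tok := client.Connect(); tok.Wait() && tok.Error() != nil {
		log.Fatal(tok.Error())
	}

	// Subscribe to one twin topic; the callback plays the role of
	// onSubscribe, classifying and forwarding events.
	tok := client.Subscribe("$hw/events/device/+/twin/+", 0,
		func(_ mqtt.Client, msg mqtt.Message) {
			// here eventBus would route to DeviceTwin or EventHub
			fmt.Printf("topic %s: %s\n", msg.Topic(), msg.Payload())
		})
	if tok.Wait() && tok.Error() != nil {
		log.Fatal(tok.Error())
	}
	select {} // block forever, handling events in the callback
}
```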

  • ServiceBus

ServiceBus starts a goroutine that receives messages from beehive and then, based on parameters in each message, uses an HTTP client to deliver it to the target app on local 127.0.0.1 via its REST API. ServiceBus is thus the client, the app is an HTTP REST server, and every device operation and status read goes through this client calling the app's interface. A minimal sketch follows.
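A sketch of that client role: POST a command received from beehive to a REST app listening on 127.0.0.1. The port, path, and payload are illustrative assumptions, and the call expects some app to be listening.

```go
package main

import (
	"bytes"
	"fmt"
	"io"
	"log"
	"net/http"
)

// deliver forwards a message body to the target app's REST endpoint
// on the loopback interface and returns the app's response.
func deliver(port, path string, body []byte) ([]byte, error) {
	url := fmt.Sprintf("http://127.0.0.1:%s%s", port, path)
	resp, err := http.Post(url, "application/json", bytes.NewReader(body))
	if err != nil {
		return nil, err
	}
	defer resp.Body.Close()
	return io.ReadAll(resp.Body)
}

func main() {
	out, err := deliver("8080", "/v1/devices/lamp-01/power", []byte(`{"on":true}`))
	if err != nil {
		log.Fatal(err)
	}
	fmt.Printf("app replied: %s\n", out)
}
```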

DeviceTwin

[Figure: DeviceTwin structure]

DeviceTwin includes the following functions:

  1. For data storage, device data is persisted in local SQLite across three tables: device, deviceAttr, and deviceTwin.
  2. It processes the messages other modules send to the twin module by calling dtc.distributeMsg. In the processing logic, messages are divided into four categories, each with its own set of actions (a category may contain several actions): membership, device, communication, and twin. A sketch of the three tables appears after this list.
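As a rough sketch of what those three tables might hold, here are illustrative Go structs; the field names are assumptions for illustration, not KubeEdge's actual schema.

```go
package main

// Device: one row per connected device.
type Device struct {
	ID         string
	Name       string
	State      string // e.g. "online"/"offline"
	LastOnline string
}

// DeviceAttr: static or slowly changing attributes of a device.
type DeviceAttr struct {
	DeviceID string
	Key      string
	Value    string
}

// DeviceTwinRow: the desired/reported pair for one twin property.
type DeviceTwinRow struct {
	DeviceID string
	Key      string
	Expected string // desired by the cloud
	Actual   string // last reported by the device
}

func main() {}
```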

Since this part is tightly coupled to the devices themselves, why it is divided into these categories, and how the abstraction is structured, is something this article has not fully digested; for now we focus only on the main business logic. The official devicetwin documentation describes this part in more detail.

Cloud

Entry point

[Figure: cloud entry point startup flow]

The focus here is on the three parts loaded in init: cloudHub, controller (that is, edgeController), and deviceController. After that, just as on the edge side, the usual beehive routine calls StartModules to start all the modules.

CloudHub

[Figure: CloudHub message flow]

handler.WebSocketHandler.ServeEvent accepts new edge-node connections on the WebSocket server and allocates a channel queue for each new node; the connection is then handed over to the logic responsible for reading and writing its content.

channelq.NewChannelEventQueue maintains one channel queue per edge node (buffered at 10 messages by default). go q.dispatchMessage then receives the messages the controller sends to cloudHub, determines each message's destination node from its content, and pushes the message into that node's channel to queue for processing. A sketch of the per-node queue follows.
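A minimal sketch of per-node message queues in Go: dispatch pushes a message into the buffered channel of its destination node, and one write loop per node would drain that channel toward the tunnel. Names and the drop-on-unknown-node behavior are illustrative assumptions.

```go
package main

import "fmt"

// Event is a stand-in for a message bound for one edge node.
type Event struct {
	Node, Body string
}

// EventQueue keeps one buffered channel per connected node.
type EventQueue struct {
	perNode map[string]chan Event
}

func NewEventQueue() *EventQueue {
	return &EventQueue{perNode: map[string]chan Event{}}
}

// Connect allocates a node's queue (default buffer of 10 messages).
func (q *EventQueue) Connect(node string) chan Event {
	ch := make(chan Event, 10)
	q.perNode[node] = ch
	return ch
}

// Dispatch routes an event to its destination node's queue.
func (q *EventQueue) Dispatch(e Event) {
	if ch, ok := q.perNode[e.Node]; ok {
		ch <- e // queued for that node's write loop
	} // else: node unknown or offline; sending is aborted
}

func main() {
	q := NewEventQueue()
	ch := q.Connect("edge-node-1")
	q.Dispatch(Event{Node: "edge-node-1", Body: "pod update"})
	e := <-ch // the write loop would send this over the WebSocket
	fmt.Printf("write to %s: %s\n", e.Node, e.Body)
}
```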

The core logic of cloudHub has two halves, writing and reading:

  1. As described above, a message bound for an edge node is pushed into that node's channel queue; handler.WebSocketHandler.EventWriteLoop reads from the channel and handles sending over the tunnel (with a number of checks along the way: sending is aborted if the corresponding node cannot be found, if the node is offline, and so on).
  2. In the other direction, handler.WebSocketHandler.EventReadLoop reads messages from the edge off the tunnel and passes them to the controller module for processing (keepalive heartbeat messages are ignored).

If cloudHub fails to send a message to a node, it triggers EventHandler's CancelNode operation; combined with edgeHub's behavior described earlier, we know edgeHub will then establish a fresh connection to the cloud and go through the synchronization process again.

Controller (EdgeController)

[Figure: EdgeController upstream and downstream]

The controller's core logic is split into two paths: upstream and downstream.

  • Upstream receives the messages beehive delivers to the controller and, via go uc.dispatchMessage, fans them out to different goroutines according to each message's resource type, including nodeStatus, podStatus, queryConfigMap, querySecret, queryService, queryEndpoints, and so on; each type of operation ultimately calls Kubernetes client code to write state (for example node status) to the API server.
  • Downstream watches changes to the various resources by calling Kubernetes client code. For pods, for instance, it reads events through dc.podManager.Events and then calls dc.messageLayer.Send to send the message to the edge for processing. As with upstream, the covered resources include pods, configmaps, secrets, nodes, services, and endpoints. A sketch of such a watch-and-forward loop follows this list.
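Here is a minimal sketch of the downstream watch-and-forward idea using client-go informers; the kubeconfig path, resync period, and the log line standing in for dc.messageLayer.Send are illustrative assumptions, not edgeController's actual code.

```go
package main

import (
	"log"
	"time"

	corev1 "k8s.io/api/core/v1"
	"k8s.io/client-go/informers"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/tools/cache"
	"k8s.io/client-go/tools/clientcmd"
)

func main() {
	cfg, err := clientcmd.BuildConfigFromFlags("", "/root/.kube/config")
	if err != nil {
		log.Fatal(err)
	}
	clientset, err := kubernetes.NewForConfig(cfg)
	if err != nil {
		log.Fatal(err)
	}

	// Watch pod changes and forward each event toward the edge.
	factory := informers.NewSharedInformerFactory(clientset, 30*time.Second)
	podInformer := factory.Core().V1().Pods().Informer()
	podInformer.AddEventHandler(cache.ResourceEventHandlerFuncs{
		AddFunc: func(obj interface{}) {
			pod := obj.(*corev1.Pod)
			// stand-in for dc.messageLayer.Send toward the edge
			log.Printf("send to edge: pod added %s/%s", pod.Namespace, pod.Name)
		},
	})

	stop := make(chan struct{})
	factory.Start(stop)
	factory.WaitForCacheSync(stop)
	<-stop
}
```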

DeviceController

[Figure: DeviceController structure]

deviceController mirrors edgeController, except that the resources it cares about are no longer Kubernetes workload resources but the CRDs defined for devices: device and deviceModel. Since its main logic follows edgeController, it is not covered in detail here.
