VXLAN principle of flannel

Kubernetes network communication

  • Communication between containers in the same pod (lo)
  • Communication between pods: pod IP <-----> pod IP (flannel, calico)
  • Communication between a pod and a Service: pod IP <-----> ClusterIP (iptables, ipvs)
  • Communication between a Service and clients outside the cluster: ClusterIP <-----> outside the cluster

CNI plugin:

  • flannel
  • calico
  • canal
  • kube-router

Flannel

Flannel itself is only a framework; the actual network functionality is provided by its backend implementations. It currently supports three backends:

  • VXLAN
  • host-gw
  • UDP

As the figure shows, each host has a flannel.1 device. This is the VTEP device that VXLAN requires (that is, flannel.1 is what encapsulates and decapsulates VXLAN packets), and it has both an IP address and a MAC address. Suppose container1 accesses container2. When container1 sends a request, the IP packet whose destination address is 10.244.1.3 first appears on the cni0 bridge and is then routed to the local flannel.1 device for processing. In other words, it has reached the entrance of the "tunnel". To encapsulate the original IP packet and deliver it to the right host, VXLAN must now find the exit of the "tunnel", which is the VTEP device of the destination host (that host's flannel.1 device).

Once all nodes have started, Node1 has a route through flannel.1 for the subnet of each other node. These routes are added by the flanneld process after it starts.

[root@node-0 ~]# ifconfig
flannel.1: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1450
        inet 10.244.0.0  netmask 255.255.255.255  broadcast 0.0.0.0
        ether 8a:bf:bf:7e:b7:f6  txqueuelen 0  (Ethernet)
        RX packets 28929  bytes 1676230 (1.5 MiB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 12085  bytes 42372533 (40.4 MiB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

[root@node-0 ~]# route -n
Kernel IP routing table
Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
...
10.244.1.0      10.244.1.0      255.255.255.0   UG    0      0        0 flannel.1
....

From the routing table above, 10.244.1.0 is the IP address of the VTEP device (flannel.1) on Node2. The VTEP devices need to form a virtual Layer 2 network among themselves, that is, they communicate through Layer 2 data frames. So after the VTEP device on Node1 receives the original IP packet, it must add a destination MAC address to it, encapsulating it into a Layer 2 data frame, and then send that frame to the destination VTEP device. The question to solve here is: what is the MAC address of the destination VTEP device?

From the routing table we already know the IP address of the destination VTEP device, and looking up a Layer 2 MAC address from a Layer 3 IP address is exactly what the ARP table is for. The ARP entry needed here is added to Node1 automatically by the flanneld process when Node2 starts, as follows:

[root@node-0 ~]# ip neigh show dev flannel.1
10.244.1.0 lladdr b2:ba:aa:a5:10:1a PERMANENT
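
The two lookups described so far can be sketched in Python. This is only a toy model of what the kernel does, using the route and neighbor entries shown above; the names ROUTES, NEIGHBORS, and inner_dst_mac are made up for illustration and are not part of flannel or the kernel.

```python
import ipaddress

# Entries copied from the `route -n` and `ip neigh` output above (Node1's view).
ROUTES = [
    # (destination network, gateway, device)
    (ipaddress.ip_network("10.244.1.0/24"), ipaddress.ip_address("10.244.1.0"), "flannel.1"),
]
NEIGHBORS = {
    # (device, IP) -> MAC, as in `ip neigh show dev flannel.1`
    ("flannel.1", ipaddress.ip_address("10.244.1.0")): "b2:ba:aa:a5:10:1a",
}


def inner_dst_mac(dst_ip: str) -> str:
    """Return the destination MAC for the inner Ethernet frame."""
    dst = ipaddress.ip_address(dst_ip)
    # Step 1: routing table lookup (longest prefix wins) yields the gateway,
    # which is the IP address of the remote VTEP device.
    matches = [r for r in ROUTES if dst in r[0]]
    if not matches:
        raise LookupError(f"no route to {dst_ip}")
    _, gateway, dev = max(matches, key=lambda r: r[0].prefixlen)
    # Step 2: neighbor (ARP) lookup turns the gateway IP into the VTEP MAC.
    return NEIGHBORS[(dev, gateway)]


print(inner_dst_mac("10.244.1.3"))  # b2:ba:aa:a5:10:1a
```

The result is the MAC address of Node2's flannel.1 device, which becomes the destination MAC of the Layer 2 frame.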

With this MAC address, the Linux kernel can start Layer 2 encapsulation. However, the MAC addresses of the VTEP devices mean nothing to the host's Layer 2 network, so the frame encapsulated above cannot travel across the host network as it is. For convenience, we call this frame the inner data frame. The Linux kernel must therefore encapsulate the inner data frame further, into an ordinary data frame of the host network, so that it can carry the inner data frame out through the eth0 network card. We call this new frame the outer data frame.

To implement this "free ride", the Linux kernel adds a special VXLAN header in front of the inner data frame to indicate that the passenger is in fact a data frame used by VXLAN. The VXLAN header carries an important field, the VNI, which is the identifier a VTEP device uses to decide whether a data frame belongs to it. In flannel the VNI is 1, which is why the VTEP device on each host is called flannel.1. The Linux kernel then encapsulates this data frame into a UDP packet and sends it out.

Although flannel.1 on Node1 knows the MAC address of flannel.1 on Node2, it does not know the address of the Node2 host itself, that is, which host the UDP packet should be sent to. Here the flannel.1 device in fact also plays the role of a bridge, forwarding the UDP packet on the Layer 2 network, and in the Linux kernel a bridge device bases its forwarding decisions on the FDB (forwarding database). The FDB entries for the flannel.1 bridge are maintained by the flanneld process, and their contents are as follows:

[root@node-0 ~]# bridge fdb show dev flannel.1 | grep b2:ba:aa:a5:10:1a
b2:ba:aa:a5:10:1a dev flannel.1 dst 172.16.138.41 self permanent

The FDB entry points to the host with IP address 172.16.138.41, which is clearly Node2, so the destination of the UDP packet is now known. What follows is the normal packet encapsulation and delivery process of the host network.
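
To make the outer encapsulation more concrete, here is a rough Python sketch of the last two steps: building the 8-byte VXLAN header for VNI 1 and looking up the outer destination in the FDB. The header layout follows RFC 7348. The FDB dict and the function names are illustrative assumptions, the 1500-byte eth0 MTU is assumed rather than taken from the output above, and the real work happens inside the kernel's vxlan driver, not in user space.

```python
import struct

VNI = 1  # the VNI used by flannel, hence the device name flannel.1

# FDB entry copied from the `bridge fdb` output above: VTEP MAC -> host IP.
FDB = {"b2:ba:aa:a5:10:1a": "172.16.138.41"}


def vxlan_header(vni: int) -> bytes:
    """Build the 8-byte VXLAN header described in RFC 7348."""
    if not 0 <= vni < 2**24:
        raise ValueError("VNI must fit in 24 bits")
    flags = 0x08  # the "I" bit: the VNI field is valid
    # Word 1: flags (8 bits) + reserved (24 bits).
    # Word 2: VNI (24 bits) + reserved (8 bits).
    return struct.pack("!II", flags << 24, vni << 8)


def outer_destination(inner_dst_mac: str) -> str:
    """Return the host IP that the UDP packet must be sent to."""
    return FDB[inner_dst_mac]


header = vxlan_header(VNI)
print(header.hex())                            # 0800000000000100
print(outer_destination("b2:ba:aa:a5:10:1a"))  # 172.16.138.41

# Overhead with an IPv4 underlay: inner Ethernet header (14) + VXLAN header (8)
# + UDP header (8) + outer IP header (20) = 50 bytes. Assuming eth0 has an MTU
# of 1500, this matches the mtu 1450 shown for flannel.1 earlier.
overhead = 14 + len(header) + 8 + 20
print(1500 - overhead)  # 1450
```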

Let's take a look at how flanneld configures a host when an EventAdded event arrives, and how packets flow through the flannel network.

 

As shown in the figure above, when host B joins the flannel network, it writes its subnet 10.1.16.0/24 and its public IP 192.168.0.101 to etcd, along with the MAC address of its VTEP device flannel.1.

After that, host A receives an EventAdded event and reads the information that host B wrote to etcd. It then adds three pieces of information locally:

  • Routing information: all packets destined for 10.1.16.0/24 are sent through the VTEP device flannel.1, and the gateway address is 10.1.16.0, which is the flannel.1 device on host B.
[root@node-0 ~]# ip route list
...
10.1.16.0/24 via 10.1.16.0 dev flannel.1 onlink
...
  • FDB information: the MAC address is that of host B's flannel.1 device, and data frames sent to it (the next hop 10.1.16.0) are sent through VXLAN to the destination address 192.168.0.101, that is, host B.
[root@node-0 bin]#  ip neigh show dev flannel.1
10.1.16.0 lladdr b2:ba:aa:a5:10:1a PERMANENT

[root@node-0 bin]#  bridge fdb show dev flannel.1 | grep b2:ba:aa:a5:10:1a
b2:ba:aa:a5:10:1a dev flannel.1 dst 192.168.0.101 self permanent
  • ARP information: the MAC address for the gateway address 10.1.16.0 is the MAC address of host B's flannel.1 device.
[root@node-0 bin]# arp -v
Address                  HWtype  HWaddress           Flags Mask            Iface
...
10.1.16.0               ether   b2:ba:aa:a5:10:1a   CM                    flannel.1
...
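
The three entries that host A adds can be summarized in a small Python sketch. It only builds the equivalent ip/bridge command lines from the data host B published and does not run them. The Lease class and the commands_for_added_lease function are hypothetical names for illustration; flanneld itself programs these entries through netlink rather than through shell commands.

```python
import ipaddress
from dataclasses import dataclass
from typing import List


@dataclass
class Lease:
    """What a node publishes when it joins (hypothetical structure)."""
    subnet: str     # pod subnet of the node, e.g. "10.1.16.0/24"
    public_ip: str  # host address of the node
    vtep_mac: str   # MAC address of the node's flannel.1 device


def commands_for_added_lease(lease: Lease, dev: str = "flannel.1") -> List[str]:
    """Return the route, ARP, and FDB entries as equivalent command lines."""
    net = ipaddress.ip_network(lease.subnet)
    gateway = net.network_address  # the remote flannel.1 holds the subnet's .0 address
    return [
        # Route: the remote subnet is reached via the remote VTEP's IP.
        f"ip route add {net} via {gateway} dev {dev} onlink",
        # ARP: the remote VTEP's IP resolves to its MAC address.
        f"ip neigh add {gateway} lladdr {lease.vtep_mac} dev {dev} nud permanent",
        # FDB: frames for that MAC are tunneled to the remote host's IP.
        f"bridge fdb add {lease.vtep_mac} dev {dev} dst {lease.public_ip} self permanent",
    ]


host_b = Lease("10.1.16.0/24", "192.168.0.101", "b2:ba:aa:a5:10:1a")
for line in commands_for_added_lease(host_b):
    print(line)
```

Each generated line corresponds to one of the entries listed above, built only from the subnet, public IP, and VTEP MAC address that host B wrote to etcd.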

Parameter Description:

  • Network: the network address in CIDR format (for example, 10.244.0.0/16) that flannel uses to configure networking for pods.
  • SubnetLen: the size of the subnet allocated to each host. It can be specified at initialization; otherwise the default is used, which is 24 (a 24-bit subnet mask).
  • SubnetMin: the smallest subnet that can be allocated from the cluster network address space. It can be specified manually; otherwise it defaults to the first assignable subnet in the cluster network address space. For example, for "10.1.0.0/16" with SubnetLen 24, the first assignable subnet is "10.1.1.0/24".
  • SubnetMax: the largest subnet that can be allocated. For "10.1.0.0/16" with SubnetLen 24, SubnetMax is "10.1.255.0/24".
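  • Backend.Type: the type of backend flannel uses. The three types covered here are vxlan, host-gw, and udp. If Backend is not specified, flannel falls back to udp; deployments that want VXLAN, such as the standard kube-flannel.yml manifest, set the type to "vxlan" explicitly.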
  • Note: When Backend is vxlan, the mac address of the vtep device will be stored in etcd
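
As a worked example of these parameters, the Python sketch below writes out a configuration using the example values from the list and derives the per-host subnet range with the standard ipaddress module. The JSON keys are the parameter names described above; the SubnetMin and SubnetMax values are computed the way the list describes the defaults, not read from flannel itself.

```python
import ipaddress
import json

# Example configuration using the parameters described above.
config = json.loads("""
{
  "Network": "10.1.0.0/16",
  "SubnetLen": 24,
  "Backend": {"Type": "vxlan"}
}
""")

network = ipaddress.ip_network(config["Network"])
# Split the cluster network into per-host subnets of the configured size.
subnets = list(network.subnets(new_prefix=config["SubnetLen"]))

# Following the description above, 10.1.0.0/24 is skipped, so the default
# SubnetMin is the second /24 and the default SubnetMax is the last /24.
subnet_min = subnets[1]
subnet_max = subnets[-1]

print(subnet_min)  # 10.1.1.0/24
print(subnet_max)  # 10.1.255.0/24
# Number of per-host subnets between SubnetMin and SubnetMax, inclusive.
print(subnets.index(subnet_max) - subnets.index(subnet_min) + 1)  # 255
```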

Origin blog.csdn.net/DY1316434466/article/details/89970622