VxLAN principle

VxLAN
 Background:
  Virtualization dates back to the last century, but the hardware of that era could not support it, so it drew little attention. Since the beginning of this century, hardware manufacturing has grown ever more capable; a single physical machine running only one or a few applications can no longer use the hardware's full capability, wasting a great deal of resources. Virtualization was thus pushed into the spotlight, and data centers were the first to adopt it, but problems were soon discovered:
  1. The VLAN number dilemma
    The number of VLANs is seriously short: once there are more than 4096 tenants, no VLAN is left to assign, so how do you isolate each tenant's virtual machines?
   In fact, traditional VLAN-based network isolation has many inconveniences, such as the inability to adjust dynamically and the difficulty of expansion; even worse, migrating a VM away from a physical machine that has failed or become overloaded is very awkward. How to solve this?

  2. Layer 2 network boundary limits
    Under VM virtualization, the Layer 2 network keeps expanding to make migration easy, which brings in more and more access devices and more and more STP problems.
   The most direct one is that a large number of ports are blocked for loop prevention: aggregate bandwidth between the access and aggregation layers drops to roughly a quarter, and between the aggregation and core layers to roughly 1/8;
      the closer a switch is to the root, the more of its ports are blocked, so overall bandwidth is badly wasted, efficiency falls, and cost rises. That is one part
   of the problem! [STP (Spanning Tree Protocol) mainly exists to prevent loops among interconnected switches from causing a "broadcast storm".]
      Next comes the strain on the switch's MAC address cache: the number of VMs per physical machine keeps growing, new MACs can no longer be cached, so
      the switch can only flood frames out of all interfaces, filling the entire network with broadcasts and seriously affecting traffic. What to do?

  3. The management dilemma
   How do you effectively manage the physical network at large Layer 2 scale and improve network utilization, while still accommodating the virtualized applications above it?

  Against this background, VMware and Cisco proposed VxLAN (Virtual eXtensible LAN), and HP and Microsoft followed with NVGRE (Network Virtualization using GRE). The main feature of both technologies is that the tunnel endpoints sit in the vSwitch on the server, not in the physical switch: the tunnel encapsulation is done inside the server's vSwitch, so by the time a packet enters the physical network it is an ordinary IP packet. What was originally a Layer 2 frame becomes a Layer 3 IP packet that can be routed directly, achieving transparent "large Layer 2" transport, even across DCs (data centers). Each technique has its pros and cons. VxLAN uses the very mature Layer 4 protocol UDP to carry its data, which makes the packet header larger and correspondingly shrinks the data area. NVGRE is an upgraded version of GRE (Generic Routing Encapsulation): it uses 24 bits of the GRE header as the Tenant Network Identifier (TNI) to extend VLAN, just as VxLAN's VNI extends the VLAN space to 2 to the 24th power. But NVGRE's biggest drawback is that it does not support traditional load balancing: Layer 4 load balancing needs the IP addresses and ports, and since the GRE header sits before the transport header and L4 load balancers cannot parse GRE, they cannot obtain the port information and therefore cannot balance the traffic. Technologies like these are why the "Overlay" network was born; Overlay can be translated as an overlaid network. Note: Overlay networks include more than these two, e.g. VMware's proprietary STT, plain GRE, and so on.
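The figures above are easy to sanity-check; here is a minimal sketch of the ID-space and header-overhead arithmetic (the 50-byte overhead figure assumes an untagged outer Ethernet frame):

```shell
# The VLAN ID field is 12 bits; the VxLAN VNI is 24 bits.
echo $((2 ** 12))    # 4096 VLANs
echo $((2 ** 24))    # 16777216 VNIs

# Per-packet VxLAN overhead:
# outer Ethernet(14) + outer IP(20) + UDP(8) + VxLAN header(8)
echo $((14 + 20 + 8 + 8))   # 50 bytes of payload lost per packet
```

This is also why hosts inside a VxLAN segment often run with a reduced MTU (e.g. 1450) when the physical network's MTU is 1500.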


Traditional VxLAN
  VxLAN is essentially a tunneling protocol; it does not depend on the cloud. When two networks can reach each other only at Layer 3 but you want Layer 2 connectivity between them, VxLAN is one alternative solution. It is similar to a VPN (Virtual Private Network), except that a VPN exists for data privacy and encryption, which VxLAN does not need: the VxLAN protocol driver simply encapsulates the original Ethernet frame into VxLAN, re-encapsulates it at Layer 4 (UDP), and sends it onto the network. To the outside, the packet is a normal IP packet that can be forwarded directly or routed; "directly" means the target is a host on the same LAN, while "routed" means a router must forward the packet.
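Since VxLAN is just a kernel tunneling protocol, a point-to-point tunnel like the one described above can be sketched with plain iproute2 and no vSwitch at all (a sketch only; the interface name eth0 and all addresses are illustrative assumptions):

```shell
# Create a VxLAN device: VNI 100, peer VTEP 192.168.10.23, IANA UDP port 4789.
ip link add vxlan100 type vxlan id 100 remote 192.168.10.23 dstport 4789 dev eth0
ip addr add 10.0.0.1/24 dev vxlan100
ip link set vxlan100 up
```

The mirror-image commands on the peer (with remote_ip pointing back) complete the tunnel.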

VTEP (VXLAN Tunnel End Point):
  The Internet often explains a VTEP as the component that encapsulates and decapsulates VxLAN, but I think that explanation confuses the protocol driver with the VxLAN interface the user sees. Here, instead, understand a VTEP as a virtual VxLAN tunnel interface: when we want our application's packets to go through the VxLAN tunnel, it makes it easy to operate on and to control who the peer of our VxLAN tunnel is. The actual encapsulation of application data into VxLAN is done by the operating system invoking the VxLAN protocol driver.

  

     In this figure, the vxlan0 interface can be considered a VTEP. When vm1 wants to communicate with vm2, the two must rely on the VxLAN tunnel. So how should vm1 and vm2 be configured to use vxlan0?
  In fact no special configuration is needed, because vxlan0 and p0 are both Layer 2 virtual interfaces on the OVS vSwitch; think of them as ordinary switch ports. When a switch wants to send a packet onto the network, it broadcasts to learn the peer's MAC and then forwards directly at Layer 2. So when vm1 initiates access to vm2 and the packet arrives at the vSwitch, the vSwitch looks up its MAC table, finds no MAC for 10.0.0.2, and floods an ARP broadcast out of all ports. When vxlan0 receives the ARP broadcast, it hands the data to the VxLAN driver, which, from vxlan0's configuration, knows that the destination of the VxLAN tunnel is 192.168.10.23. The VxLAN driver then encapsulates the entire Layer 2 ARP frame as VxLAN payload, calls the TCP/IP protocol stack to wrap it in UDP and then IP, and once the link layer is done, the packet is written to the transmit buffer; finally the kernel has the DMA chip send the buffered data to the NIC, and the packet enters the physical network. The packet on the physical network looks like this:

  
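To reproduce a capture like this yourself, listening on the physical NIC for the VxLAN UDP port is enough (a sketch; the NIC name is an assumption, 4789 is the IANA-assigned VxLAN port, and some older OVS builds defaulted to 8472):

```shell
# Show the outer IP/UDP headers; modern tcpdump also decodes the inner frame.
tcpdump -nn -i eth0 udp port 4789
```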

  As you can see, the physical-network packet has source IP 192.168.10.15 and destination IP 192.168.10.23. This is direct Layer 2 forwarding, call it extending the "large Layer 2", and this mode is also known as the VxLAN Layer 3 gateway mode: when vm1 accesses vm2, from vm2's point of view its VxLAN gateway is vxlan0 on Server1. Next, let's look at VxLAN extending VLANs, also known as the VxLAN Layer 2 gateway; "Layer 2" and "Layer 3" here are relative terms.

  

  We know very well that VLANs are isolated from each other: p0 has been assigned to VLAN 101, while the vxlan0 interface sits in the default VLAN 0, so how can the two communicate? In fact, the vxlan0 interface is automatically configured as a trunk port, so vxlan0 can still receive the Layer 2 broadcast frames sent by vm1. The figure below shows the packet structure of VxLAN carrying a Layer 2 VLAN, i.e. the so-called VLAN extension.

  

   If you want to test the topology above, you can refer to the OpenVSwitch configuration below; it was tested in a VMware environment:

  1. Install OpenVSwitch
    yum install openvswitch

  2. Start the OpenVSwitch service
    systemctl start openvswitch

  3. Create the vSwitch
    ovs-vsctl add-br br-vxlan   # br-vxlan is the vSwitch's name; any name will do.

  4. Create the VxLAN virtual port vxlan0
    # On Server0, configure it like this:
      ovs-vsctl add-port br-vxlan tun0 -- set interface tun0 type=vxlan options:remote_ip=192.168.10.23
    # This creates a VxLAN-type interface named tun0.
    # 'set interface' actually modifies the Interface table in the vSwitch configuration database.
    # On OpenVSwitch 2.0.0 you can only view it like this:
      ovsdb-client dump Open_vSwitch   # dump everything in the default Open_vSwitch database.

    # On OpenVSwitch 2.5.x and later you can view the Interface table on its own:
      ovsdb-client dump Open_vSwitch Interface

    # On Server1, configure it like this:
      ovs-vsctl add-port br-vxlan tun0 -- set interface tun0 type=vxlan options:remote_ip=192.168.10.15

    # To test GRE instead, change type=gre, or create a GRE tunnel with ip link.
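The ip link variant of the GRE test mentioned above might look like this (a sketch; the device name and addresses are illustrative assumptions, and gre1 is used because loading the ip_gre module auto-creates gre0):

```shell
# GRE tunnel to the peer; local/remote are the physical-network addresses.
ip link add gre1 type gre local 192.168.10.15 remote 192.168.10.23
ip link set gre1 up
```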


  View the configuration:
    ovs-vsctl show

  5. Create the virtual port p0
    ovs-vsctl add-port br-vxlan p0 -- set interface p0 type=internal
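If you want to reproduce the VLAN-extension (Layer 2 gateway) scenario shown earlier, p0 can additionally be made an access port in VLAN 101 on the OVS side (a sketch; optional for the basic Layer 3 test):

```shell
# Tag p0 into VLAN 101; the tunnel port remains a trunk by default.
ovs-vsctl set port p0 tag=101
```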

  6. Create a network namespace to act as a simple virtual machine
    ip netns add vm1

  # Next, run a virtual cable connecting vm1 to port p0 on vSwitch br-vxlan.
  # Note: p0 is actually moved into vm1. By default vm1 only has a lo interface inside,
  #  so we unplug a port visible on the host and install it into vm1.
  # For more on network namespaces, see: container fundamentals.
    ip link set p0 netns vm1

  # Then configure an IP address on the p0 port now inside vm1:
    ip netns exec vm1 ifconfig p0 10.0.0.1/24 up

  # View the configuration:
    ip netns exec vm1 ifconfig

  7. Now you can capture packets on VMware Network Adapter VMnet1.
   # At this point, inside vm1 on Server0:
    ip netns exec vm1 ping 10.0.0.2

   # Capture on VMnet1 to observe the VxLAN Layer 3 gateway effect.
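Besides packet capture, you can confirm that the bridge has learned the remote MAC over the tunnel (a sketch; assumes ovs-appctl can reach the running ovs-vswitchd):

```shell
# Dump br-vxlan's MAC learning table; after the ping, 10.0.0.2's MAC
# should appear associated with the tunnel port.
ovs-appctl fdb/show br-vxlan
```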

  8. Next, look at the Layer 2 gateway effect:
    ovs-ofctl show br-vxlan

    

    # Create an action set in the flow table. [Note: configure this on Server0 or Server1 first, then ping and capture.]
    ovs-ofctl add-flow br-vxlan "priority=10,in_port=2,dl_vlan=0xffff,actions=mod_vlan_vid:101,normal"
  # Brief explanation:
  #   priority: rule priority; the larger the number, the higher the priority. Default 32768, range 0~65535.
  #             Setting it explicitly is recommended, to avoid behavior contrary to what we intend.
  #   in_port: match traffic entering the vSwitch on the given port.
  #   dl_vlan=0xffff: matches packets without a VLAN tag; to match a specific VLAN ID, give it directly, e.g. dl_vlan=101.
  #   actions: the set of actions executed when the preceding conditions match.
  #   mod_vlan_vid:101: set the VLAN ID to 101.
  #   normal: subject the packet to the device's normal L2/L3 processing. (Not all OpenFlow switches implement this action.)

  ovs-ofctl add-flow br-vxlan "priority=8,in_port=3,dl_vlan=101,actions=strip_vlan,output:3"
    # dl_vlan: match packets carrying the given VLAN ID.
    # strip_vlan: remove the VLAN header.
    # output:3: send the matched traffic out of the interface numbered 3.

 The two flow rules above tag the traffic coming out of vm1 or vm2 with VLAN ID 101, and, when the peer's reply arrives, strip the VLAN 101 header to restore a normal IP packet before delivering it to vm1 or vm2.
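To confirm the two rules are installed and actually matching traffic, the flow table can be dumped (a sketch):

```shell
# List installed flows; the n_packets counters should increase as the ping runs.
ovs-ofctl dump-flows br-vxlan
```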

  The figure below shows the port VxLAN listens on:

    
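The listening UDP port can also be checked from the shell (a sketch; on kernel-datapath OVS the VxLAN socket is owned by the kernel, so no process name is shown):

```shell
# VxLAN's IANA port is 4789; some older OVS builds used 8472.
ss -lun | grep -E '4789|8472'
```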

   # Also, if you want to see what the packets on vxlan0 look like, do this.
  # On OpenVSwitch 2.0.0 and later, configure it like this:
  [root@s23 ~]# ovs-vsctl -- set Bridge br-vxlan mirrors=@my1 \
            -- --id=@m1 get port tun0 -- --id=@a1 get port p1 \
            -- --id=@my1 create mirror name=t1 select-dst-port=@m1 select-src-port=@m1 output-port=@a1
    # Explanation:
    # The above actually configures a port SPAN.
    # SPAN: Switched Port Analyzer, a switch port-mirroring technology whose purpose is to
    #   copy the traffic entering and leaving certain ports to a designated port for protocol analysis.
    #   Note: port monitoring comes in two kinds: local SPAN and remote SPAN (RSPAN).
    #   RSPAN usually uses a VLAN to copy port traffic across switches.
    # What is configured above is a local SPAN.
    # Since OVS 2.0.0, the configuration style changed:
    # 1. First create a mirror ID on the given vSwitch.
    # 2. Define a reference ID both for each port to be monitored and for the port the copies go to.
    # 3. Create the mirror: name it with name=; select-dst-port and select-src-port set which
    #    ports to monitor, using the reference IDs (several may be given, comma-separated).
    #    If you only want traffic entering the port, specify dst; only traffic leaving it, specify src.
    # 4. output-port sets the port the monitored traffic is copied to.
      [Note: p1 is an additional port; create it yourself following the earlier steps.]

  [root@s23 ~]# ovs-vsctl list mirror
    _uuid : 85df4437-c903-4110-9061-1d54d36c0427
    external_ids : {}
    name : "t1"
    output_port : 206ace18-b3e6-4423-9216-5088a60b82c1
    output_vlan : []
    select_all : false
    select_dst_port : [8e1dd937-dd97-4446-89c8-af304ff960a5]
    select_src_port : [8e1dd937-dd97-4446-89c8-af304ff960a5]
    select_vlan : []
    statistics : {tx_bytes=0, tx_packets=0}
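When you are done analyzing, the mirror can be detached and deleted again (a sketch):

```shell
# Remove all mirrors from br-vxlan; the mirror record is garbage-collected.
ovs-vsctl clear Bridge br-vxlan mirrors
```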


The figure below, found online, illustrates the VxLAN communication flow very well:
  Note:
  Step 2: when multiple VxLAN tunnels are configured, there is initially no table mapping the peer's MAC, VNI, and IP. So VxLAN sends the packet to the default multicast address 239.1.1.1, reaching all VxLAN tunnel interfaces.
  Step 3: when the multicast packet arrives, both VTEP-3 and VTEP-2 receive it. Each first learns a (MAC, VNI, IP) mapping entry, then strips the VxLAN header, restores the original ARP broadcast, and forwards it into its own LAN.

  
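The multicast-based learning described above can also be reproduced with a plain Linux VTEP (a sketch; the device name, VNI, and NIC are illustrative assumptions):

```shell
# Join multicast group 239.1.1.1 instead of naming a fixed remote; peers are
# discovered via flooded ARP/unknown-unicast, and learned mappings land in the FDB.
ip link add vxlan0 type vxlan id 100 group 239.1.1.1 dev eth0 dstport 4789
ip link set vxlan0 up
bridge fdb show dev vxlan0   # inspect learned (MAC -> remote VTEP) entries
```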

Finally, a word on VxLAN under the cloud
  VxLAN itself is merely a tunneling protocol. It was brought into network virtualization because it solves the overlay-network problem, becoming one functional building block of SDN (Software-Defined Networking). SDN is an overall conceptual framework for network virtualization: it is not a concrete technology but an idea, a theory of how to use software to virtualize the physical network.
  Under the SDN idea, the best-known API realization today is OpenFlow, which I explain in more detail in "OpenFlow and SDN fundamentals". My own ability is limited and I cannot analyze it in full depth; I can only try to make the rough principles clear to every serious reader. Because the SDN idea is so vast, OpenFlow does not abstract the entire TCP/IP protocol stack into APIs; this is worth knowing. With OpenFlow's popularity, much software now implements the OpenFlow protocol; OpenVSwitch is one example.
  The two figures below are taken from: https://www.cnblogs.com/sammyliu/p/4627230.html
  They depict the SDN idea; here is a brief explanation:
    First, the SDN idea abstracts the physical network and splits it into a control plane and a data plane.
    OpenFlow makes this more concrete: it maps the physical network onto a PC-like architecture, i.e. a hardware layer, kernel space, and user space.
    From user space's point of view, user space is the control plane and kernel space is the data plane (the provider of data).
    From kernel space's point of view, the hardware layer is the data plane (it actually stores the policy data and executes the actions), while kernel space is the control plane (it pushes the policies down).

  With these concepts in place, we can continue with the figure below.
  The Service Node is the maker of the actual forwarding policy, i.e. the control plane, while the concrete vxlan driver is the executor running in "hardware"; think of it as a physical device that has only the vxlan function, i.e. the data plane. When VM1 sends data to the vxlan101 interface, the vxlan driver receives the user's traffic but finds it does not know how to forward it, because at initialization there is no (VNI, VTEP, MAC) mapping table. So it forwards the packet in full to the controller (the control plane), which is responsible for the forwarding decision. If the controller queries its database and also finds no matching (VNI, VTEP, MAC) entry, it pushes down a policy telling vxlan101 (the VTEP) to multicast the request to 239.1.1.1; every tunnel interface that has joined the same multicast group receives it and reacts accordingly. Suppose the actual payload inside this VxLAN packet is an ARP query: the target replies with an ARP response, and when vxlan101 (3.3.3.3) receives the response, it forwards it to the control plane; the controller learns a (VNI, VTEP, MAC) entry from the response and again tells the data plane how to forward. After this one round trip, vxlan101 (3.3.3.3), the actual executor, has cached a forwarding policy: the next time vm1 sends traffic, it can forward directly without passing the packet to the controller.
  The figure below is for reference only.

  

  

 


Origin www.cnblogs.com/wn1m/p/VxLAN.html