Establishment of a VXLAN in Centralized Gateway Mode Using BGP EVPN

During the establishment of a VXLAN in centralized gateway mode using BGP EVPN, the control plane process includes:

VXLAN tunnel establishment
Dynamic MAC address learning

The forwarding plane process includes:

Intra-subnet forwarding of known unicast packets
Intra-subnet forwarding of BUM packets
Inter-subnet packet forwarding

This mode uses EVPN to automatically discover VTEPs and dynamically establish VXLAN tunnels. It is highly flexible, suits large-scale VXLAN networking scenarios, and is recommended for establishing VXLANs with centralized gateways.

The following uses an IPv4 over IPv4 network as an example. Table 1 shows the implementation differences between IPv4 over IPv4 networks and other combinations of underlay and overlay networks.

**Table 1** Implementation differences
Combination Type	Implementation Difference
IPv6 over IPv4	During dynamic MAC address learning, the Layer 2 gateway learns the local host's MAC address through neighbor discovery. Hosts at both ends learn each other's MAC address by exchanging Neighbor Solicitation (NS)/Neighbor Advertisement (NA) packets. In the inter-subnet interworking scenario, an IPv6 address must be configured for the Layer 3 gateway's VBDIF interface. During inter-subnet packet forwarding, the Layer 3 gateway needs to search its IPv6 routing table for the next hop address of the destination IPv6 address, query the ND table based on the next hop address, and then obtain information such as the destination MAC address.
IPv4 over IPv6	A BGP EVPN IPv6 peer relationship is established between gateways. The VTEP IP addresses are IPv6 addresses.
IPv6 over IPv6	A BGP EVPN IPv6 peer relationship is established between gateways. The VTEP IP addresses are IPv6 addresses. During dynamic MAC address learning, the Layer 2 gateway learns the local host's MAC address through neighbor discovery. Hosts at both ends learn each other's MAC address by exchanging NS/NA packets. In the inter-subnet interworking scenario, an IPv6 address must be configured for the Layer 3 gateway's VBDIF interface. During inter-subnet packet forwarding, the Layer 3 gateway needs to search its IPv6 routing table for the next hop address of the destination IPv6 address, query the ND table based on the next hop address, and then obtain information such as the destination MAC address.

VXLAN Tunnel Establishment

A VXLAN tunnel is identified by a pair of VTEP IP addresses. During VXLAN tunnel establishment, the local and remote VTEPs attempt to obtain each other's IP addresses. A VXLAN tunnel can be established if the obtained IP addresses are routable at Layer 3. When BGP EVPN is used to dynamically establish a VXLAN tunnel, the local and remote VTEPs first establish a BGP EVPN peer relationship and then exchange BGP EVPN routes to transmit VNIs and VTEP IP addresses.

As shown in Figure 1, two hosts connect to Leaf1, one host connects to Leaf2, and a Layer 3 gateway is deployed on the spine node. A VXLAN tunnel needs to be established between Leaf1 and Leaf2 to implement communication between Host3 and Host2. To implement communication between Host1 and Host2, a VXLAN tunnel needs to be established between Leaf1 and Spine and between Spine and Leaf2. Though Host1 and Host3 both connect to Leaf1, they belong to different subnets and need to communicate through the Layer 3 gateway deployed on Spine. Therefore, a VXLAN tunnel needs to be created between Leaf1 and Spine.

A VXLAN tunnel is determined by a pair of VTEP IP addresses. When a local VTEP receives the same remote VTEP IP address repeatedly, only one VXLAN tunnel can be established, but packets are encapsulated with different VNIs before being forwarded through the tunnel.
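
The relationship between tunnels and VNIs described above can be sketched as follows. This is a minimal illustration, not device behavior: the `TunnelTable` class, its `learn` method, and the VTEP IP addresses are hypothetical names and values chosen for the example.

```python
from dataclasses import dataclass, field


@dataclass
class TunnelTable:
    """Hypothetical model: one tunnel per (local VTEP, remote VTEP) pair."""
    local_vtep: str
    # remote VTEP IP -> set of VNIs carried over the single tunnel to that VTEP
    tunnels: dict = field(default_factory=dict)

    def learn(self, remote_vtep: str, vni: int) -> bool:
        """Record a (remote VTEP, VNI) advertisement.

        Returns True only when the remote VTEP is new, that is, when a
        tunnel would actually be created.
        """
        created = remote_vtep not in self.tunnels
        self.tunnels.setdefault(remote_vtep, set()).add(vni)
        return created


table = TunnelTable(local_vtep="1.1.1.1")   # assumed address
first = table.learn("2.2.2.2", 10)          # new remote VTEP: tunnel created
second = table.learn("2.2.2.2", 20)         # same VTEP again: tunnel is reused
```

After both calls, the table holds a single tunnel to `2.2.2.2` that carries VNIs 10 and 20, which mirrors the statement that packets are encapsulated with different VNIs over one tunnel.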

Figure 1 VXLAN tunnel networking

The following example illustrates how to dynamically establish a VXLAN tunnel using BGP EVPN between Leaf1 and Leaf2 on the network shown in Figure 2.

Figure 2 Dynamic VXLAN tunnel establishment

First, a BGP EVPN peer relationship is established between Leaf1 and Leaf2. Then, Layer 2 broadcast domains are created on Leaf1 and Leaf2, and VNIs are bound to the Layer 2 broadcast domains. Next, an EVPN instance is configured in each Layer 2 broadcast domain, and an RD, export VPN target (ERT), and import VPN target (IRT) are configured for the EVPN instance. After the local VTEP IP address is configured on Leaf1 and Leaf2, they generate a BGP EVPN route and send it to each other. The BGP EVPN route carries the local EVPN instance's ERT, Next_Hop attribute, and an inclusive multicast route (Type 3 route defined in BGP EVPN). Figure 3 shows the format of an inclusive multicast route, which comprises a prefix and a PMSI attribute. VTEP IP addresses are stored in the Originating Router's IP Address field in the inclusive multicast route prefix, and VNIs are stored in the MPLS Label field in the PMSI attribute. The VTEP IP address is also included in the Next_Hop attribute.

Figure 3 Format of an inclusive multicast route
After Leaf1 and Leaf2 receive a BGP EVPN route from each other, they match the ERT of the route against the IRT of the local EVPN instance. If a match is found, the route is accepted. If no match is found, the route is discarded. Leaf1 and Leaf2 obtain the peer VTEP IP address (from the Next_Hop attribute) and VNI carried in the route. If the peer VTEP IP address is reachable at Layer 3, they establish a VXLAN tunnel to the peer end. Moreover, the local end creates a VNI-based ingress replication table and adds the peer VTEP IP address to the table for forwarding BUM packets.
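
The receive-side handling of an inclusive multicast route can be sketched as below. The class and field names, the RT values, and the addresses are assumptions made for illustration. The `reachable` set stands in for a real underlay route lookup, and the sketch assumes that the ingress replication entry is added only when the tunnel can be established, which the text does not state explicitly.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class InclusiveMulticastRoute:
    """Simplified Type 3 route; field names are illustrative."""
    originating_ip: str     # Originating Router's IP Address (VTEP IP)
    vni: int                # carried in the MPLS Label field of the PMSI attribute
    next_hop: str           # Next_Hop attribute (also the VTEP IP)
    export_rts: frozenset   # ERTs of the advertising EVPN instance


@dataclass
class Vtep:
    ip: str
    import_rts: frozenset                          # IRTs of the local EVPN instance
    reachable: set = field(default_factory=set)    # stand-in for an underlay lookup
    tunnels: set = field(default_factory=set)      # remote VTEP IPs with a tunnel
    ingress_replication: dict = field(default_factory=dict)  # VNI -> peer VTEP IPs

    def receive(self, route: InclusiveMulticastRoute) -> bool:
        """Return True if the route is accepted, False if it is discarded."""
        if not (self.import_rts & route.export_rts):
            return False                           # ERT does not match local IRT
        peer = route.next_hop
        if peer in self.reachable:                 # peer VTEP reachable at Layer 3
            self.tunnels.add(peer)
            peers = self.ingress_replication.setdefault(route.vni, [])
            if peer not in peers:
                peers.append(peer)
        return True


leaf1 = Vtep(ip="1.1.1.1", import_rts=frozenset({"100:1"}), reachable={"2.2.2.2"})
from_leaf2 = InclusiveMulticastRoute("2.2.2.2", 10, "2.2.2.2", frozenset({"100:1"}))
foreign = InclusiveMulticastRoute("3.3.3.3", 10, "3.3.3.3", frozenset({"999:9"}))
accepted = leaf1.receive(from_leaf2)   # matching RT: tunnel and replication entry
rejected = leaf1.receive(foreign)      # no matching RT: route discarded
```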

The process of dynamically establishing VXLAN tunnels between Leaf1 and Spine and between Leaf2 and Spine using BGP EVPN is similar to the preceding process.

A VPN target is an extended community attribute of BGP. An EVPN instance can have the IRT and ERT configured. The local EVPN instance's ERT must match the remote EVPN instance's IRT for EVPN route advertisement. If not, VXLAN tunnels cannot be dynamically established. If only one end can successfully accept the BGP EVPN route, this end can establish a VXLAN tunnel to the other end, but cannot exchange data packets with the other end. The other end drops packets after confirming that there is no VXLAN tunnel to the end that has sent these packets.
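
The one-sided case can be made concrete with a small sketch. The RT values and the `accepts` helper are hypothetical; the ERT on Leaf2 is deliberately misconfigured so that only Leaf2 accepts its peer's route.

```python
def accepts(import_rts: set, export_rts: set) -> bool:
    """A route is accepted when any ERT it carries matches a local IRT."""
    return bool(import_rts & export_rts)


leaf1 = {"ert": {"100:1"}, "irt": {"100:1"}}
leaf2 = {"ert": {"100:2"}, "irt": {"100:1"}}   # ERT misconfigured on purpose

# Leaf2 accepts Leaf1's route and can build a tunnel toward Leaf1.
leaf2_builds_tunnel = accepts(leaf2["irt"], leaf1["ert"])
# Leaf1 discards Leaf2's route, so it has no tunnel toward Leaf2.
leaf1_builds_tunnel = accepts(leaf1["irt"], leaf2["ert"])
# Data exchange needs a tunnel on both ends.
can_exchange_data = leaf1_builds_tunnel and leaf2_builds_tunnel
```

In this sketch, Leaf2 ends up with a tunnel while Leaf1 does not, so packets that Leaf2 sends would be dropped by Leaf1.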

For details about VPN targets, see Basic BGP/MPLS IP VPN.

Dynamic MAC Address Learning

VXLAN supports dynamic MAC address learning to allow communication between tenants. MAC address entries are dynamically created and do not need to be manually maintained, greatly reducing maintenance workload. The following example illustrates dynamic MAC address learning for intra-subnet communication of hosts on the network shown in Figure 4.

Figure 4 Dynamic MAC address learning

Host3 sends dynamic ARP packets when it first communicates with Leaf1. Leaf1 learns the MAC address of Host3 and the mapping between the BDID and packet inbound interface (that is, the physical interface Port 1 corresponding to the Layer 2 sub-interface), and generates a MAC address entry for Host3 in the local MAC address table, with the outbound interface being Port 1. Leaf1 then generates a BGP EVPN route based on the ARP entry of Host3 and sends it to Leaf2. The BGP EVPN route carries the local EVPN instance's ERT, Next_Hop attribute, and a Type 2 route (MAC/IP route) defined in BGP EVPN. The Next_Hop attribute carries the local VTEP's IP address. The MAC Address Length and MAC Address fields identify Host3's MAC address. The Layer 2 VNI is stored in the MPLS Label1 field. Figure 5 shows the format of a MAC/IP route.

Figure 5 Format of a MAC/IP route
After receiving the BGP EVPN route from Leaf1, Leaf2 matches the ERT of the EVPN instance carried in the route against the IRT of the local EVPN instance. If a match is found, the route is accepted. If no match is found, the route is discarded. After accepting the route, Leaf2 obtains the MAC address of Host3 and the mapping between the BDID and the VTEP IP address (Next_Hop attribute) of Leaf1, and generates a MAC address entry for Host3 in the local MAC address table. The outbound interface is resolved by recursion based on the next hop, and the final recursion result is the VXLAN tunnel destined for Leaf1.

Leaf1 learns the MAC route of Host2 in a similar process.
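
The learning sequence above can be sketched as follows. All names (`Leaf`, `MacIpRoute`, `learn_local`, `learn_remote`), the MAC address, the RT value, and the VTEP addresses are hypothetical, and a single RT set is used as both ERT and IRT for brevity.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MacIpRoute:
    """Simplified Type 2 route; field names are illustrative."""
    mac: str               # MAC Address field
    l2_vni: int            # carried in the MPLS Label1 field
    next_hop: str          # Next_Hop attribute: advertising VTEP's IP address
    export_rts: frozenset


class Leaf:
    def __init__(self, vtep_ip, rts, bd_to_vni, tunnels=None):
        self.vtep_ip = vtep_ip
        self.rts = frozenset(rts)           # used as both ERT and IRT here
        self.bd_to_vni = dict(bd_to_vni)    # BD ID -> Layer 2 VNI
        self.tunnels = dict(tunnels or {})  # remote VTEP IP -> tunnel name
        self.mac_table = {}                 # (BD ID, MAC) -> outbound interface

    def learn_local(self, bd_id, mac, port) -> MacIpRoute:
        """Learn a host MAC on an access port and build the route to advertise."""
        self.mac_table[(bd_id, mac)] = port
        return MacIpRoute(mac, self.bd_to_vni[bd_id], self.vtep_ip, self.rts)

    def learn_remote(self, route: MacIpRoute) -> bool:
        """Install a remote MAC; the outbound interface recurses to a tunnel."""
        if not (self.rts & route.export_rts):
            return False                    # ERT does not match local IRT
        bd_id = next((bd for bd, vni in self.bd_to_vni.items()
                      if vni == route.l2_vni), None)
        tunnel = self.tunnels.get(route.next_hop)   # recursion on the next hop
        if bd_id is None or tunnel is None:
            return False
        self.mac_table[(bd_id, route.mac)] = tunnel
        return True


leaf1 = Leaf("1.1.1.1", {"100:1"}, {10: 10})
leaf2 = Leaf("2.2.2.2", {"100:1"}, {10: 10}, tunnels={"1.1.1.1": "tunnel-to-Leaf1"})
route = leaf1.learn_local(10, "00:00:00:00:00:03", "Port 1")   # Host3 on Leaf1
learned = leaf2.learn_remote(route)
```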

When hosts on different subnets communicate with each other, only the hosts and Layer 3 gateway need to dynamically learn MAC addresses from each other. This process is similar to the preceding process.

Leaf nodes can learn the MAC addresses of hosts during data forwarding, depending on their capabilities to learn MAC addresses from data packets. If VXLAN tunnels are established using BGP EVPN, leaf nodes can dynamically learn the MAC addresses of hosts through BGP EVPN routes, rather than during data forwarding.

Intra-subnet Forwarding of Known Unicast Packets

Intra-subnet known unicast packets are forwarded only between Layer 2 VXLAN gateways and are unknown to Layer 3 VXLAN gateways. Figure 6 shows the forwarding process of known unicast packets.

Figure 6 Intra-subnet forwarding of known unicast packets

After Leaf1 receives a packet from Host3, it determines the Layer 2 broadcast domain of the packet based on the access interface and VLAN information, and searches for the outbound interface and encapsulation information in the broadcast domain.
Leaf1's VTEP performs VXLAN encapsulation based on the obtained encapsulation information and forwards the packet through the outbound interface obtained.
After the VTEP on Leaf2 receives the VXLAN packet, it checks the UDP destination port number, source and destination IP addresses, and VNI of the packet to determine the packet validity. Leaf2 obtains the Layer 2 broadcast domain based on the VNI and performs VXLAN decapsulation to obtain the inner Layer 2 packet.
Leaf2 obtains the destination MAC address of the inner Layer 2 packet, adds a VLAN tag to the packet based on the outbound interface and encapsulation information in the local MAC address table, and forwards the packet to Host2.

Host2 sends packets to Host3 in the same process.
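
The steps above can be sketched as follows. The table contents, MAC addresses, port names, and function names are hypothetical. The 8-byte VXLAN header layout and UDP destination port 4789 follow RFC 7348; the source and destination IP address checks mentioned in the text are omitted for brevity.

```python
import struct

VXLAN_UDP_PORT = 4789        # IANA-assigned VXLAN destination port (RFC 7348)
VXLAN_FLAG_VNI_VALID = 0x08  # "I" flag: the VNI field is valid


def vxlan_encap(vni: int, inner_frame: bytes) -> bytes:
    """Prepend the 8-byte VXLAN header (flags, reserved, 24-bit VNI, reserved)."""
    return struct.pack("!II", VXLAN_FLAG_VNI_VALID << 24, vni << 8) + inner_frame


def vxlan_decap(udp_dst_port: int, payload: bytes):
    """Return (vni, inner_frame), or None if the packet is not valid VXLAN."""
    if udp_dst_port != VXLAN_UDP_PORT or len(payload) < 8:
        return None
    flags_word, vni_word = struct.unpack("!II", payload[:8])
    if not (flags_word >> 24) & VXLAN_FLAG_VNI_VALID:
        return None
    return vni_word >> 8, payload[8:]


HOST2_MAC = bytes.fromhex("000000000002")   # assumed addresses
HOST3_MAC = bytes.fromhex("000000000003")

LEAF1_ACCESS_TO_BD = {("Port 1", 10): 10}               # (port, VLAN) -> BD
LEAF1_BD_TO_VNI = {10: 10}
LEAF1_MAC_TABLE = {(10, HOST2_MAC): "2.2.2.2"}          # remote MAC -> peer VTEP
LEAF2_VNI_TO_BD = {10: 10}
LEAF2_MAC_TABLE = {(10, HOST2_MAC): ("Port 2", 10)}     # local MAC -> (port, VLAN)


def leaf1_forward(port, vlan, frame):
    """Steps 1 and 2: find the BD, look up the destination MAC, encapsulate."""
    bd = LEAF1_ACCESS_TO_BD[(port, vlan)]
    remote_vtep = LEAF1_MAC_TABLE[(bd, frame[:6])]      # first 6 bytes: dst MAC
    return remote_vtep, vxlan_encap(LEAF1_BD_TO_VNI[bd], frame)


def leaf2_receive(udp_dst_port, payload):
    """Steps 3 and 4: validate, decapsulate, and pick the access port and VLAN."""
    decoded = vxlan_decap(udp_dst_port, payload)
    if decoded is None:
        return None
    vni, frame = decoded
    entry = LEAF2_MAC_TABLE.get((LEAF2_VNI_TO_BD.get(vni), frame[:6]))
    if entry is None:
        return None
    port, vlan = entry
    return port, vlan, frame


frame = HOST2_MAC + HOST3_MAC + b"\x08\x00" + b"payload"   # Host3 -> Host2
remote_vtep, vxlan_packet = leaf1_forward("Port 1", 10, frame)
delivered = leaf2_receive(VXLAN_UDP_PORT, vxlan_packet)
```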

Intra-subnet Forwarding of BUM Packets

Intra-subnet BUM packets are forwarded only between Layer 2 VXLAN gateways, and are unknown to Layer 3 VXLAN gateways. Intra-subnet BUM packets can be forwarded in ingress replication mode. In this mode, when a BUM packet enters a VXLAN tunnel, the access-side VTEP performs VXLAN encapsulation, and then forwards the packet to all egress VTEPs that are in the ingress replication list. When the BUM packet leaves the VXLAN tunnel, the egress VTEP decapsulates the packet. Figure 7 shows the forwarding process of BUM packets.

Figure 7 Intra-subnet forwarding of BUM packets in ingress replication mode

After Leaf1 receives a packet from TerminalA, it determines the Layer 2 broadcast domain of the packet based on the access interface and VLAN information in the packet.
Leaf1's VTEP obtains the ingress replication list for the VNI, replicates the packet based on the list, and performs VXLAN encapsulation. Leaf1 then forwards the VXLAN packet through the outbound interface.
After the VTEP on Leaf2 or Leaf3 receives the VXLAN packet, it checks the UDP destination port number, source and destination IP addresses, and VNI of the packet to determine the packet validity. Leaf2 or Leaf3 obtains the Layer 2 broadcast domain based on the VNI and performs VXLAN decapsulation to obtain the inner Layer 2 packet.
Leaf2 or Leaf3 checks the destination MAC address of the inner Layer 2 packet and finds that it is a BUM MAC address. Therefore, Leaf2 or Leaf3 broadcasts the packet onto the network connected to terminals (not the VXLAN tunnel side) in the Layer 2 broadcast domain. Specifically, Leaf2 or Leaf3 finds the outbound interfaces and encapsulation information not related to the VXLAN tunnel, adds VLAN tags to the packet, and forwards the packet to TerminalB or TerminalC.

The forwarding process of a response packet from TerminalB/TerminalC to TerminalA is similar to the intra-subnet forwarding process of known unicast packets.
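
Ingress replication can be sketched as follows. The function names, addresses, and port names are hypothetical, and VXLAN encapsulation is reduced to tagging each copy with the peer VTEP and VNI. The broadcast/multicast test uses the group bit of the first MAC octet, and unknown unicast is modeled as a destination missing from the MAC table.

```python
BROADCAST = bytes.fromhex("ffffffffffff")


def is_bum(dst_mac: bytes, known_macs: set) -> bool:
    """Broadcast or multicast (group bit set), or unicast with no MAC entry."""
    return bool(dst_mac[0] & 0x01) or dst_mac not in known_macs


def ingress_replicate(vni, frame, replication_lists):
    """Ingress VTEP: one copy per peer VTEP in the VNI's replication list."""
    return [(peer, vni, frame) for peer in replication_lists.get(vni, [])]


def egress_flood(bd, frame, access_ports):
    """Egress VTEP: flood to access ports of the BD only, never back to tunnels."""
    return [(port, vlan, frame) for port, vlan in access_ports.get(bd, [])]


replication_lists = {10: ["2.2.2.2", "3.3.3.3"]}   # Leaf2 and Leaf3 (assumed IPs)
leaf2_access_ports = {10: [("Port 1", 10)]}        # BD -> [(port, VLAN)]
frame = BROADCAST + bytes.fromhex("00000000000a") + b"\x08\x06" + b"arp-request"

copies = (ingress_replicate(10, frame, replication_lists)
          if is_bum(frame[:6], set()) else [])
flooded = egress_flood(10, frame, leaf2_access_ports)
```

Because the ingress VTEP sends one unicast copy per peer, the replication cost grows with the number of VTEPs in the VNI.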

Inter-subnet Packet Forwarding

Inter-subnet packets must be forwarded through a Layer 3 gateway. Figure 8 shows the inter-subnet packet forwarding process in centralized VXLAN gateway scenarios.

Figure 8 Inter-subnet packet forwarding

After Leaf1 receives a packet from Host1, it determines the Layer 2 broadcast domain of the packet based on the access interface and VLAN in the packet, and searches for the outbound interface and encapsulation information in the Layer 2 broadcast domain.
The VTEP on Leaf1 performs VXLAN tunnel encapsulation based on the outbound interface and encapsulation information, and forwards the packet to Spine.
Spine decapsulates the received VXLAN packet, finds that the destination MAC address in the inner packet is MAC3 of the Layer 3 gateway interface VBDIF10, and determines that the packet needs to be forwarded at Layer 3.
Spine removes the Ethernet header of the inner packet and parses the destination IP address. It then searches the routing table based on the destination IP address to obtain the next hop address, and searches ARP entries based on the next hop to obtain the destination MAC address, VXLAN tunnel outbound interface, and VNI.
Spine re-encapsulates the VXLAN packet and forwards it to Leaf2. The source MAC address in the Ethernet header of the inner packet is MAC4 of the Layer 3 gateway interface VBDIF20.
After the VTEP on Leaf2 receives the VXLAN packet, it checks the UDP destination port number, source and destination IP addresses, and VNI of the packet to determine the packet validity. The VTEP then obtains the Layer 2 broadcast domain based on the VNI, decapsulates the packet to obtain the inner Layer 2 packet, and searches for the outbound interface and encapsulation information in the corresponding Layer 2 broadcast domain.
Leaf2 adds a VLAN tag to the packet based on the outbound interface and encapsulation information, and forwards the packet to Host2.

Host2 sends packets to Host1 through a similar process.
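
The Layer 3 gateway's part of this process can be sketched as follows. MAC3, MAC4, VBDIF10, and VBDIF20 come from the text; the subnets, host addresses, the `MAC1`/`MAC2` host labels, the VNI values, and the `spine_route` function are assumptions for illustration. Only directly connected subnets are modeled, so the next hop equals the destination IP address.

```python
import ipaddress

# VBDIF interfaces on Spine, keyed by BD ID; subnets are assumed values.
VBDIF = {
    10: {"mac": "MAC3", "subnet": ipaddress.ip_network("192.168.10.0/24")},
    20: {"mac": "MAC4", "subnet": ipaddress.ip_network("192.168.20.0/24")},
}

# ARP entries on Spine: next hop -> destination MAC, VNI, and tunnel.
ARP = {
    ipaddress.ip_address("192.168.20.2"): {"mac": "MAC2", "vni": 20, "tunnel": "Leaf2"},
}


def spine_route(in_bd, inner):
    """Return (tunnel, vni, rewritten inner packet), or None if not forwarded."""
    if inner["dst_mac"] != VBDIF[in_bd]["mac"]:
        return None                      # not addressed to the gateway interface
    dst_ip = ipaddress.ip_address(inner["dst_ip"])
    # Longest-prefix match over the connected subnets.
    matches = [vif for vif in VBDIF.values() if dst_ip in vif["subnet"]]
    if not matches:
        return None
    out_if = max(matches, key=lambda vif: vif["subnet"].prefixlen)
    arp = ARP.get(dst_ip)                # next hop is the destination itself
    if arp is None:
        return None
    # Rewrite the inner Ethernet header before re-encapsulation.
    rewritten = dict(inner, src_mac=out_if["mac"], dst_mac=arp["mac"])
    return arp["tunnel"], arp["vni"], rewritten


from_host1 = {"src_mac": "MAC1", "dst_mac": "MAC3",
              "src_ip": "192.168.10.1", "dst_ip": "192.168.20.2"}
result = spine_route(10, from_host1)
```

In this sketch the packet arrives in BD 10 addressed to MAC3, is routed to the subnet of VBDIF20, and leaves toward Leaf2 with source MAC4 and the VNI taken from the ARP entry.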