VXLAN Forwarding in ACI
ACI forwards Layer 2 and Layer 3 traffic over a VXLAN overlay. In ACI, leaf nodes are called PTEPs (Physical Tunnel End Points); in generic VXLAN terminology, leaf switches are VTEPs (VXLAN Tunnel End Points). Layer 2 (switched) traffic carries a VXLAN Network Identifier (VNID) that identifies the bridge domain, and Layer 3 (routed) traffic carries a VNID that identifies the VRF. The VTEP encapsulates and decapsulates the VXLAN header.
The figure below shows the spine and leaf switches; the leaf switches act as the VTEPs.
VXLAN also separates an endpoint's identity from its location. In Cisco ACI, the endpoint's IP address is the identifier, and a VTEP address designates the location (the leaf) where the endpoint is connected. Cisco ACI uses a dedicated VRF and the uplink interfaces as the infrastructure that carries VXLAN traffic. This transport infrastructure is known as Overlay-1, and it exists as part of tenant Infra.
The Overlay-1 VRF contains /32 routes to each VTEP, each vPC virtual IP address, each APIC, and the spine proxy IP address.
TEP IP addresses
PTEP IP address :- The APIC assigns this address to the leaf's loopback interface from the infrastructure subnet that was configured during the initial APIC setup. It is used for communication with the APIC and with other leaf nodes, for MP-BGP peering, and for traceroute or ping.
Proxy TEP IP address :- This is an anycast IP address that is present on all spines and is used for forwarding lookups in the mapping database.
FTEP IP address :- This address is used when a VMM domain (for example, an ESXi environment) is present. The fabric loopback TEP (FTEP) is used to encapsulate traffic in VXLAN toward a vSwitch VTEP. It is a single address within the fabric that is identical on all leaf nodes, which allows downstream VTEP devices to move between leaves.
vPC loopback VTEP address :- This address is used when the two leaf nodes of a vPC pair forward traffic that enters through a vPC port. The leaf forwards the traffic with VXLAN encapsulation, using this address, which it shares with its vPC peer.
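The identity-to-location mapping and the role of the proxy TEP can be illustrated with a minimal Python sketch. It is not how a leaf is actually implemented: the `Leaf` class, the `outer_destination` method, the endpoint addresses, and the proxy address 10.0.0.65 are all hypothetical, and the sketch assumes that a leaf sends traffic for an unknown endpoint to the spine proxy anycast TEP.

```python
from dataclasses import dataclass, field

# Hypothetical anycast address shared by all spines (illustration only).
SPINE_PROXY_TEP = "10.0.0.65"


@dataclass
class Leaf:
    ptep: str
    # Endpoint IP (identity) -> TEP address (location) learned by this leaf.
    endpoint_table: dict = field(default_factory=dict)

    def outer_destination(self, endpoint_ip: str) -> str:
        """Return the TEP used as the outer destination IP of the VXLAN packet."""
        # Known endpoint: tunnel straight to the TEP where it is attached.
        # Unknown endpoint (assumption): send to the spine proxy TEP, which
        # resolves the location from the mapping database.
        return self.endpoint_table.get(endpoint_ip, SPINE_PROXY_TEP)


leaf = Leaf(ptep="10.0.16.21")
leaf.endpoint_table["192.168.1.10"] = "10.0.16.22"  # endpoint behind another leaf

print(leaf.outer_destination("192.168.1.10"))  # known endpoint
print(leaf.outer_destination("192.168.1.99"))  # unknown endpoint
```

The first lookup resolves to the remote leaf's PTEP, while the second falls back to the proxy address because the leaf has no entry for that endpoint.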
Following are the control-plane protocols running inside the fabric:
- Intermediate System-to-Intermediate System (IS-IS) runs on the interfaces between leaf and spine to maintain infrastructure reachability.
- Council of Oracle Protocol (COOP) runs on the PTEP loopback address and keeps the endpoint database (mapping table) on the spine switches consistent. COOP assigns roles to spines and leaves: all spines are Oracles and all leaves are Citizens. When a Citizen learns an endpoint, it reports it to an Oracle; when an Oracle learns something, it synchronizes it to all other Oracles.
- MP-BGP also runs on the PTEP loopback and advertises external WAN routes throughout the fabric.
- VXLAN tunnels are created from each leaf to the PTEPs of the other leaf nodes and to the spine proxy TEPs.
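The Citizen and Oracle roles in COOP can be sketched as a toy model in Python. This only illustrates the reporting and synchronization flow described above; the class and method names are hypothetical, and the real protocol (message formats, transport, timers) is not modeled.

```python
class Oracle:
    """A spine: holds a copy of the endpoint mapping database."""

    def __init__(self, name):
        self.name = name
        self.mapping = {}  # endpoint IP -> PTEP of the leaf that reported it
        self.peers = []    # the other Oracles in the fabric

    def report(self, endpoint_ip, ptep):
        """Store an endpoint reported by a Citizen and sync it to peer Oracles."""
        self.mapping[endpoint_ip] = ptep
        for peer in self.peers:
            peer.mapping[endpoint_ip] = ptep


class Citizen:
    """A leaf: reports locally learned endpoints to an Oracle."""

    def __init__(self, ptep, oracle):
        self.ptep = ptep
        self.oracle = oracle

    def learn(self, endpoint_ip):
        self.oracle.report(endpoint_ip, self.ptep)


spine1, spine2 = Oracle("spine1"), Oracle("spine2")
spine1.peers.append(spine2)
spine2.peers.append(spine1)

leaf_a = Citizen("10.0.16.21", spine1)
leaf_a.learn("192.168.1.10")  # hypothetical endpoint address

print(spine1.mapping)
print(spine2.mapping)
```

Although the leaf reports to only one spine, both spines end up with the same entry, which is the consistency property COOP is meant to provide.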
Each leaf maintains a database of its VXLAN tunnels to all other leaf nodes on Overlay-1. To verify the tunnels, first look at the fabric inventory.
In this example, we will see how leaf 10.0.16.21 has an overlay tunnel to 10.0.16.22.
VXLAN Headers for ACI Fabric:
In the ACI fabric, extensions have been added to the VXLAN header to support the following features:
- Security zone segmentation (Tenant)
- Management of filtering rules and policies (Contracts / Filters)
- Enhanced load-balancing techniques
The VXLAN header used in the Cisco ACI fabric is shown below:
When a packet is VXLAN-encapsulated in ACI, the minimum MTU that the fabric ports must support is the original MTU (1500 bytes) plus 50 bytes of encapsulation overhead.
Original MTU (1500) + 14 bytes (outer Ethernet header) + 20 bytes (outer IP header) + 8 bytes (UDP) + 8 bytes (iVXLAN) = 1550 bytes
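The same arithmetic can be written as a short Python helper, which makes it easy to check other payload sizes. The function and dictionary names are illustrative only; the byte counts are the ones listed above.

```python
# Encapsulation overhead added in front of the original packet, in bytes.
OVERHEAD_BYTES = {
    "outer Ethernet header": 14,
    "outer IP header": 20,
    "UDP header": 8,
    "iVXLAN header": 8,
}


def min_fabric_mtu(original_mtu: int) -> int:
    """Return the minimum fabric port MTU for a given original MTU."""
    return original_mtu + sum(OVERHEAD_BYTES.values())


print(min_fabric_mtu(1500))  # 1550
print(min_fabric_mtu(9000))  # 9050
```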
The Cisco ACI fabric uplinks are configured for 9150 bytes, which is large enough to carry encapsulated jumbo frames from servers. The MTU of the fabric access ports is 9000 bytes, to accommodate servers sending jumbo frames.
Cisco uses some additional bits and fields in the VXLAN header for its ACI infrastructure. The ACI VXLAN header carries the following additional fields:
- Source Group: identifies the source EPG.
- Policy (P) bit: when it is set to 0, policy is not instantiated on the leaf; when it is set to 1, policy is instantiated.
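To make the idea of extra header fields concrete, the following Python sketch packs a VNID, a source group, and a policy bit into an 8-byte header and reads them back. The byte layout is an assumption chosen for illustration and is not the authoritative ACI wire format: only the 8-byte header size, the 24-bit VNID, and the 0x08 "VNID valid" flag follow standard VXLAN, while the 16-bit source group width and the position of the policy bit are hypothetical, as are the class and field names.

```python
import struct
from dataclasses import dataclass


@dataclass
class IvxlanFields:
    vnid: int          # 24-bit VXLAN Network Identifier
    source_group: int  # source EPG identifier (16 bits assumed)
    policy_bit: bool   # the P bit

    def pack(self) -> bytes:
        # Illustrative layout (assumption, not the real ACI format):
        #   byte 0    : flags, 0x08 = VNID valid (as in standard VXLAN)
        #   byte 1    : 0x08 = policy bit (assumed position)
        #   bytes 2-3 : source group
        #   bytes 4-6 : VNID
        #   byte 7    : reserved
        if not 0 <= self.vnid < 2**24:
            raise ValueError("VNID must fit in 24 bits")
        if not 0 <= self.source_group < 2**16:
            raise ValueError("source group must fit in 16 bits")
        policy = 0x08 if self.policy_bit else 0x00
        return struct.pack("!BBHI", 0x08, policy, self.source_group, self.vnid << 8)

    @classmethod
    def unpack(cls, data: bytes) -> "IvxlanFields":
        _flags, policy, group, vnid_and_reserved = struct.unpack("!BBHI", data[:8])
        return cls(
            vnid=vnid_and_reserved >> 8,
            source_group=group,
            policy_bit=bool(policy & 0x08),
        )


# Hypothetical values for illustration.
header = IvxlanFields(vnid=15302583, source_group=49153, policy_bit=True)
raw = header.pack()
print(len(raw), raw.hex())
print(IvxlanFields.unpack(raw))
```

Packing and unpacking round-trip to the same values, which shows how a receiving leaf could recover the source EPG and the policy bit from the header alongside the VNID.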