NSX Logical Switch Packet Flow

Posted on Jan 17, 2020 (0)

Logical Switch Packet Walk:

Each packet walk uses Universal Logical Switch 5555 as the broadcast domain. The topology below is used for every packet walk in this section. Each ESXi cluster has three ESXi hosts, and each ESXi host has two powered-on VMs.

  • Cluster 1 VXLAN encapsulation will be on VLAN 10 in DC X
  • Cluster 2 VXLAN encapsulation will be on VLAN 20 in DC X
  • Cluster 3 VXLAN encapsulation will be on VLAN 30 in DC Y

The IP addressing of each ESXi host and its connected VMs is shown in the diagram.

Example 1: Logical Switch packet Walk

In this example, let’s assume that C1-M1 is sending a frame to VM C1-M2, and that the following is true before the packet walk:

  • C1-M1 and C1-M2 are powered ON and connected to Universal logical switch 5555.
  • C1-M1 and C1-M2 are using the MAC addresses from their respective vmx files.
  • Logical switch 5555 is configured for MAC address learning
  • NSX Universal Controller NC-2 has been given responsibility for VNI 5555
  • C1-M1 knows the MAC address of C1-M2

Step 1. C1-M1 sends a frame with a source IP of C1-M1-IP, a destination IP of C1-M2-IP, a source MAC of C1-M1-MAC, and a destination MAC of C1-M2-MAC.

Step 2. Logical switch 5555 in ESXi host C1-H1 receives the frame from C1-M1 and captures the source MAC address, C1-M1-MAC.

Step 3. Because the source MAC address C1-M1-MAC is the MAC address in the vmx file of C1-M1, it is already present in the MAC table of logical switch 5555 in C1-H1. The logical switch then checks the destination MAC address of the frame.

Step 4. The destination MAC address C1-M2-MAC is the MAC address in the vmx file of VM C1-M2, so it is also already in the MAC table of logical switch 5555 in C1-H1.

Step 5. Logical switch 5555 in C1-H1 delivers the frame to C1-M2.

Example 2: Logical Switch packet Walk

In this example, let’s assume that C1-M1 is sending a frame to VM C1-M2, and that the following is true before the packet walk:

  • C1-M1 and C1-M2 are powered ON and connected to Universal logical switch 5555.
  • C1-M1 and C1-M2 are using MAC addresses that are not from their respective vmx files.
  • Logical switch 5555 is configured for MAC address learning
  • NSX Universal Controller NC-2 has been given responsibility for VNI 5555
  • C1-M1 knows the MAC address of C1-M2

Step 1. C1-M1 sends a frame with a source IP of C1-M1-IP, a destination IP of C1-M2-IP, a source MAC of C1-M1-MAC, and a destination MAC of C1-M2-MAC.

Step 2. Logical switch 5555 in ESXi host C1-H1 receives the frame from C1-M1 and captures the source MAC address, C1-M1-MAC.

  1. If the MAC address is not present in its MAC table, logical switch 5555 in C1-H1 adds the source MAC to its MAC table and also informs NSX Controller NC-2 if the Replication Mode configured for the logical switch is Unicast or Hybrid.
  2. If the MAC address is present in the MAC table of logical switch 5555 in C1-H1 but belongs to a different virtual machine on host C1-H1, the logical switch updates its MAC table and does not inform NC-2.
  3. If the MAC address is present in the MAC table of logical switch 5555 in C1-H1 but belongs to a virtual machine behind a VTEP other than C1-H1, the logical switch updates its MAC table and informs NC-2 if the Replication Mode for the logical switch is Unicast or Hybrid.

In each of these cases, a copy of the MAC address is also sent to the Switch Security module. If C1-M1 is using an 802.1Q tag, the VLAN number is also provided to the Switch Security module; otherwise, the VLAN number provided to the Switch Security module is 0.
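The three source-MAC cases in step 2 can be sketched as a small decision function. This is only an illustration of the logic described above, not NSX code: the class name, the tuple layout of the MAC table, and the `controller_reports` list (a stand-in for messages to NC-2) are all assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class LogicalSwitchSketch:
    """Hypothetical model of one host's view of logical switch 5555."""
    local_vtep: str
    replication_mode: str                               # "unicast", "hybrid" or "multicast"
    mac_table: dict = field(default_factory=dict)       # MAC -> (vtep, vm)
    controller_reports: list = field(default_factory=list)  # stand-in for updates sent to NC-2

    def learn_source(self, mac: str, vm: str) -> str:
        """Apply the source-MAC cases from step 2 and return the case that matched."""
        entry = self.mac_table.get(mac)
        if entry is None:
            case, report = "new", True                      # case 1
        elif entry[0] != self.local_vtep:
            case, report = "moved-from-remote-vtep", True   # case 3
        elif entry[1] != vm:
            case, report = "moved-on-same-host", False      # case 2
        else:
            case, report = "already-known", False
        self.mac_table[mac] = (self.local_vtep, vm)
        # The controller is only informed in Unicast or Hybrid mode.
        if report and self.replication_mode in ("unicast", "hybrid"):
            self.controller_reports.append(mac)
        return case
```

For instance, a switch created with `LogicalSwitchSketch("C1-H1-VTEP", "unicast")` reports C1-M1-MAC to the controller stand-in the first time it sees it, while the same call on a switch in Multicast mode updates only the local table.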

Step 3. Logical switch 5555 in C1-H1 captures the destination MAC address C1-M2-MAC from the frame.

If the destination MAC address is not present in the MAC table of logical switch 5555 in C1-H1, the logical switch sends a query to NC-2 for the destination MAC address, provided the Replication Mode for the logical switch is Unicast or Hybrid. If host C1-H1 does not receive a response from NC-2, if NC-2 is down, or if the Replication Mode is Multicast, the logical switch replicates the frame.

In this case, because C1-M2-MAC is local to C1-H1 (the host running C1-M2), NC-2 is not expected to have an entry for this MAC address.
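The destination lookup in step 3 amounts to "local table first, then the controller, then replicate". The sketch below captures that order under stated assumptions: `query_controller` is a hypothetical callable standing in for the query to NC-2, and it signals an unreachable controller by raising `ConnectionError` or `TimeoutError`.

```python
def resolve_destination(mac, mac_table, replication_mode, query_controller):
    """Return ("forward", entry) when the destination is known, else ("replicate", None).

    query_controller is a hypothetical stand-in for the query sent to NC-2;
    it returns an entry, or None when the controller has no answer.
    """
    if mac in mac_table:
        return "forward", mac_table[mac]
    # Only Unicast and Hybrid modes consult the controller.
    if replication_mode in ("unicast", "hybrid"):
        try:
            entry = query_controller(mac)
        except (ConnectionError, TimeoutError):
            entry = None  # controller down or unresponsive
        if entry is not None:
            return "forward", entry
    # Multicast mode, no answer, or controller unreachable: replicate the frame.
    return "replicate", None
```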

Step 4. After completing step 3, logical switch 5555 in C1-H1 forwards the frame to C1-M2.

  • After receiving the frame, C1-M2 replies to C1-M1 with source MAC address C1-M2-MAC, and logical switch 5555 in C1-H1 learns this MAC address as explained in the first case of step 2.

Example 3: Logical Switch packet Walk

In this example Virtual Machine C1-M3 sends a frame to Virtual Machine C2-M4. Now let’s assume the following to be true:

  • Both C1-M3 and C2-M4 are connected to Universal Logical Switch 5555.
  • C1-M3 and C2-M4 are using the MAC addresses from their vmx files.
  • Logical switch 5555 is configured with MAC learning.
  • NSX Universal Controller NC-2 is responsible for VNI 5555.
  • C1-M3 also knows the MAC address of C2-M4 VM.
  • C1-M3 and C2-M4 have communicated with each other recently (within about the last 200 seconds).

Step 1. C1-M3 sends a frame with the source IP C1-M3-IP, destination IP of C2-M4-IP, Source MAC of C1-M3-MAC, and destination MAC of C2-M4-MAC.

Step 2. Logical switch 5555 in ESXi host C1-H2 receives the frame from C1-M3 and reads the source MAC address, C1-M3-MAC.

The source MAC address C1-M3-MAC is the same MAC address in the vmx file of C1-M3; it is already known by logical switch 5555 in C1-H2.

Step 3. Logical switch 5555 in C1-H2 reads the destination MAC address. The destination MAC address C2-M4-MAC is in the MAC table because it has recently seen traffic coming from C2-M4.

The MAC table of logical switch 5555 in C1-H2 has the following entry:

  • VNI: 5555
  • Inner MAC: C2-M4-MAC
  • Outer MAC: FFFF.FFFF.FFFF
  • Outer IP: C2-H2-VTEP

Here the Outer MAC is all Fs, which means the default gateway’s MAC address is used when sending the traffic.

Step 4. Logical switch 5555 passes the frame from C1-M3 to the VXLAN module to create a VXLAN encapsulation.

Step 5. The VTEP in C1-H2 encapsulates the frame using the following information. Note that a new Frame Check Sequence (FCS) replaces the FCS from C1-M3.

  • VNI: 5555
  • Source UDP Port: Derived from the frame sent by C1-M3
  • Destination UDP Port: 8472 (the default port, this can be changed with an NSX API call)
  • Source IP: C1-H2-VTEP

If C1-H2 had multiple VTEPs, the VXLAN module would have used the IP of the VXLAN VMkernel port (VTEP) to which C1-M3 was pinned.

  • Destination IP: C2-H2-VTEP
  • Source MAC: C1-H2-MAC

If C1-H2 had multiple VTEPs, the VXLAN module would have used the MAC of the VXLAN VMkernel port (VTEP) to which C1-M3 was pinned.

  • Destination MAC: C1-DG-MAC

An Outer MAC of all Fs means the destination MAC will be the MAC of the default gateway in the VXLAN TCP/IP Stack.

  • 802.1Q VLAN: 10
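To make the encapsulation fields in step 5 more concrete, here is a minimal sketch of three pieces of it. The 8-byte header layout follows the VXLAN specification (RFC 7348). The source-port hash is an assumption for illustration only (the text says the port is derived from the inner frame but does not say how), and the function names and the `ALL_FS` spelling are hypothetical.

```python
import struct
import zlib

VXLAN_FLAG_VNI_VALID = 0x08          # "I" flag defined in RFC 7348
ALL_FS = "ffff.ffff.ffff"            # Outer MAC value meaning "use the default gateway"


def vxlan_header(vni: int) -> bytes:
    """Pack the 8-byte VXLAN header: flags, 24 reserved bits, 24-bit VNI, 8 reserved bits."""
    if not 0 <= vni < 2 ** 24:
        raise ValueError("VNI must fit in 24 bits")
    return struct.pack("!II", VXLAN_FLAG_VNI_VALID << 24, vni << 8)


def source_udp_port(inner_frame: bytes) -> int:
    """Derive a source port from the inner frame.

    Illustrative hash only, not the ESXi algorithm; it keeps the result in
    the dynamic port range 49152-65535.
    """
    return 49152 + zlib.crc32(inner_frame) % 16384


def outer_destination_mac(outer_mac_entry: str, gateway_mac: str) -> str:
    """An Outer MAC of all Fs means the frame is sent to the default gateway's MAC."""
    return gateway_mac if outer_mac_entry == ALL_FS else outer_mac_entry
```

With these helpers, `vxlan_header(5555)` yields the header for this walk, and `outer_destination_mac(ALL_FS, "C1-DG-MAC")` resolves to C1-DG-MAC, matching the Destination MAC listed above.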

Step 6. The underlay switch C1-SW receives the VXLAN frame, examines the VXLAN Layer 2 header, and forwards it to the default router over interface C1-DG.

C1-SW conducts regular Ethernet switch processing, such as MAC learning and CoS enforcement, on the VXLAN Layer 2 header.

Note:  The interface from C1-H2 connecting to C1-SW is configured as a trunk allowing VLAN 10. The default gateway interfaces are not configured as trunks, so their switch ports are set up as access ports, and switch C1-SW removes the VLAN tag from the VXLAN frame before sending it to the default gateway.

Step 7. The default gateway receives the frame over interface C1-DG, processes the VXLAN Layer 3 header, does CoS enforcement, and routes the packet over interface C2-DG.

If the default gateway is performing a firewall function, it may also inspect the VXLAN Layer 4 header. The default gateway rewrites the VXLAN Layer 2 header with these new values:

  • Source MAC: C2-DG-MAC
  • Destination MAC: C2-H2-MAC

Step 8. The underlay switch C2-SW receives the VXLAN frame from the default gateway, examines the VXLAN Layer 2 header, and forwards it to C2-H2.

Note: C2-DG is connected to an access port in switch C2-SW in VLAN 20. All frames that arrive from interface C2-DG are placed in VLAN 20. The interface from C2-H2 connecting to C2-SW is configured as a trunk allowing VLAN 20.

Step 9. C2-H2 receives the frame on its VXLAN VMkernel port, which has MAC address C2-H2-MAC.

The VXLAN module in C2-H2 reads the VNI, 5555, in the VXLAN frame, decapsulates the VXLAN frame, and passes the frame from C1-M3 to logical switch 5555 in C2-H2 for processing.
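The receive side of step 9 can be sketched as reading the VNI from the 8-byte VXLAN header (layout per RFC 7348) and stripping that header to recover the original frame. The function names are hypothetical, and returning `None` for a VNI the host does not serve is an assumption made for the example, not behavior stated in the text.

```python
import struct


def read_vni(udp_payload: bytes) -> int:
    """Extract the 24-bit VNI from the start of a VXLAN UDP payload."""
    if len(udp_payload) < 8:
        raise ValueError("too short to contain a VXLAN header")
    flags_word, vni_word = struct.unpack("!II", udp_payload[:8])
    if not (flags_word >> 24) & 0x08:
        raise ValueError("VNI-valid flag is not set")
    return vni_word >> 8


def decapsulate(udp_payload: bytes, local_vnis: set):
    """Return (vni, inner_frame) for a VNI this host serves, else None (assumed behavior)."""
    vni = read_vni(udp_payload)
    if vni not in local_vnis:
        return None
    return vni, udp_payload[8:]
```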

Step 10. Logical switch 5555 in C2-H2 reads the source MAC address, C1-M3-MAC.

Logical switch 5555 in C2-H2 already has MAC address C1-M3-MAC in its MAC table because it has recently seen traffic coming from C1-M3.

Step 11. Logical switch 5555 in C2-H2 then reads the destination MAC address, C2-M4-MAC, sees that it knows which virtual machine owns it, and passes the frame to Virtual Machine C2-M4.

Logical switch 5555 in C2-H2 knows MAC address C2-M4-MAC because C2-M4-MAC is in the vmx file of C2-M4.

Example 4: Logical Switch packet Walk

Virtual Machine C2-M5 wants to communicate with Virtual Machine C3-M3. Assume the following to be true:

  • C2-M5 and C3-M3 are connected to Universal Logical Switch 5555.
  • C2-M5 and C3-M3 are using the MAC addresses in their vmx files.
  • Logical Switch 5555 is configured with MAC learning.
  • NSX Universal Controller NC-2 is responsible for VNI 5555.
  • C2-M5 knows the IP address of C3-M3 but not the MAC address.

Step 1. Virtual Machine C2-M5 sends an ARP request with a sender IP of C2-M5-IP, a target IP of C3-M3-IP, a source MAC of C2-M5-MAC, a destination MAC of all Fs (Ethernet broadcast), and an Ethertype of 0x0806 (ARP).

Step 2. The Switch Security module in ESXi host C2-H3 inspects the frame after realizing it is an ARP request and checks its ARP table for VNI 5555.

If the Switch Security module in C2-H3 has an entry for the ARP request in its ARP table, it will directly respond to C2-M5 and that would be the end of this packet walk. Instead, let’s assume the Switch Security module does not have an entry in its ARP table for the ARP request.

Step 3. The Switch Security module in C2-H3 sends a request to NSX Controller NC-2 for the ARP entry.

If NC-2 has an entry, it will reply back to the Switch Security module in C2-H3 with the entry. The Switch Security module in C2-H3 will add the entry to its ARP table, and directly respond to C2-M5. Again this would be the end of our packet walk. Instead, let’s assume that either:

  • NC-2 does not have an ARP entry for our ARP request and responds back to C2-H3 with FFFF.FFFF.FFFF, which translates in English to “I don’t have an entry for IP C3-M3-IP.”
  • NC-2 is down or unresponsive.
  • Replication Mode is set to Multicast.
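Steps 2 and 3 describe a lookup chain: the local ARP table, then the controller, then flooding. A minimal sketch of that chain follows, assuming a hypothetical `query_controller` callable in place of the request to NC-2; the `NO_ENTRY` constant mirrors the FFFF.FFFF.FFFF answer mentioned above.

```python
NO_ENTRY = "FFFF.FFFF.FFFF"  # controller's "I don't have an entry" answer


def handle_arp_request(target_ip, local_arp_table, replication_mode, query_controller):
    """Return ("reply", mac) if the ARP request can be answered locally, else ("flood", None).

    query_controller is a hypothetical stand-in for the request sent to NC-2.
    """
    mac = local_arp_table.get(target_ip)
    if mac is not None:
        return "reply", mac
    # In Multicast mode the controller is not consulted.
    if replication_mode != "multicast":
        try:
            answer = query_controller(target_ip)
        except (ConnectionError, TimeoutError):
            answer = None  # NC-2 is down or unresponsive
        if answer not in (None, NO_ENTRY):
            local_arp_table[target_ip] = answer  # cache the entry, then answer the VM
            return "reply", answer
    return "flood", None
```

Any "flood" result corresponds to the rest of this packet walk, where the ARP request is handed to the logical switch and replicated.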

Step 4. The Switch Security module in ESXi host C2-H3 hands the frame to logical switch 5555.

Step 5. Logical switch 5555 in ESXi host C2-H3 forwards a copy of the ARP request to all local virtual machines, except C2-M5.

Step 6. Logical switch 5555 in C2-H3 hands the frame to the VXLAN module to replicate the ARP request.

Note: If not using Multicast Replication Mode, the VTEP consults its copy of the VTEP table to determine where to send the replicated frames. In our case, where all VMs are powered on, all ESXi hosts are in the VTEP table.

  • If using Multicast Replication Mode, a single VXLAN frame is sent out by C2-H3 with a destination IP of the multicast group assigned to VNI 5555, as shown in the accompanying figure. In this example, all ESXi hosts have joined the multicast group and thus receive the multicast frame.

  • If using Unicast Replication Mode, ESXi host C2-H3 sends out unicast VXLAN frames: one each to C2-H1 and C2-H2 in the local VXLAN subnet, and one each to the proxy VTEPs C1-H1 and C3-H1, as shown in the accompanying figure. The proxy VTEPs are chosen locally by C2-H3, one per remote VTEP subnet. We are going to assume that C2-H3 chose C1-H1 and C3-H1 as proxy VTEPs.

The unicast frames sent to the proxy VTEPs have the Replication bit set to 1. Because C1-H1 and C3-H1 are UTEPs, each one in turn sends unicast VXLAN frames to its local VTEPs, with the Replication bit set to 0.

C1-H1 sends the unicast VXLAN frame to C1-H2 and C1-H3.

  • If using Hybrid Replication Mode, a single multicast VXLAN frame is sent out by C2-H3 with a destination IP of the multicast group assigned to VNI 5555 and the TTL set to 1, and two unicast VXLAN frames are sent out by C2-H3, one to each of the proxy VTEPs C1-H1 and C3-H1.
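The three replication modes differ only in which copies the source VTEP sends. The sketch below computes that list for the topology in this walk. Several things are assumptions for illustration: the function name, the VTEP table being keyed by subnet name, picking the first VTEP of each remote subnet as its proxy, the multicast group address, and setting the Replication bit on the Hybrid-mode proxy copies (the text states it only for Unicast mode).

```python
def replication_targets(source, vtep_table, mode, multicast_group):
    """List the copies the source VTEP sends as (destination, kind, replication_bit).

    vtep_table maps a VTEP subnet name to the VTEPs in it (hypothetical layout).
    """
    if mode == "multicast":
        return [(multicast_group, "multicast", 0)]

    local_subnet = next(s for s, vteps in vtep_table.items() if source in vteps)
    copies = []
    if mode == "hybrid":
        # One multicast frame with TTL 1 covers the local subnet.
        copies.append((multicast_group, "multicast-ttl-1", 0))
    else:
        # Unicast mode: one copy per local VTEP.
        copies += [(v, "unicast", 0) for v in vtep_table[local_subnet] if v != source]
    # One copy per remote subnet, sent to that subnet's proxy VTEP.
    for subnet, vteps in vtep_table.items():
        if subnet != local_subnet and vteps:
            copies.append((vteps[0], "unicast", 1))
    return copies


VTEPS = {
    "cluster-1": ["C1-H1", "C1-H2", "C1-H3"],
    "cluster-2": ["C2-H1", "C2-H2", "C2-H3"],
    "cluster-3": ["C3-H1", "C3-H2", "C3-H3"],
}
unicast_copies = replication_targets("C2-H3", VTEPS, "unicast", "239.0.0.55")
```

In Unicast mode this produces four copies from C2-H3: two to its local neighbors C2-H1 and C2-H2, and two with the Replication bit set to the proxies C1-H1 and C3-H1, which is the set of frames described in the Unicast bullet.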

Step 7. All VTEPs that receive the replicated VXLAN frame have to process it, read the VNI, 5555, in the VXLAN frame, decapsulate the frame, and broadcast the ARP request to all running VMs in logical switch 5555.

In the process, the logical switch for VNI 5555 in every ESXi host learns that MAC address C2-M5-MAC is behind C2-H3-VTEP, adds it to its MAC table, and sets the dead timer to about 200 seconds.

Step 8. C3-M3 receives the ARP request and responds with an ARP reply. The ARP reply has a destination MAC address of C2-M5-MAC.

Step 9. The Switch Security module in C3-H2 inspects the frame from C3-M3, realizes it is an ARP reply, and adds the entry to its ARP table.

Step 10. Because C3-M3 is running in C3-H2, the Switch Security module for logical switch 5555 in C3-H2 sends an IP report to NC-2 with the new ARP entry so that NC-2 can also add it to its ARP table.

Step 11. The Switch Security module passes on the ARP reply to logical switch 5555 in C3-H2.

Step 12. Logical switch 5555 in C3-H2 reads the destination MAC address of the ARP reply, C2-M5-MAC, looks in its MAC table, and finds an entry for it pointing to C2-H3-VTEP.

Reread step 7 above if you don’t quite see why the entry is in the MAC table.

Step 13. Logical switch 5555 in C3-H2 passes the frame to the VXLAN module for VXLAN frame creation.

This would be a unicast VXLAN frame with a destination IP of C2-H3-VTEP and destination MAC of C3-DG-MAC.

Step 14. C2-H3 receives the VXLAN frame, processes the frame by reading the VNI number, 5555, and decapsulates it.

Step 15. Logical switch 5555 reads the source MAC address of the ARP reply and adds it to its MAC table pointing towards C3-H2-VTEP, with a dead timer of about 200 seconds.
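The learned entry and its dead timer can be modeled as a table whose entries expire. This is a sketch under assumptions: the class name is hypothetical, the 200-second value is the approximate figure quoted in this walk, and expiring entries lazily at lookup time is a simplification chosen for the example.

```python
import time

AGING_SECONDS = 200  # approximate dead timer quoted in the text


class RemoteMacTable:
    """Hypothetical MAC table for one VNI, mapping remote MACs to VTEPs with aging."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._entries = {}  # MAC -> (vtep, expiry time)

    def learn(self, mac: str, vtep: str) -> None:
        """Add or refresh an entry and restart its dead timer."""
        self._entries[mac] = (vtep, self._clock() + AGING_SECONDS)

    def lookup(self, mac: str):
        """Return the VTEP for a MAC, or None if it is unknown or has aged out."""
        entry = self._entries.get(mac)
        if entry is None:
            return None
        vtep, expires_at = entry
        if self._clock() >= expires_at:
            del self._entries[mac]
            return None
        return vtep
```

Calling `learn("C3-M3-MAC", "C3-H2-VTEP")` corresponds to step 15; a later `lookup` returns C3-H2-VTEP until the timer runs out without the entry being refreshed.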

Step 16. Logical switch 5555 then reads the destination MAC address of the ARP reply, C2-M5-MAC, looks it up in the MAC table, and forwards the frame to C2-M5.

Step 17. The Switch Security module in C2-H3 intercepts the frame before it reaches C2-M5, notices it is an ARP reply, and adds the entry to its ARP table. Then the frame is forwarded to C2-M5.

Example 5: Logical Switch packet Walk

C2-M5 will vMotion to ESXi host C2-H1.

Step 1. The vSphere administrator, or DRS, initiates a vMotion for Virtual Machine C2-M5.

Note:  This is where the Switch Security module plays its role by informing the vMotion destination host, C2-H1, about the MAC addresses that C2-M5 has.

Step 2. When vMotion is completed, ESXi host C2-H3 updates the NSX Controller that it no longer has the MAC address C2-M5-MAC, as shown below.

If C2-M5 was the last powered-on VM in logical switch 5555 in host C2-H3, C2-H3 also sends the NSX Controller a request to remove its VTEP, C2-H3-VTEP, from VNI 5555’s VTEP table. At this point, the NSX Controller updates the VTEP table, removing C2-H3-VTEP, and sends a copy of the updated VTEP table to all other hosts that have a VTEP in the VTEP table.

Step 3. ESXi host C2-H1 updates the NSX Controller that it has the MAC address C2-M5-MAC, as shown in the figure below, as well as all other MAC addresses associated with C2-M5 (which the Switch Security module in C2-H3 told C2-H1 about).

Step 4. Host C2-H1 sends a RARP on behalf of C2-M5, for all MAC addresses associated with C2-M5.

The RARP is used to update the MAC table of switches. 
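For readers who want to see what such a frame looks like on the wire, here is a sketch that builds a RARP request following the standard RARP layout (RFC 903): Ethertype 0x8035, opcode 3, sent to the Ethernet broadcast address. The text does not spell out the exact contents ESXi uses, so treat the field values as assumptions; the function name and the example MAC address are made up.

```python
import struct

ETHERTYPE_RARP = 0x8035
RARP_OP_REQUEST = 3          # "request reverse" opcode
BROADCAST_MAC = b"\xff" * 6


def rarp_frame(vm_mac: bytes) -> bytes:
    """Build a broadcast RARP request whose source is the VM's MAC address."""
    if len(vm_mac) != 6:
        raise ValueError("a MAC address is 6 bytes")
    ethernet = BROADCAST_MAC + vm_mac + struct.pack("!H", ETHERTYPE_RARP)
    # Hardware type 1 (Ethernet), protocol 0x0800 (IPv4), address lengths 6 and 4.
    body = struct.pack("!HHBBH", 1, 0x0800, 6, 4, RARP_OP_REQUEST)
    # Sender and target hardware address are both the VM's MAC; IP fields are zero.
    body += vm_mac + b"\x00" * 4 + vm_mac + b"\x00" * 4
    return ethernet + body


example = rarp_frame(bytes.fromhex("005056aabbcc"))  # made-up MAC for illustration
```

What matters for the switches is the Ethernet source address: every switch that sees the broadcast relearns where that MAC now lives.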

Step 5. Following the Replication Mode configured in the logical switch, the RARP is replicated to all ESXi hosts in the VTEP table or belonging to the multicast group for VNI 5555.

Review step 6 in the Logical Switch Packet Walk Example 4.

Step 6. All ESXi hosts receiving the RARP add an entry in their local MAC table for VNI 5555 for MAC C2-M5-MAC, including the local MAC table on host C2-H3, the vMotion source host.

The MAC entry is added in VNI 5555’s MAC tables of each logical switch in the ESXi host, with a dead timer of five minutes.

