Data Center

VXLAN And EVPN

Introduction

Why do we need a new extension for VLAN?

  • VLANs rely on Spanning Tree Protocol (STP) for loop prevention, which blocks redundant paths and can leave up to half of the network links unused. In contrast, VXLAN packets are forwarded through the underlying network based on their Layer 3 header and take full advantage of Layer 3 routing, ECMP and link aggregation protocols to use all available paths.
  • VLANs have run in the data center for a number of years, but with the rapid growth of virtualisation, on-demand VMs and an increasing number of customers, 4K VLANs are no longer sufficient.
  • STP also brings limitations such as poor link/path utilization, convergence issues, MAC table size constraints and under-utilized network links.
  • VXLAN is a MAC-in-UDP encapsulation method used for extending a Layer 2 or Layer 3 overlay network over an existing Layer 3 infrastructure.
  • The VXLAN encapsulation provides a VNI (VXLAN Network Identifier) that can be used to segment Layer 2 and Layer 3 data traffic.
  • To facilitate the discovery of these VNIs over the underlay Layer 3 network, VXLAN tunnel endpoints (VTEPs) are used. A VTEP is the entity that originates and terminates VXLAN tunnels.
  • The VTEP maps Layer 2 frames to a VNI for use in the overlay network, so that customer Layer 2 and Layer 3 traffic is carried in a VNI over the physical Layer 3 network.

Why choose EVPN over VPLS?

– More efficient use of the BGP table.
– More complete control over your information, including the distribution of MAC addresses.
– A much simpler solution than VPLS.
EVPN (Ethernet VPN) is a next-generation VPLS.

VPLS Highlights

  • In VPLS, customer MAC addresses are learned through the data plane.
  • Source MAC addresses are learned from frames received on both the AC (Attachment Circuit) and the PW (pseudowire).
  • In VPLS, active-active flow-based load balancing is not possible.
  • A customer can be dual-homed to the same PE or to different PEs of the service provider, but the links can only be used as active/standby for all VLANs, or with VLAN-based load balancing.

EVPN Highlights

  • EVPN supports active-active, flow-based load balancing, so the same VLAN can be active on both PE devices.
  • This provides faster convergence in customer link, PE link, or node failure scenarios.
  • Customer MAC addresses are advertised over the MP-BGP control plane. There is no data plane MAC learning over the core network in EVPN.
  • Customer MAC addresses from the AC, however, are still learned through the data plane.

What is VXLAN

  • As the name VXLAN (Virtual Extensible LAN) implies, the technology is meant to provide the same services to connected Ethernet end systems that VLANs do today, but in a more extensible manner.
  • Compared to VLANs, VXLANs are extensible with regard to scale and with regard to the reach of their deployment.
  • The 802.1Q VLAN identifier space is only 12 bits (4096 IDs). The VXLAN identifier space is 24 bits. Doubling the field width increases the ID space by more than 400,000 percent, to over 16 million unique identifiers.
  • VXLAN uses IP (both unicast and multicast) as the transport medium. The ubiquity of IP networks and equipment allows the end-to-end reach of a VXLAN segment to be extended far beyond the typical reach of VLANs using 802.1Q today.
  • There is no denying that other technologies can extend the reach of VLANs, but none is as ubiquitously deployed as IP.
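
The scale comparison above can be checked with a few lines of arithmetic. This is only a sketch; the variable names are illustrative and not taken from any VXLAN implementation.

```python
# Identifier widths as stated in the text.
VLAN_ID_BITS = 12     # 802.1Q VLAN ID
VXLAN_VNI_BITS = 24   # VXLAN network identifier

vlan_ids = 2 ** VLAN_ID_BITS
vxlan_ids = 2 ** VXLAN_VNI_BITS

# How many times larger the VXLAN ID space is, and the percentage increase.
growth_factor = vxlan_ids // vlan_ids
percent_increase = (vxlan_ids - vlan_ids) / vlan_ids * 100

print(f"VLAN IDs:  {vlan_ids}")
print(f"VXLAN IDs: {vxlan_ids}")
print(f"Growth:    {growth_factor}x ({percent_increase:.0f} percent increase)")
```

Doubling the number of bits squares the size of the ID space, which is where the figure of more than 400,000 percent comes from.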

Terminology:

EVI : An EVPN instance spanning the PEs participating in that EVPN.
ESI (Ethernet Segment Identifier): When a CE is multi-homed to two or more PEs, the set of Ethernet links attaching the CE to those PEs is an Ethernet segment. An Ethernet segment must have a unique non-zero identifier, the Ethernet Segment Identifier.
Ethernet Tag : An Ethernet tag identifies a particular broadcast domain, e.g. a VLAN. An EVPN instance consists of one or more broadcast domains. Ethernet tags are assigned to the broadcast domains of a given EVPN instance by the provider of that EVPN. Each PE in that EVPN instance performs a mapping between the broadcast domain identifiers understood by each of its attached CEs and the corresponding Ethernet tag.

VTEP (VXLAN Tunnel End Point)

  • VXLAN uses VXLAN tunnel endpoint (VTEP) devices to map tenants’ end devices to VXLAN segments and to perform VXLAN encapsulation and de-encapsulation.
  • Each VTEP function has two interfaces:
    • A switch interface on the local LAN segment, to support local endpoint communication through bridging.
    • An IP interface to the transport IP network.
  • The IP interface has a unique IP address that identifies the VTEP device on the transport IP network, known as the infrastructure VLAN.
  • The VTEP device uses this IP address to encapsulate Ethernet frames and transmits the encapsulated packets to the transport network through the IP interface.
  • A VTEP device also discovers the remote VTEPs for its VXLAN segments and learns remote MAC address to VTEP mappings through the IP interface.

In summary, a VTEP has two interfaces :

  • Local LAN interface (provides a bridging function for local hosts connected to the VTEP)
  • IP interface (the interface on the core network for VXLAN. The IP address on the IP interface uniquely identifies the VTEP in the network)

IMAGE 1

Note :

  1. IP intra-subnet or non-IP Layer 2 traffic is mapped to a VNI that is set aside for a VLAN or bridge domain.
  2. Routed traffic, on the other hand, is mapped to a VNI that is set aside for a Layer 3 VRF.
  3. Because of the Layer 3 underlay network, VXLAN is capable of performing ECMP, link aggregation and other L3 functionality.
  4. Spanning Tree Protocol is no longer required, so there are no blocked paths leaving the network under-utilised.
  5. VXLAN provides a multi-tenant solution where network traffic is isolated per tenant and the same VLAN can be used by different tenants.

VXLAN ENCAPSULATION AND PACKET FORMAT :

  • A VXLAN packet is nothing more than a MAC-in-UDP encapsulated packet. The VXLAN header is added to the original Layer 2 frame, which is then placed in a UDP/IP packet.
  • The VXLAN header is an 8-byte header that consists of a 24-bit VXLAN Network Identifier (VNID) and a few reserved bits.
  • The VNI uniquely identifies the Layer 2 segment and helps maintain isolation between segments. Because the VNID is 24 bits long, VXLAN can support 16 million LAN segments.

IMAGE 2

Flags : 8 bits in length. The fifth bit (I flag) is set to 1 to indicate a valid VNI. The remaining 7 bits (R bits) are reserved and are set to zero.
VNI : 24-bit value that provides a unique identifier for the individual VXLAN segment.
OUTER UDP HEADER : The source port in the outer UDP header is dynamically assigned by the originating VTEP and is calculated from a hash of the inner Layer 2/Layer 3/Layer 4 headers of the original frame. The destination port is the assigned UDP port 4789 or a customer-configured port.
SRC PORT : Dynamically assigned by the originating VTEP
DST PORT : 4789
OUTER IP HEADER : The source IP address in the outer IP header is the IP address of the originating VTEP’s IP interface. The IP address on the IP interface uniquely identifies a VTEP. The destination address of the outer IP header is the IP address of the destination VTEP’s IP interface.
SRC IP : IP address of the originating VTEP’s IP interface
DST IP : IP address of the destination VTEP’s IP interface
OUTER ETHERNET / MAC HEADER : The source MAC address is the source VTEP MAC address. The destination MAC address is the next-hop MAC address. The next hop is the interface used to reach the destination or remote VTEP.
SRC MAC : Source VTEP router MAC
DST MAC : MAC address of the next-hop interface used to reach the destination or remote VTEP
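
The outer UDP source port described above can be sketched as a hash over the inner headers. The hash function (CRC32) and the function name are assumptions for illustration; real VTEPs use vendor-specific hashing. The 49152-65535 range is the dynamic/private port range that RFC 7348 recommends for the source port.

```python
import zlib

VXLAN_DST_PORT = 4789  # IANA-assigned VXLAN destination port


def vxlan_source_port(inner_headers: bytes, low: int = 49152, high: int = 65535) -> int:
    """Derive an outer UDP source port from the inner frame headers.

    CRC32 is a stand-in hash chosen for this sketch only.
    """
    digest = zlib.crc32(inner_headers)
    return low + digest % (high - low + 1)


# Two hypothetical inner flows, represented here by arbitrary header bytes.
flow_a = b"MAC-A>MAC-B|10.0.0.1>10.0.0.2|tcp:33000>443"
flow_b = b"MAC-A>MAC-B|10.0.0.1>10.0.0.2|tcp:33001>443"

print(vxlan_source_port(flow_a), vxlan_source_port(flow_b))
```

Because the port is derived from the inner headers, packets of one flow keep the same outer 5-tuple, while different flows are likely to be spread across the ECMP paths of the underlay.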

Screen Shot 2017-08-10 at 12.34.36 AM.png

  • VXLAN is a Layer 2 overlay scheme over a Layer 3 network.
  • It uses MAC-in-UDP encapsulation to extend Layer 2 segments across the data centre network.
  • VXLAN is a solution for supporting a flexible, large-scale multi-tenant environment over a shared common physical infrastructure.
  • The transport protocol over the physical data center network is IP + UDP.
  • VXLAN defines a MAC-in-UDP encapsulation scheme where the original Layer 2 frame has a VXLAN header added and is then placed in a UDP/IP packet.
  • With this MAC-in-UDP encapsulation, VXLAN tunnels a Layer 2 network over a Layer 3 network.
  • VXLAN introduces an 8-byte VXLAN header that consists of a 24-bit VNID and a few reserved bits.
  • The VXLAN header together with the original Ethernet frame goes in the UDP payload.
  • The 24-bit VNID is used to identify Layer 2 segments and to maintain Layer 2 isolation between the segments.
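
As a minimal sketch of the 8-byte header described above, the following packs and parses a VXLAN header using the layout from RFC 7348: 8 flag bits (with the I flag as 0x08), 24 reserved bits, the 24-bit VNI and 8 more reserved bits. The function names are illustrative.

```python
import struct

VXLAN_FLAG_I = 0x08  # "valid VNI" flag within the 8-bit flags field


def pack_vxlan_header(vni: int) -> bytes:
    """Build the 8-byte VXLAN header for the given VNI."""
    if not 0 <= vni < 2 ** 24:
        raise ValueError("VNI must fit in 24 bits")
    # Word 1: flags in the top byte, 24 reserved bits set to zero.
    # Word 2: VNI in the top 24 bits, 8 reserved bits set to zero.
    return struct.pack("!II", VXLAN_FLAG_I << 24, vni << 8)


def unpack_vxlan_header(header: bytes) -> int:
    """Return the VNI from a VXLAN header, checking the I flag."""
    word1, word2 = struct.unpack("!II", header[:8])
    if not (word1 >> 24) & VXLAN_FLAG_I:
        raise ValueError("I flag not set, VNI is not valid")
    return word2 >> 8


def encapsulate(vni: int, inner_frame: bytes) -> bytes:
    """Prepend the VXLAN header; the result is the outer UDP payload."""
    return pack_vxlan_header(vni) + inner_frame


header = pack_vxlan_header(10)
print(header.hex(), unpack_vxlan_header(header))
```

The outer UDP, IP and Ethernet headers are left out here; in practice they are added by the VTEP's IP stack or forwarding hardware.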

VxLAN GATEWAY TYPES :

Frame encapsulation and decapsulation is performed by the VTEP. A VTEP originates and terminates VxLAN tunnels.

A VxLAN gateway bridges traffic between a VxLAN segment and another physical or logical Layer 2 domain (such as a VLAN). There are two kinds of VxLAN gateways.

Layer 2 Gateway : The Layer 2 gateway is required when Layer 2 traffic comes from a VLAN into a VxLAN segment (encapsulation), or when a VxLAN packet egresses out of an 802.1Q tagged interface (decapsulation), where the packet is bridged to a new VLAN.

Layer 3 Gateway : A Layer 3 gateway is used for VxLAN-to-VxLAN routing, that is, when the egress VxLAN packet is routed to a new VxLAN segment.

A Layer 3 gateway is also used for VxLAN-to-VLAN routing, that is, when the ingress packet is a VxLAN packet on a routed segment but the packet egresses out of a tagged 802.1Q interface and is routed to a new VLAN.

VxLAN MTU :

  • VxLAN adds 50 bytes to the original Ethernet frame.
  • A VTEP must not fragment VxLAN packets.
  • Intermediate routers may fragment encapsulated VxLAN packets because of the larger frame size.
  • The destination VTEP may silently discard such VxLAN fragments.

To ensure end-to-end traffic delivery without fragmentation it is recommended that the MTU (Maximum Transmission Unit) across the physical network infrastructure be set to a value that accommodates the large frame size due to the encapsulation.
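
The 50-byte figure can be broken down per header. The sketch below assumes an IPv4 underlay and no 802.1Q tag on the outer frame; with an outer VLAN tag or an IPv6 underlay the overhead is larger.

```python
# Per-header overhead added in front of the original Ethernet frame
# (assumptions: untagged outer Ethernet header, IPv4 outer header without options).
OUTER_ETHERNET = 14
OUTER_IPV4 = 20
OUTER_UDP = 8
VXLAN_HEADER = 8

VXLAN_OVERHEAD = OUTER_ETHERNET + OUTER_IPV4 + OUTER_UDP + VXLAN_HEADER


def required_underlay_mtu(host_mtu: int = 1500) -> int:
    """Smallest underlay MTU that carries a full-size host packet unfragmented."""
    return host_mtu + VXLAN_OVERHEAD


print(VXLAN_OVERHEAD, required_underlay_mtu(1500))
```

In practice many deployments simply enable jumbo frames on the underlay rather than sizing the MTU exactly.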

VXLAN Overview

The VxLAN overlay mechanism requires that the VTEPs peer with each other so that data can be forwarded to the relevant destination. The following mechanisms are available:

  • Flood and Learn
  • BGP EVPN
  • Ingress Replication

VxLAN FLOOD AND LEARN Mechanism :

  1. This is a data plane learning technique for VxLAN, where a VNI is mapped to a multicast group on a VTEP.
  2. There is no control or signaling protocol defined. Multidestination traffic is emulated through the VxLAN IP underlay by using a per-segment control multicast group.
  3. Initial host traffic is always in Broadcast/Unknown Unicast/Multicast (BUM) format.
  4. BUM traffic is flooded to the multicast delivery group for the VNI that is sourcing the host packet.
  5. The remote VTEPs that are part of the multicast group learn the remote host MAC, the VNI and the source VTEP IP information from the flooded traffic.
  6. Unicast packets to the host MAC are sent directly to the destination VTEP as VxLAN packets.

Note : Local MAC addresses are learned over a VLAN (VNI) on a VTEP.

Screen Shot 2017-08-10 at 12.36.18 AM.png

Packet Flow in Flood and Learn :

Step 1 : End System A, with MAC-A and IP A, sends an ARP request for the host with IP B.
The source MAC address of the ARP packet is MAC-A and the destination MAC address is FF:FF:FF:FF:FF:FF. Suppose the host is in VLAN 10. This packet is sent towards VTEP 1, which has VNID 10 mapped to VLAN 10.

Step 2 : When the ARP request is received at VTEP-1, the packet is encapsulated and forwarded to the remote VTEP-2 and VTEP-3 as a VxLAN packet with source address 192.168.1.1 and destination address 239.1.1.1. When the encapsulation is done, the VNID is set to 10, the source MAC of the packet is MAC 1, and the destination MAC is 0100.5E01.0101, which is the multicast MAC address for 239.1.1.1.

NOTE : Only those VTEPs that have subscribed to that multicast group receive the multicast packet. The multicast group is configured to map to the VNI on each VTEP.
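
The multicast MAC address for a group such as 239.1.1.1 follows from the standard IPv4 multicast mapping: the fixed prefix 01:00:5e followed by the low-order 23 bits of the group address. A small sketch, with an illustrative function name:

```python
import ipaddress


def multicast_mac(group: str) -> str:
    """Map an IPv4 multicast group to its Ethernet multicast MAC address."""
    addr = ipaddress.IPv4Address(group)
    if not addr.is_multicast:
        raise ValueError(f"{group} is not an IPv4 multicast address")
    low23 = int(addr) & 0x7FFFFF          # keep the low-order 23 bits
    mac = 0x01005E000000 | low23          # prepend the 01:00:5e prefix
    return ":".join(f"{b:02x}" for b in mac.to_bytes(6, "big"))


print(multicast_mac("239.1.1.1"))
```

Since only 23 of the 28 group bits are copied, several groups share one MAC address; 239.129.1.1, for example, maps to the same address as 239.1.1.1.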

Step 3 : Both VTEP 2 and VTEP 3 receive the VxLAN packet, decapsulate it and forward it to the end systems connected to the respective VTEPs.

Note : VTEP 2 and VTEP 3 update their MAC address tables with the following information.

MAC address : MAC A
VxLAN ID  : 10
Remote VTEP : 192.168.1.1
Now VTEP 2 and VTEP 3 know where to reach MAC-A.

Step 4 : After the ARP packet is decapsulated and forwarded to Host B, Host B responds with an ARP reply.

Step 5 : When the ARP reply reaches VTEP-2, VTEP-2 already knows that it must go through VTEP-1 to reach MAC-A. Thus VTEP-2 forwards the ARP reply from Host B as a unicast VxLAN packet.

Step 6 : When the VxLAN packet reaches VTEP-1, it then updates its MAC address table with the following information.

MAC Address : MAC B
VxLAN ID : 10
Remote VTEP : 192.168.2.2

Step 7 : After the MAC table is updated on VTEP-1, the ARP reply is forwarded to Host-A.
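
The seven steps above can be sketched as a toy model of the VTEP MAC tables. The class and method names are invented for illustration, and the IP address of VTEP-3 is an assumption because the walkthrough does not give it.

```python
from dataclasses import dataclass, field

BROADCAST = "ff:ff:ff:ff:ff:ff"
GROUP = "239.1.1.1"   # multicast group mapped to the VNI
VNI = 10


@dataclass
class Vtep:
    name: str
    ip: str
    # (vni, host MAC) -> remote VTEP IP, learned from the outer source IP
    remote_macs: dict = field(default_factory=dict)

    def learn(self, vni: int, src_mac: str, outer_src_ip: str) -> None:
        """Data plane learning from a received VXLAN packet."""
        self.remote_macs[(vni, src_mac)] = outer_src_ip

    def outer_destination(self, vni: int, dst_mac: str) -> str:
        """Known MACs go unicast to their VTEP; everything else is flooded."""
        return self.remote_macs.get((vni, dst_mac), GROUP)


vtep1 = Vtep("VTEP-1", "192.168.1.1")
vtep2 = Vtep("VTEP-2", "192.168.2.2")
vtep3 = Vtep("VTEP-3", "192.168.3.3")  # address assumed for this sketch

# Steps 1-3: the ARP request from MAC-A is a broadcast, so it is flooded.
request_dst = vtep1.outer_destination(VNI, BROADCAST)
for receiver in (vtep2, vtep3):
    receiver.learn(VNI, "MAC-A", vtep1.ip)

# Steps 4-6: the ARP reply to MAC-A is unicast to VTEP-1, which learns MAC-B.
reply_dst = vtep2.outer_destination(VNI, "MAC-A")
vtep1.learn(VNI, "MAC-B", vtep2.ip)

print(request_dst, reply_dst, vtep1.remote_macs)
```

The model ignores local MAC learning and ageing; it only shows how the flooded request seeds the remote tables so that the reply can be sent as unicast.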

OVERVIEW OF VXLAN BGP EVPN:

  • Flexible Workload placement
  • Reducing flooding in the DC
  • Overlay setup using Control Plane that is independent of specific fabric controller
  • Layer 2 and Layer 3 traffic segmentation
  • VXLAN flood and learn does not meet these requirements.
  • The BGP MPLS-based Ethernet VPN (EVPN) solution was developed to address the limitations of the flood and learn mechanism.
  • In the BGP EVPN solution for the VxLAN overlay, a VLAN is mapped to a VNI for Layer 2 services and a VRF is mapped to a VNI for Layer 3 services on a VTEP.
  • An iBGP EVPN session is established between all the VTEPs, or with an EVPN route reflector (RR), to provide the full mesh connectivity required by iBGP peering rules.
  • After the iBGP EVPN session is established, the VTEPs exchange MAC-VNI or MAC-IP bindings as part of the BGP NLRI update.

DISTRIBUTED ANYCAST GATEWAY:

Distributed anycast gateway refers to the use of anycast gateway addressing and an overlay network to provide a distributed control plane that governs the forwarding of frames within and across a Layer 3 core network.

The distributed anycast gateway functionality facilitates transparent VM mobility and optimal east-west routing by configuring the leaf switches with the same gateway IP and MAC address for each locally defined subnet.

The main benefit of using distributed anycast gateways is that hosts or VMs use the same default gateway IP and MAC address no matter which leaf they are connected to. Thus all VTEPs have the same IP address and MAC address for the Switched Virtual Interface (SVI) in the same VNI.

Within the spine-and-leaf topology, there can be various traffic forwarding combinations. Based on the forwarding types the distributed anycast gateway plays its role in one  of the following manners.

Intra Subnet and Non IP Traffic : For the Host-Host communication that is intrasubnet or non IP, the destination MAC address in the ingress frame is the target end host’s MAC address. Thus the traffic is bridged from VLAN to VNI on the ingress/egress VTEP.

Inter Subnet IP Traffic : For host-to-host communication that is inter-subnet, the destination MAC address in the ingress frame is the default gateway MAC address. Thus the traffic gets routed. On the egress switch, however, there are two possible forwarding behaviors: the traffic can be either routed or bridged.

  • If the inner destination MAC address belongs to the end host, then on the egress switch after VxLAN decapsulation the traffic is bridged.
  • On the other hand, if the inner destination MAC address belongs to the egress switch, the traffic is routed.

ARP SUPPRESSION :

  • An ARP request from a host is flooded in the VLAN.
  • It is possible to optimize the flooding behaviour by maintaining an ARP cache locally on the attached VTEP and generating an ARP response from the information available in the local cache.
  • This is achieved by using the ARP suppression feature.
  • With ARP suppression, network flooding due to host learning can be reduced by making use of gratuitous ARP (G-ARP).
  • Typically a host sends out a G-ARP message when it first comes online. When the local VTEP device receives the ARP, it creates an ARP cache entry and advertises it to the remote leaf VTEPs using BGP Route Type 2 (BGP EVPN MAC route advertisement).
  • The remote leaf node puts the IP-MAC information into its remote ARP cache and suppresses incoming ARP requests for that particular IP.
  • If a VTEP does not have a match for the IP address in its ARP cache table, it floods the ARP request to all other VTEPs in the VNI.

Screen Shot 2017-08-10 at 12.37.49 AM.png

Step 1 : Host 1 in VLAN 100 sends an ARP request for the Host 2 IP address.
Step 2 : VTEP 1 on Leaf-1 intercepts the ARP request. Rather than forwarding it towards the core, it checks its ARP suppression cache table and finds a match for the Host 2 IP address in VLAN 100. Note that other BUM traffic is still sent to the other VTEPs.
Step 3 : VTEP 1 sends the ARP response back to Host 1 with the MAC address of Host 2, thus reducing ARP flooding in the core network.
Step 4 : Host 1 gets the IP and MAC mapping for Host 2 and updates its ARP cache.
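
A minimal sketch of the suppression decision, assuming a cache keyed by VLAN and IP address. The class name, method names and the sample addresses are hypothetical; on a real VTEP the cache is populated from local ARP snooping and from BGP EVPN Route Type 2 updates.

```python
class ArpSuppressionCache:
    """Toy ARP suppression cache: (vlan, ip) -> mac."""

    def __init__(self) -> None:
        self._entries: dict = {}

    def learn(self, vlan: int, ip: str, mac: str) -> None:
        """Install an entry, e.g. from a G-ARP or a received Type 2 route."""
        self._entries[(vlan, ip)] = mac

    def handle_arp_request(self, vlan: int, target_ip: str):
        """Answer locally on a hit, otherwise flood to the other VTEPs in the VNI."""
        mac = self._entries.get((vlan, target_ip))
        if mac is not None:
            return ("reply", mac)
        return ("flood", None)


cache = ArpSuppressionCache()
cache.learn(100, "10.1.100.2", "00:00:00:00:00:02")  # hypothetical Host 2 binding

print(cache.handle_arp_request(100, "10.1.100.2"))   # cache hit, answered by VTEP 1
print(cache.handle_arp_request(100, "10.1.100.9"))   # cache miss, flooded
```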

Integrated Routing and Bridging Mode :

The EVPN draft defines two integrated routing and bridging (IRB) mechanisms :

Asymmetric IRB
Symmetric IRB

1. Asymmetric IRB :

This method requires the source VTEP to be configured with both the source and the destination VNI, for both Layer 2 and Layer 3 forwarding. Asymmetric IRB uses different paths from the source to the destination and back.

Screen Shot 2017-08-10 at 12.38.43 AM.png

Step 1 : Host 1 in VNI A sends a packet towards Host 2 with the source MAC address of Host 1 and the destination MAC address set to the gateway MAC address.

Step 2 : The ingress VTEP routes the packet from the source VNI to the destination VNI; that is, if the source packet was received in VNI-A, the packet is routed to the destination VTEP in VNI-B. When the packet is sent, the source MAC of the inner packet is set to the gateway MAC and the destination MAC to the Host 2 MAC address.

Step 3 : When the packet reaches the destination VTEP, the egress VTEP bridges the packet in the destination VNI.

Step 4 : The return packet follows the same process.

Because the ingress VTEP device needs to be configured with both the source and the destination VNI, this creates a scalability problem: all the VTEP devices must be configured with all the VNIs in the network so that they can learn about all the hosts attached to those VNIs.

PACKET FLOW 

2. Symmetric IRB :

  • Symmetric IRB is the more scalable and preferred option.
  • The VTEP is not required to be configured with all the VNIs.
  • Symmetric IRB uses the same path from the source to the destination and on the way back.
  • In this method the ingress VTEP routes packets from the source VNI to the L3 VNI, where the destination MAC address in the inner header is rewritten to the egress VTEP’s router MAC address.
  • On the egress side, the egress VTEP decapsulates the packet and looks at the inner packet header. Since the destination MAC address of the inner header is its own router MAC address, it performs a Layer 3 routing lookup.
  • Because the Layer 3 VNI (in the VxLAN header) provides the VRF context for the lookup, the packets are routed to the destination VNI and VLAN.

Screen Shot 2017-08-10 at 12.39.28 AM.png

Step 1 : Host 1 in VNI A sends a packet towards VNI B with the source MAC address of Host 1 and the destination MAC address set to the gateway MAC address.

Step 2 : The ingress VTEP routes the packet from the source VNI to the L3 VNI, where the destination MAC address in the inner header is rewritten to the egress VTEP’s router MAC address.

Step 3 : On the egress side, the egress VTEP decapsulates the packet and looks at the inner packet header. Since the destination MAC address of the inner header is its own router MAC address, it performs a Layer 3 routing lookup.

Step 4 : Because the Layer 3 VNI (in the VxLAN header) provides the VRF context for the lookup, the packet is routed to the destination VNI and VLAN.
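
The symmetric IRB steps can be sketched as two rewrite functions, one per VTEP. All names, MAC strings and VNI numbers below are hypothetical. Using the ingress router MAC and the gateway MAC as the rewritten source addresses is an assumption of this sketch; the steps above only specify the destination MAC rewrite.

```python
from dataclasses import dataclass

GATEWAY_MAC = "gw-mac"  # hypothetical anycast gateway MAC shared by all leaves


@dataclass
class InnerFrame:
    src_mac: str
    dst_mac: str
    vni: int


def ingress_route(frame: InnerFrame, l3_vni: int,
                  ingress_router_mac: str, egress_router_mac: str) -> InnerFrame:
    """Step 2: route from the source VNI into the L3 VNI."""
    if frame.dst_mac != GATEWAY_MAC:
        raise ValueError("frame is not addressed to the gateway, so it is bridged")
    return InnerFrame(ingress_router_mac, egress_router_mac, l3_vni)


def egress_route(frame: InnerFrame, own_router_mac: str,
                 host_mac: str, dest_vni: int) -> InnerFrame:
    """Steps 3-4: route from the L3 VNI into the destination VNI."""
    if frame.dst_mac != own_router_mac:
        raise ValueError("inner destination MAC is not this VTEP's router MAC")
    return InnerFrame(GATEWAY_MAC, host_mac, dest_vni)


sent = InnerFrame("host1-mac", GATEWAY_MAC, vni=10)                  # Step 1, VNI A
in_transit = ingress_route(sent, 50000, "rmac-leaf1", "rmac-leaf2")
delivered = egress_route(in_transit, "rmac-leaf2", "host2-mac", 20)  # VNI B

print(in_transit)
print(delivered)
```

Only the L3 VNI appears on the wire between the two VTEPs, which is why the ingress VTEP does not need the destination VNI configured.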

How does VxLAN work?

  • The VXLAN draft defines the VXLAN Tunnel End Point (VTEP), which contains all the functionality needed to provide Ethernet Layer 2 services to connected end systems.
  • VTEPs are intended to be at the edge of the network, typically connecting an access switch (virtual or physical) to an IP transport network. It is expected that the VTEP functionality would be built into the access switch, but it is logically separate from the access switch.
  • End systems connected to the same access switch communicate through the access switch. The access switch acts like any learning bridge does: it floods a frame out of its ports when it does not know the destination MAC, and sends it out of a single port once it has learned, through source MAC learning, which direction leads to the end station.
  • Broadcast traffic is sent out of all ports.
  • Further, the access switch can support multiple “bridge domains”, which are typically identified as VLANs with an associated VLAN ID that is carried in the 802.1Q header on trunk ports. In the case of a VXLAN-enabled switch, the bridge domain would instead be associated with a VXLAN ID.

Screen Shot 2017-08-10 at 12.40.36 AM.png

  • Each VTEP has two interfaces : one is a bridge domain trunk port to the access switch, and the other is an IP interface to the IP network.
  • The VTEP behaves as an IP host to the IP network. It is configured with an IP address based on the subnet its IP interface is connected to. The VTEP uses this IP interface to exchange IP packets carrying the encapsulated Ethernet frames with other VTEPs.
  • A VTEP also acts as an IP host by using IGMP to join IP multicast groups.
  • In addition to a VXLAN ID that is carried over the IP interface between VTEPs, each VXLAN is associated with an IP multicast group. The IP multicast group is used as a communication bus between the VTEPs to carry broadcast, multicast and unknown unicast frames to every VTEP participating in the VXLAN.

Screen Shot 2017-08-10 at 12.41.42 AM.png

  • The VTEP also works the same way as a learning bridge, in that if it doesn’t know where a given destination MAC is, it floods the frame, but it performs this flooding function by sending the frame to the multicast group associated with the VXLAN.
  • Learning is similar, except that instead of learning the source interface associated with a frame’s source MAC, the VTEP learns the encapsulating source IP address. Once it has learned this MAC to remote IP association, frames can be encapsulated within a unicast IP packet sent directly to the destination VTEP.
  • The initial use case for VXLAN-enabled access switches is access switches connected to end system VMs. These switches are tightly integrated with the hypervisor.
  • One benefit of this tight integration is that the virtual access switch knows exactly when a VM connects to or disconnects from the switch, and which VXLAN the VM is connected to. Using this information, the VTEP can decide when to join or leave a VXLAN multicast group. When the first VM connects to a given VXLAN, the VTEP joins the multicast group and starts receiving broadcast, multicast and flooded frames over that group.
  • Similarly, when the last VM connected to a VXLAN disconnects, the VTEP can send an IGMP leave for the multicast group and stop receiving traffic for the VXLAN, which has no local receivers.

Note : Because the potential number of VXLANs (16M) could exceed the amount of multicast state supported by the IP network, multiple VXLANs could potentially map to the same IP multicast group.

While this could result in VXLAN traffic being sent needlessly to a VTEP that has no end systems connected to that VXLAN, inter-VXLAN traffic isolation is still maintained.

The same VXLAN ID is carried in multicast encapsulated packets as in unicast encapsulated packets. It is not the IP network’s job to keep the traffic to the end systems isolated, but the VTEP’s. Only the VTEP inserts and interprets/removes the VXLAN header within the IP/UDP payload. The IP network simply sees IP packets carrying UDP traffic with a well-known destination UDP port.

INTRODUCTION TO MP-BGP (EVPN)

INTRODUCTION TO MP-BGP (EVPN)”

  • Ethernet VPN introduces the concept of BGP MAC routing.
  • It uses MP-BGP for learning MAC addresses between provider edges.
  • Learning between PE and CE is still done in the data plane.
  • The BGP Control Plane has the advantage of scalability and flexibility for MAC routing, just as it does for IP routing.
  • EVPN provides separation between the data plane and control plane, which allows it to use different encapsulation mechanism in the data plane while maintaining the same control plane.
  • IANA has allocated EVPN a new NLRI with an AFI of 25 and SAFI of 70.
  • EVPN/PBB-EVPN introduces four new BGP route types and communities.
  • Various components are involved as part of BGP EVPN control Plane; these work together to implement the VxLAN functionality using control plane learning and discovery mechanism.
  • MP-BGP plays an important role in the VxLAN BGP EVPN feature. Route distribution is carried out via MP-iBGP update messages in the L2VPN EVPN address family.
  • Generally, MP-BGP (EVPN) uses route type 2 to advertise MAC and MAC+IP information of the hosts and route type 3 to carry the VTEP information.
  • The BGP EVPN overlay specifies the distribution and discovery of VTEP using EVPN. The information is carried as EVPN Inclusive Multicast (IM) NLRI.

Screen Shot 2017-08-10 at 12.42.53 AM.png

Encoding of the IM NLRI is based on “Single Virtual Identifier per EVI”, whereby the VNID is mapped to a unique Ethernet VPN instance (EVI).

RD : Route Distinguisher for the EVPN instance
Ethernet Tag ID : VNI for the bridge domain
IP address Length : 1 byte
Originating Router’s IP address : VTEP IP address of the advertising endpoint.

Advertisement and learning of the IP host addresses associated with a VTEP is accomplished via the BGP EVPN MAC advertisement NLRI.

The VTEP information is implicitly sent as the BGP next hop associated with the IP host, and also by providing the VTEP gateway MAC address in the MAC advertisement NLRI.

Screen Shot 2017-08-10 at 12.44.10 AM.png

  • The RT value is either manually configured or auto-generated based on a 2-byte AS number and the VNI value.
  • The route is imported into the correct VLAN or bridge domain based on the import route target configuration.
  • The design for the VxLAN deployment follows the spine-and-leaf architecture.

BGP Route Type
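
As a sketch of the auto-generated route target mentioned above, the following simply combines a 2-byte AS number and a VNI in the common "ASN:VNI" notation. The function name and the sample values are illustrative, and the exact derivation used by a given platform may differ.

```python
def auto_route_target(asn: int, vni: int) -> str:
    """Combine a 2-byte AS number and a VNI into an 'ASN:VNI' route target string."""
    if not 0 < asn < 2 ** 16:
        raise ValueError("expected a 2-byte AS number")
    if not 0 <= vni < 2 ** 24:
        raise ValueError("VNI must fit in 24 bits")
    return f"{asn}:{vni}"


# Hypothetical values: AS 65000 and VNI 10010.
print(auto_route_target(65000, 10010))
```

When every VTEP in the same AS derives the value this way, the import and export route targets for a VNI match without manual configuration.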

TYPE: 1

ROUTE TYPE: Ethernet Auto-Discovery Route
USAGE: MAC mass withdrawal, aliasing, advertising split-horizon labels.
BGP Community : ESI MPLS Label Extended Community

In case of a Multi homed CE device.

Route Type 1 : Ethernet Auto Discovery Routes

  • Ethernet Auto-Discovery (A-D) routes are type 1 mandatory routes and are used for achieving split horizon, fast convergence and aliasing.
  • Only EVPN uses Type 1 routes; PBB-EVPN uses the B-MAC to achieve the same function.
  • A multi-homed PE advertises an auto-discovery route per Ethernet segment with the newly introduced ESI MPLS label extended community.
  • PEs recognise other PEs connected to the same Ethernet segment after the Type 4 Ethernet Segment (ES) route exchange.
  • All the multi-homed and remote PE routers that are part of the EVI import the auto-discovery route.
  • The Ethernet A-D route is not needed when ESI=0, for example when the CE is single-homed.

The ESI label extended community has an eight-bit flags field, which indicates “Single-Active” or “All-Active” redundancy mode.

Split Horizon :

  • Suppose CE1 sends a BUM frame to a non-DF PE, say PE1. PE1 forwards the traffic to all other PEs in the EVPN instance, including the DF PE (PE2 in this example).
  • PE2 must drop the packet and must not forward it back to CE1. This is referred to as split horizon.
  • The ESI label is distributed by all the PEs operating in A-S and A-A mode using the Ethernet A-D route per ES. Ethernet A-D routes are imported by all PEs that are participating in the EVPN instance.

Screen Shot 2017-08-10 at 12.45.08 AM.png

Route Type 2 : MAC advertisement Route:

TYPE: 2
ROUTE TYPE: MAC Advertisement Route
USAGE: Advertising MAC Address reachability, Advertise IP/MAC bindings
BGP Community : MAC Mobility extended community, Default Gateway Extended Community

MAC advertisement routes are type 2 routes. They are responsible for advertising MAC address reachability via MP-BGP to all other PEs in a given EVPN instance.

Learning between PE and CE happens in the data plane. Once PE1 learns the MAC of CE1, it advertises it to the other PEs through the BGP NLRI using a MAC advertisement route. This route contains the RD, the ESI (zero for single-homed cases, non-zero for multi-homed cases), the MAC address, the MPLS label associated with the MAC, and the IP address field, which is optional.

Screen Shot 2017-08-10 at 12.45.41 AM.png

Per EVI Label Assignment : This is similar to the per-VRF label allocation mode in the IP world. A PE advertises a single EVPN label for all the MAC addresses in a given EVI.
This is the most conservative way of allocating labels, and the tradeoff is similar to per-VRF label assignment: the method requires an additional lookup on the egress PE.

Per MAC Address Label Assignment : This is similar to the per-prefix label allocation mode in IP. A PE advertises a unique EVPN label for every MAC address. This is the most liberal way of allocating labels, and the tradeoff is memory consumption and the possibility of running out of label space.

Route Type 3 : Inclusive Multicast Route

TYPE: 3
ROUTE TYPE: Inclusive Multicast Route
USAGE: Multicast Tunnel End point discovery

  • When sending BUM frames, a PE can use ingress replication, P2MP or MP2MP (mLDP) LSPs.
  • Every PE participating in an EVI advertises its mcast labels during its startup sequence via Inclusive Multicast routes.
  • Inclusive Multicast routes are BGP route type 3.
  • Once a PE has received mcast routes from all the other PEs and a BUM frame arrives, the PE performs ingress replication by attaching the mcast label of each receiving PE.

Screen Shot 2017-08-10 at 12.46.23 AM.png

  • In the figure above, PE1 (label 16006) and PE3 (label 16001) advertise their multicast labels to PE2.
  • When PE2 receives a broadcast packet, it adds the mcast label 16001 plus the label to reach PE3, and sends the packet to PE3.
  • PE2 also forwards the packet to PE1 by adding the ESI label, the mcast label 16006 and the label to reach PE1.
  • PE3 receives the packet, sees the mcast label and treats the packet as a BUM frame. When PE1 receives the packet, it notices the ESI label that was advertised as part of the Ethernet A-D route and drops the packet.

Route Type 4 : Ethernet Segment Route

TYPE: 4
ROUTE TYPE: Ethernet Segment Route
USAGE: Redundancy group discovery, DF election
BGP Community : ES-Import extended community

  • In the case of a multi-homed CE device, the set of Ethernet links comprises an Ethernet segment. A unique Ethernet Segment Identifier (ESI) identifies this Ethernet segment; it can be manually configured or automatically derived.
  • When a single-homed CE is attached to an Ethernet segment, the ESI value is zero.
  • A multi-homed PE advertises an Ethernet Segment route (BGP Route Type 4) carrying the newly introduced ES-Import extended community, which is derived from the ESI value.
  • A PE automatically imports the route if its ESI value matches the ES-Import community.
  • This process is also referred to as auto-discovery and allows PEs connected to the same Ethernet segment to discover each other.

Screen Shot 2017-08-10 at 12.46.59 AM.png

  • PE1 and PE2 have the same ESI value (ES1); PE1 advertises its ESI value in the Ethernet Segment route with the ES-Import community set to ES1.
  • PE2 and PE3 receive the route, but only PE2 imports it, since it has a matching ESI value.
  • This ensures that PE2 knows that PE1 is connected to the same CE device.
  • After auto-discovery, the Designated Forwarder (DF) election happens for the multi-homed CE.
  • The PE that assumes the role of DF is responsible for forwarding BUM frames on a given segment to the CE.
  • For the DF election, each PE first builds an ordered list of the IP addresses of all the PE nodes, in ascending order.

For example :

PE1 : 1.1.1.1
PE2 : 2.2.2.2

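Position   PE
0          PE1 1.1.1.1
1          PE2 2.2.2.2

Ethernet Tag value   Ethernet Tag mod N (N = 2 PEs)
300                  0
301                  1

The PE whose position equals the Ethernet Tag modulo the number of PEs becomes the DF, so PE1 becomes DF for Ethernet Tag 300 and PE2 becomes DF for Ethernet Tag 301.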
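
The election in this example can be sketched as follows, using the modulo rule of the default DF election in RFC 7432: sort the PE addresses in ascending order and pick the PE at position (Ethernet Tag mod N). The function name is illustrative.

```python
import ipaddress


def elect_df(pe_ips: list, ethernet_tag: int) -> str:
    """Return the IP address of the Designated Forwarder for an Ethernet Tag."""
    # Sort numerically so that, for example, 10.0.0.1 sorts after 2.2.2.2.
    ordered = sorted(pe_ips, key=lambda ip: int(ipaddress.ip_address(ip)))
    return ordered[ethernet_tag % len(ordered)]


pes = ["2.2.2.2", "1.1.1.1"]  # PE2 and PE1 from the example, in arbitrary order

for tag in (300, 301):
    print(tag, elect_df(pes, tag))
```

Spreading the DF role by Ethernet Tag in this way distributes BUM forwarding for different VLANs across the multi-homed PEs.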

Route Type 5 : IP PREFIX ADVERTISEMENT IN EVPN

TYPE: 5
ROUTE TYPE: IP PREFIX ADVERTISEMENT IN EVPN

  • It is a mechanism to carry IPv4 and IPv6 prefix advertisements in EVPN-only networks.
  • While EVPN type 2 routes can carry both MAC addresses and IP addresses, tight coupling of IP prefixes with a specific MAC address might not be desirable. The draft discusses different scenarios where such coupling is not desirable.

Screen Shot 2017-08-10 at 12.47.55 AM.png

GW IP ADDRESS : A 32-bit or 128-bit field that encodes an overlay IP index for the IP prefix. The GW IP field should be zero if it is not used as an overlay index.

MPLS LABEL : The MPLS label field is encoded as 3 octets, where the high-order 20 bits contain the label value. This should be null when the IP prefix route is used for recursive lookup resolution.
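
The 3-octet label encoding can be sketched as a 4-bit shift: the 20-bit label sits in the high-order bits and the remaining low-order 4 bits are left at zero here. The function names are illustrative, and label 16001 is borrowed from the Route Type 3 example purely as sample data.

```python
def encode_label_field(label: int) -> bytes:
    """Place a 20-bit MPLS label in the high-order bits of a 3-octet field."""
    if not 0 <= label < 2 ** 20:
        raise ValueError("MPLS label must fit in 20 bits")
    return (label << 4).to_bytes(3, "big")


def decode_label_field(field: bytes) -> int:
    """Extract the 20-bit label from a 3-octet field."""
    return int.from_bytes(field[:3], "big") >> 4


encoded = encode_label_field(16001)
print(encoded.hex(), decode_label_field(encoded))
```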

The prefix advertisement draft introduces the concept of an “overlay index”. When an overlay index is present in the Route Type 5 advertisement, the receiving NVE (PE) needs to perform a recursive route resolution to find out which egress NVE (PE) to forward the packet to.

The route will contain a single overlay index at most: if the ESI field is different from zero, the GW IP field will be zero, and vice versa.

Reference : https://tools.ietf.org/html/draft-ietf-bess-evpn-prefix-advertisement-05#page-7
