BGP PIC – Prefix Independent Convergence

Border Gateway Protocol (BGP) is a widely used routing protocol, but it has a long-standing weakness: slow convergence.

Cisco supports PIC on all of its routing platforms (IOS, IOS-XE, IOS-XR and NX-OS). The BGP PIC Edge and Core feature for IP and Multiprotocol Label Switching (MPLS) improves convergence after a network failure. It applies to both core and edge failures, on IP and MPLS networks.

Normally, BGP can take several seconds to a few minutes to converge after a network change. At a high level, BGP goes through the following process:

  1. BGP learns of failures through either Interior Gateway Protocol (IGP) or Bidirectional Forwarding Detection (BFD) events or interface events.
  2. BGP withdraws the routes from the Routing Information Base (RIB), and the RIB withdraws the routes from the Forwarding Information Base (FIB) and distributed FIB (dFIB). This process clears the data path for the affected prefixes.
  3. BGP sends withdraw messages to its neighbours.
  4. BGP calculates the next best path to the affected prefixes.
  5. BGP inserts the next best path for affected prefixes into the RIB and the RIB installs them in the FIB and dFIB.
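
The steps above can be sketched as a toy model to show why the work grows with the number of prefixes. This is only an illustration under simplifying assumptions: the function name `converge_per_prefix`, the dictionary layout and the next-hop labels `B1`/`B2` are hypothetical, and real BGP implementations batch and schedule this work very differently.

```python
def converge_per_prefix(rib, fib, failed_next_hop):
    """Toy model of the five steps above; returns the prefixes withdrawn.

    rib: dict mapping prefix -> list of candidate next hops, best first
    fib: dict mapping prefix -> next hop currently used for forwarding
    """
    # Step 1: the failure has been detected; find every prefix that used it.
    affected = [p for p, nh in fib.items() if nh == failed_next_hop]

    withdrawn = []
    for prefix in affected:
        # Step 2: remove the failed path from the RIB and the FIB.
        rib[prefix] = [nh for nh in rib[prefix] if nh != failed_next_hop]
        del fib[prefix]
        # Step 3: record the withdraw that would be sent to neighbours
        # (simplified: one withdraw per affected prefix).
        withdrawn.append(prefix)
        # Steps 4 and 5: pick the next best path, if any, and reinstall it.
        if rib[prefix]:
            fib[prefix] = rib[prefix][0]
    return withdrawn


rib = {
    "192.0.2.0/24": ["B1", "B2"],
    "198.51.100.0/24": ["B1"],
    "203.0.113.0/24": ["B2", "B1"],
}
fib = {prefix: paths[0] for prefix, paths in rib.items()}
withdrawn = converge_per_prefix(rib, fib, "B1")
print(len(withdrawn), "prefixes were processed one by one")
print(fib)
```

The loop body runs once per affected prefix, so in this model the time to restore forwarding is proportional to the number of prefixes behind the failed next hop.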

In a network comprising thousands of iBGP peers exchanging millions of routes, many routes are reachable via more than one path. At that scale, it is desirable to restore traffic after a failure within a time that does not depend on the number of BGP prefixes.

An IGP (OSPF/IS-IS) deals with hundreds of routes, a few thousand at most, and only a few hundred of those are really important. BGP is designed to carry millions of routes, and a few large customers do carry that many prefixes!

One reason BGP converges slowly is that the router's FIB associates each BGP prefix directly with an outgoing interface. That is what CEF does: to speed up route lookup, it resolves the recursive lookup on the BGP prefix ahead of time and stores the directly connected next hop in the FIB. This becomes a problem when there are many BGP prefixes, because after a failure, once the IGP has converged, CEF must update the connected next hop for ALL BGP prefixes.

The solution is BGP PIC, which lets the router update the BGP next hop in the FIB by using a hierarchical FIB structure. It is a very simple idea: all BGP prefixes that share the same connected next hop point to ONE next-hop entry. So instead of changing thousands of connected next hops, we change just ONE.

The end result is that the FIB update time for BGP is independent of the number of prefixes, so BGP convergence time stays the same regardless of how many prefixes the router carries.
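
To make the difference concrete, here is a minimal Python sketch that contrasts a flat FIB with a hierarchical one. It is an illustration only: the class names `FlatFib` and `HierarchicalFib`, the next-hop label `PE2` and the interface names are all hypothetical, and this is not how CEF is actually implemented.

```python
class FlatFib:
    """Each prefix stores its own copy of the resolved outgoing interface."""

    def __init__(self):
        self.entries = {}  # prefix -> (bgp_next_hop, interface)

    def install(self, prefix, bgp_next_hop, interface):
        self.entries[prefix] = (bgp_next_hop, interface)

    def lookup(self, prefix):
        return self.entries[prefix][1]

    def reroute(self, bgp_next_hop, new_interface):
        """Rewrite every prefix behind bgp_next_hop; return the write count."""
        writes = 0
        for prefix, (next_hop, _) in list(self.entries.items()):
            if next_hop == bgp_next_hop:
                self.entries[prefix] = (next_hop, new_interface)
                writes += 1
        return writes


class HierarchicalFib:
    """Prefixes point to a shared next-hop entry that holds the interface."""

    def __init__(self):
        self.prefix_to_next_hop = {}     # prefix -> bgp_next_hop
        self.next_hop_to_interface = {}  # bgp_next_hop -> interface

    def install(self, prefix, bgp_next_hop, interface):
        self.prefix_to_next_hop[prefix] = bgp_next_hop
        self.next_hop_to_interface[bgp_next_hop] = interface

    def lookup(self, prefix):
        return self.next_hop_to_interface[self.prefix_to_next_hop[prefix]]

    def reroute(self, bgp_next_hop, new_interface):
        """Update the single shared entry; return the write count."""
        self.next_hop_to_interface[bgp_next_hop] = new_interface
        return 1


def count_reroute_writes(num_prefixes):
    """Install num_prefixes behind one next hop, then move it to a new interface."""
    flat, hier = FlatFib(), HierarchicalFib()
    for i in range(num_prefixes):
        prefix = f"10.{i // 256}.{i % 256}.0/24"
        flat.install(prefix, "PE2", "Gi0/0")
        hier.install(prefix, "PE2", "Gi0/0")
    return flat.reroute("PE2", "Gi0/1"), hier.reroute("PE2", "Gi0/1")


print(count_reroute_writes(1000))
```

With 1000 prefixes the flat table needs 1000 writes while the hierarchical table needs one, and that count stays at one however many prefixes share the next hop.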

BGP PIC

P and PE failures can be detected by the IGP. Tuned OSPF and IS-IS can both converge within 1 second, and both support FRR LFA, which enables local repair within 50 ms.

The PE-CE link is typically not carried in the SP IGP, so convergence there depends on MP-BGP. MP-BGP convergence is slow, and it gets slower as the number of prefixes increases.

What is BGP PIC (BGP FRR) 

The BGP PIC Edge for IP and MPLS-VPN feature improves BGP convergence after a network failure. This convergence is applicable to both core and edge failures and can be used in both IP and MPLS networks. The BGP PIC Edge for IP and MPLS-VPN feature creates and stores a backup/alternate path in the routing information base (RIB), forwarding information base (FIB), and Cisco Express Forwarding and LFIB so that when a failure is detected, the backup/alternate path can immediately take over, thus enabling fast failover.

BGP PIC is essentially the BGP equivalent of FRR, plus a RIB/FIB/LFIB optimization that uses a hierarchy on the next hop.
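
The idea of a pre-installed backup path can be sketched in a few lines of Python. This is a hedged illustration, not Cisco's implementation: the `PathList` class, the `primary_up` flag and the next-hop labels `B1`/`B2` are hypothetical names chosen for the example.

```python
class PathList:
    """Primary and backup BGP next hops, shared by many prefixes."""

    def __init__(self, primary, backup):
        self.primary = primary
        self.backup = backup      # installed in advance, before any failure
        self.primary_up = True

    def active(self):
        """Return the next hop that forwarding should use right now."""
        return self.primary if self.primary_up else self.backup


def build_fib(prefixes, path_list):
    """Every prefix references the same PathList object (no per-prefix copy)."""
    return {prefix: path_list for prefix in prefixes}


def forward(fib, prefix):
    return fib[prefix].active()


shared = PathList(primary="B1", backup="B2")
fib = build_fib(["192.0.2.0/24", "198.51.100.0/24", "203.0.113.0/24"], shared)

before = [forward(fib, p) for p in fib]
shared.primary_up = False  # failure detected: one flag flip, no per-prefix work
after = [forward(fib, p) for p in fib]
print(before, after)
```

Because the backup is already present in the shared object, failover is a single state change. The slower BGP best-path recomputation can then run in the background without affecting forwarding.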

Hierarchical FIB – Advantages

• Routing Convergence: BGP PIC Core
  – The BGP dependents converge at the IGP convergence of their next hop
• Scaling and Robustness
  – Smaller FIB memory
  – Much lower CPU requirement

 BGP PIC Edge

Two ISPs peer at multiple locations and exchange 350k BGP prefixes.

R3, a typical PE within the grey ISP, installs each of these 350k BGP prefixes as a multipath pair of entries (BGP next hop B1, BGP next hop B2).

We send traffic from R3 to each of the 350k IPv4 BGP prefixes of the yellow ISP and measure the loss of connectivity upon the failure of a peering node (R1).

Within 180 ms, as a consequence of its IGP convergence, R3 has updated its FIB tree so that all 350k IPv4 BGP prefixes are forwarded via the alternate path through R2.

Normal BGP convergence then takes place, which involves R2 sending 350k withdraws and R3 running 350k best-path computations and RIB modifications. This process will likely take multiple tens of seconds, but it has no impact on the service thanks to the FIB fix-up provided by “BGP PIC” at IGP convergence time.

 
