vPC – Part III

Virtual Port Channel Series: This series of posts discusses vPC in detail. The topics below are covered in separate posts.
  1. Basics of vPC: Virtual Port Channel (vPC) – Part 1
  2. vPC Inconsistency and Control Plane: vPC – Part II
  3. vPC Failure Scenarios
  4. vPC with HSRP: vPC with HSRP – Part IV
  5. vPC Design Variations

So far, we have learned the concepts of vPC and how it works. We also looked at how the control plane functions, along with some key features.

Today, we are going to discuss various failure scenarios and how each of them affects vPC.

  1. vPC Peer Link Failure
  2. vPC Keepalive Link Failure
  3. Peer Link and Keepalive Link Failure
  4. Primary Peer Switch Failure
  5. Primary Switch and Peer Link Failure
  6. Both Primary and Secondary Switch Failure

Failure Scenarios:

1. vPC peer link failure:

When vPC encounters a peer link failure, the following sequence of events happens:

  • The peer status changes to “Peer Link is down” on both vPC switches.
  • Because the peer keepalive link is still up, both switches know that their peer is alive.
  • Each switch therefore retains its vPC role, and the secondary does not take over as primary, so we do not end up in a split-brain (dual-active) situation.
  • A peer link failure means the loss of the east-west path between the peers. To limit the impact, the secondary peer suspends its vPC member ports; its orphan ports stay up.
  • This prevents duplicate frames and loops in the network.
  • Unfortunately, traffic for the orphan ports on the secondary is then blackholed: those ports stay up, but the secondary can no longer forward through the peer link or its suspended vPC member ports.

Let's shut down the peer link and see that vPC is still operational even though the peer link is down.

[Image: f1]

On the secondary peer, notice that the vPC member ports are down with the reason “Peer-link is down”.

[Image: f2]

The updated topology is shown below:

[Image: peerli]

What if a new port is added when the peer link has already failed? Will it come up?

Let's recall the vPC order of operations: it includes a consistency check as one of its steps. With the peer link down, the consistency check fails, so the new ports stay down as well. They are brought up once the peer link is restored.
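To make the port behavior in this scenario concrete, here is a minimal Python sketch of which ports stay up while the peer link is down. It is only an illustrative model of the rules described above, not anything the switch runs: the function name `port_is_up`, the role strings, and the port kinds `"vpc_member"` and `"orphan"` are all hypothetical, and the keepalive link is assumed to be up.

```python
def port_is_up(role: str, port_kind: str, peer_link_up: bool,
               added_during_outage: bool = False) -> bool:
    """Hypothetical model of scenario 1 (keepalive link assumed to be up).

    role:      "primary" or "secondary"
    port_kind: "vpc_member" or "orphan"
    """
    if peer_link_up:
        return True   # no failure: every port forwards as usual
    if port_kind == "orphan":
        return True   # orphan ports are not suspended
    if added_during_outage:
        return False  # consistency check cannot pass without the peer link
    # Existing vPC member ports: only the primary keeps them up.
    return role == "primary"


for role in ("primary", "secondary"):
    for kind in ("vpc_member", "orphan"):
        print(role, kind, port_is_up(role, kind, peer_link_up=False))
```

The only combination that goes down for existing ports is a vPC member port on the secondary, which matches the “Peer-link is down” reason seen above.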

2. vPC Keepalive Link Failure:

  • When the vPC keepalive link fails and the peer link is still up, both switches can still communicate with each other over the peer link.
  • So they retain their vPC roles, and the overall functionality of vPC is not impacted.

vPC status after the keepalive link is shut down:

Primary:

[Image: f4]

Secondary:

[Image: f5]

3. Peer and Keepalive link failure:

When both links fail, each switch assumes that its peer is dead and takes on the operational primary role. We therefore end up in an active-active (dual-active) scenario: both switches now forward traffic, which can create Layer 2 loops.

[Image: f8]

This kind of situation requires manual intervention to recover.

Primary acting as primary:

[Image: f6]

Secondary also acting as operational primary:

[Image: f7]

Though it sounds rare, poor network design can cause both the peer link and the keepalive link to fail at the same time. This can happen in the following circumstances:

  • Both links share the same module, and that module fails on the switch.
  • The links are on different modules but connect to the peer switch through a common Layer 2 device, and that common device fails.

As a best practice, the peer link bundle and the keepalive link should not share the same fate. There should be redundancy in place in case a failure occurs.

If the keepalive link runs over an SVI, the “dual-active exclude interface-vlan” command can be used to keep that SVI up when the peer link fails.
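The role logic behind the first three scenarios can be summarized in a few lines of Python. This is a simplified sketch under the assumptions stated in this post (both switches are healthy and only the links fail); the function names `operational_role` and `is_dual_active` are hypothetical and exist only for illustration.

```python
def operational_role(configured_role: str, peer_link_up: bool,
                     keepalive_up: bool) -> str:
    """Role a healthy switch acts in, given which links to its peer are up."""
    if peer_link_up or keepalive_up:
        # At least one link proves the peer is alive: keep the current role.
        return configured_role
    # No sign of the peer on either link: assume it is dead and take over.
    return "primary"


def is_dual_active(peer_link_up: bool, keepalive_up: bool) -> bool:
    """True when both healthy switches end up acting as primary."""
    roles = [operational_role(role, peer_link_up, keepalive_up)
             for role in ("primary", "secondary")]
    return roles == ["primary", "primary"]


for peer_link in (True, False):
    for keepalive in (True, False):
        print(f"peer_link={peer_link} keepalive={keepalive} "
              f"dual_active={is_dual_active(peer_link, keepalive)}")
```

Only the combination where both links are down yields a dual-active pair, which is why the two links should never share a single point of failure.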

4. Primary Peer Switch failure

Suppose there is a power outage and the primary switch is powered off. The secondary switch considers its peer to be down because both the peer link and the keepalive link have gone down. Once three keepalives are missed, the secondary takes over the primary role and starts forwarding traffic.

[Image: f12]

When the original primary switch comes back up, it takes the operational secondary role because the vPC role is non-preemptive. Preemption would cause additional traffic loss, which is not acceptable.

So if you come across output where the role is “Secondary, operational primary”, it indicates that this is the result of a past failure.
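The non-preemptive behavior can be illustrated with a small Python model. It is a sketch only: the class `VpcPair`, the switch names `"A"` and `"B"`, and the way the role strings are built are hypothetical, chosen to mirror the wording quoted above rather than any real device logic.

```python
class VpcPair:
    """Two-switch model; the switch names "A" and "B" are illustrative."""

    def __init__(self) -> None:
        self.configured = {"A": "primary", "B": "secondary"}
        self.operational = dict(self.configured)
        self.alive = {"A": True, "B": True}

    @staticmethod
    def _peer(name: str) -> str:
        return "B" if name == "A" else "A"

    def fail(self, name: str) -> None:
        """Power off one switch; a surviving peer takes over as primary."""
        self.alive[name] = False
        peer = self._peer(name)
        if self.alive[peer]:
            self.operational[peer] = "primary"

    def recover(self, name: str) -> None:
        """Power a switch back on; it never preempts an active primary."""
        self.alive[name] = True
        peer = self._peer(name)
        if self.alive[peer] and self.operational[peer] == "primary":
            self.operational[name] = "secondary"

    def role(self, name: str) -> str:
        """Configured role, plus the operational role when they differ."""
        cfg, op = self.configured[name], self.operational[name]
        return cfg if cfg == op else f"{cfg}, operational {op}"


pair = VpcPair()
pair.fail("A")      # the primary loses power
pair.recover("A")   # ...and later comes back
print("A:", pair.role("A"))  # A: primary, operational secondary
print("B:", pair.role("B"))  # B: secondary, operational primary
```

The roles stay swapped after recovery because nothing in `recover` forces the returning switch back to its configured role.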

5. Primary Switch and Peer Link Failure

Think of a situation where the peer link fails first, so the secondary suspends its vPC member ports. The primary keeps forwarding traffic, and then it suddenly fails as well. The secondary stops receiving keepalive heartbeats and suspects that the primary has failed. Once three keepalives are missed, the secondary re-enables its ports and assumes the primary role.

[Image: peerli]

[Image: f11]

Because keepalives are sent every second, there will be a traffic disruption of around 4-5 seconds. This can be reduced by setting the keepalive interval to a lower value using the command below:

[Image: f9]
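As a rough illustration of why a shorter keepalive interval helps, the Python sketch below computes how long it takes to miss three consecutive keepalives. It assumes, as this post does, that the failure is declared after three missed keepalives and that detection scales linearly with the interval; the function name `detection_time_s` is hypothetical, the 400 ms value is only an example, and real platform timers may behave differently. The result covers only the detection part of the 4-5 second disruption.

```python
def detection_time_s(interval_ms: int, missed: int = 3) -> float:
    """Seconds until `missed` consecutive keepalives have been lost."""
    if interval_ms <= 0 or missed <= 0:
        raise ValueError("interval_ms and missed must be positive")
    return interval_ms * missed / 1000.0


# Keepalives every second versus an example shorter interval.
for interval_ms in (1000, 400):
    print(f"{interval_ms} ms interval -> "
          f"{detection_time_s(interval_ms)} s to miss three keepalives")
```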

6. Power outage on both vPC peer switches

If there is a power outage and both switches go down, vPC is completely down, causing a complete outage. Once power is restored, if only one of the switches comes up, it will not hear keepalives from its peer, so the peer link will not come up. This in turn prevents the vPC member ports from coming up.

So even though one of the switches is restored, we still experience complete isolation. “Auto recovery” is an option used to overcome this failure situation.

[Image: f10]

This allows the switch to assume the primary role and start forwarding traffic if the peer does not come up.

In the coming posts of this vPC series, we will discuss vPC flavors, data plane forwarding, and HSRP with vPC.
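The decision that auto recovery makes can be sketched in Python as follows. This is an assumption-laden model, not the actual implementation: the function name `should_assume_primary` is hypothetical, and the 240-second wait is an assumed default that should be checked against the documentation for your platform.

```python
def should_assume_primary(auto_recovery: bool, peer_seen: bool,
                          waited_s: float,
                          reload_delay_s: float = 240.0) -> bool:
    """Decide whether a lone, freshly booted switch may act as primary.

    reload_delay_s is an assumed default, not a value taken from this post.
    """
    if peer_seen:
        return False  # the peer is back: normal role negotiation applies
    # Without auto recovery the switch keeps waiting for its peer forever.
    return auto_recovery and waited_s >= reload_delay_s


print(should_assume_primary(False, False, 600))  # False: ports stay down
print(should_assume_primary(True, False, 100))   # False: still waiting
print(should_assume_primary(True, False, 240))   # True: takes primary role
```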

