OSPF

OSPF Troubleshooting

Friends, in this post I’ll share an approach to troubleshooting OSPF, one of the most commonly used interior gateway protocols (IGPs) across all domains.

It routes IP packets based solely on the destination IP address and IP Type of Service found in the IP packet header.  IP packets are routed “as is” — they are not encapsulated in any further protocol headers as they transit the Autonomous System.

OSPF is a dynamic routing protocol. It quickly detects topological changes in the AS (such as router interface failures) and calculates new loop-free routes after a period of convergence.  This period of convergence is short and involves a minimum of routing traffic.

The way in which OSPF routers address OSPF packets varies with the OSPF network type.

  • Broadcast Networks: For broadcast networks, OSPF routers use the following two reserved IP multicast addresses:
  1. 224.0.0.5 – AllSPFRouters: Used to send OSPF messages to all OSPF routers on the same network. The AllSPFRouters address is used for Hello packets. The DR and BDR use this address to send Link State Update and Link State Acknowledgment packets.
  2. 224.0.0.6 – AllDRouters: Used to send OSPF messages to all OSPF DRs (the DR and the BDR) on the same network. All OSPF routers except the DR use this address when sending Link State Update and Link State Acknowledgment packets to the DR.
  • Point-to-Point Networks: Point-to-Point networks use the AllSPFRouters address (224.0.0.5) for all OSPF messages.
  • NBMA Networks: NBMA networks have no multicasting capability. Therefore, the destination IP address of any Hello or Link State packet is the unicast IP address of a specific neighbor. The neighbor IP address is a required part of OSPF configuration for NBMA network links.
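The addressing rules above can be summed up in a few lines of Python. This is only a sketch of the rules as described in this post; the function names and the network type strings are my own and do not come from any router software.

```python
ALL_SPF_ROUTERS = "224.0.0.5"
ALL_D_ROUTERS = "224.0.0.6"


def hello_destination(network_type, neighbor_ip=None):
    """Return the destination IP of a Hello packet for a given network type."""
    if network_type in ("broadcast", "point-to-point"):
        return ALL_SPF_ROUTERS
    if network_type == "nbma":
        # No multicast on NBMA: the neighbor must be configured explicitly.
        if neighbor_ip is None:
            raise ValueError("NBMA links need a configured neighbor address")
        return neighbor_ip
    raise ValueError(f"unknown network type: {network_type}")


def lsu_destination_on_broadcast(sender_is_dr_or_bdr):
    """Return the destination of an LSU or LSAck on a broadcast segment."""
    # The DR and BDR talk to everyone; the other routers talk to the DRs.
    return ALL_SPF_ROUTERS if sender_is_dr_or_bdr else ALL_D_ROUTERS


print(hello_destination("broadcast"))
print(hello_destination("nbma", neighbor_ip="10.0.0.2"))
print(lsu_destination_on_broadcast(sender_is_dr_or_bdr=False))
```

The NBMA neighbor address 10.0.0.2 is just a sample value.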

 Configure OSPF

router ospf 1
router-id 1.1.1.1
network 0.0.0.0 255.255.255.255 area 0
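The network statement uses a wildcard mask, where a 1 bit means "don't care". A rough Python sketch of that matching logic shows why 0.0.0.0 255.255.255.255 enables OSPF on every interface. The function name and the sample addresses are only illustrative.

```python
import ipaddress


def network_statement_matches(interface_ip, network, wildcard):
    """Return True if an interface address falls under a network statement."""
    ip = int(ipaddress.IPv4Address(interface_ip))
    net = int(ipaddress.IPv4Address(network))
    wild = int(ipaddress.IPv4Address(wildcard))
    care = ~wild & 0xFFFFFFFF  # bits that must match exactly
    return (ip & care) == (net & care)


# The statement from the configuration above matches any address.
print(network_statement_matches("10.1.1.1", "0.0.0.0", "255.255.255.255"))
# A narrower statement only matches addresses in 10.1.1.0/24.
print(network_statement_matches("10.1.1.1", "10.1.1.0", "0.0.0.255"))
print(network_statement_matches("10.1.2.1", "10.1.1.0", "0.0.0.255"))
```

In production a narrower statement is usually the safer choice, so that OSPF is not enabled on interfaces by accident.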

Let’s discuss a detailed approach to OSPF troubleshooting:

 OSPF Neighbor States: Use “show ip ospf neighbor” to check the OSPF neighbor state. The possible states are:

  1. Down: No information has been received from anybody on the segment.
  2. Attempt: On non-broadcast multi-access clouds, this state indicates that no recent information has been received from the neighbor. An effort should be made to contact the neighbor by sending Hello packets at the reduced rate PollInterval.
  3. Init: The interface has detected a Hello packet coming from a neighbor, but bi-directional communication has not yet been established.
  4. Two-way: There is bi-directional communication with a neighbor. The router has seen itself in the Hello packets coming from the neighbor. By the end of this stage the DR and BDR election has been completed, and the routers decide whether or not to proceed in building an adjacency. The decision is based on whether one of the routers is a DR or BDR, or whether the link is a point-to-point or a virtual link.
  5. Exstart: Routers are trying to establish the initial sequence number that is going to be used in the information exchange packets. The sequence number ensures that routers always get the most recent information. One router will become the primary and the other will become secondary. The primary router will poll the secondary for information.
  6. Exchange: Routers will describe their entire link-state database by sending database description packets. At this state, packets could be flooded to other interfaces on the router.
  7. Loading: At this state, routers are finalizing the information exchange. Routers have built a link-state request list and a link-state retransmission list. Any information that looks incomplete or outdated will be put on the request list. Any update that is sent will be put on the retransmission list until it gets acknowledged.
  8. Full: At this state, the adjacency is complete. The neighboring routers are fully adjacent. Adjacent routers will have identical link-state databases.

If the OSPF adjacency is not coming up, or is stuck in a transitional state when you run “show ip ospf neighbor” on the Cisco IOS CLI, check the following.

For OSPF to build a neighbor relationship, a few requirements have to be met:

  1. Routers must be in the same subnet.
  2. Hello and dead timers must match.
  3. Router IDs must be unique.
  4. Routers must be in the same area.
  5. Stub flag must be identical.
  6. IP MTU must be identical.
  7. Must pass neighbor authentication (if configured).
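As a rough illustration, the checklist above can be written as a small comparison of the two sides of a link. The OspfInterface class and its field names are hypothetical; in practice you would fill them in by hand from “show ip ospf interface” and “show interface” on both routers.

```python
import ipaddress
from dataclasses import dataclass


@dataclass
class OspfInterface:
    """Hypothetical view of one router's OSPF settings on a link."""
    router_id: str
    address: str         # interface address with prefix, e.g. "10.0.0.1/24"
    area: str
    hello: int
    dead: int
    stub: bool
    mtu: int
    auth_type: int = 0   # 0 = none, 1 = clear text, 2 = MD5
    auth_key: str = ""


def adjacency_problems(a, b):
    """Return the requirements from the list above that a and b fail to meet."""
    problems = []
    net_a = ipaddress.ip_interface(a.address).network
    net_b = ipaddress.ip_interface(b.address).network
    if net_a != net_b:
        problems.append("not in the same subnet")
    if (a.hello, a.dead) != (b.hello, b.dead):
        problems.append("hello/dead timers differ")
    if a.router_id == b.router_id:
        problems.append("duplicate router ID")
    if a.area != b.area:
        problems.append("area mismatch")
    if a.stub != b.stub:
        problems.append("stub flag mismatch")
    if a.mtu != b.mtu:
        problems.append("IP MTU mismatch")
    if (a.auth_type, a.auth_key) != (b.auth_type, b.auth_key):
        problems.append("authentication mismatch")
    return problems


r1 = OspfInterface("1.1.1.1", "10.0.0.1/24", "0", 10, 40, False, 1500)
r2 = OspfInterface("2.2.2.2", "10.0.0.2/24", "0", 10, 30, False, 1400)
print(adjacency_problems(r1, r2))
```

With these sample values the sketch reports the timer and MTU mismatches, which are the two things to fix before the adjacency can form.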


A flapping link causes every router in the area to recalculate SPF. The SPF timers minimize the impact:

  • SPF schedule delay – wait 5 seconds after receiving an LSU/LSA before running SPF
  • Hold time – wait a minimum of 10 seconds before running another SPF
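To get a feel for what these two timers buy you, here is a deliberately simplified model in Python. It assumes that every topology change arriving while an SPF run is pending is absorbed by that run; real implementations are more elaborate, so treat the numbers as illustrative only.

```python
def spf_run_times(trigger_times, delay=5, hold=10):
    """Return the times (in seconds) at which SPF would run.

    Simplified model: a run is scheduled `delay` seconds after a change,
    but never sooner than `hold` seconds after the previous run.
    """
    runs = []
    for t in sorted(trigger_times):
        if runs and t <= runs[-1]:
            continue  # a run is already pending and will pick up this change
        earliest = t + delay
        if runs:
            earliest = max(earliest, runs[-1] + hold)
        runs.append(earliest)
    return runs


# A link that flaps every 2 seconds for one minute (30 changes).
flaps = list(range(0, 60, 2))
runs = spf_run_times(flaps)
print(len(flaps), "changes ->", len(runs), "SPF runs")
```

The point of the sketch is the ratio: many changes collapse into a handful of SPF runs instead of one run per change.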

 Common OSPF adjacency fail issues:

  1. OSPF Neighbor table does not display adjoining router.
  2. OSPF Neighbor status is stuck in INIT state.
  3. OSPF Neighbor status is stuck in 2-way state.
  4. OSPF Neighbor status is stuck in EXSTART/EXCHANGE state.
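When a router has many neighbors, it helps to filter the neighbor table for anything that is not in a healthy state. The sketch below parses text in the usual column layout of “show ip ospf neighbor”. The sample output is made up for illustration, and the function is my own helper, not a Cisco tool.

```python
SAMPLE_OUTPUT = """\
Neighbor ID     Pri   State           Dead Time   Address         Interface
2.2.2.2           1   FULL/DR         00:00:35    10.0.0.2        GigabitEthernet0/0
3.3.3.3           1   2WAY/DROTHER    00:00:33    10.0.0.3        GigabitEthernet0/0
4.4.4.4           1   EXSTART/BDR     00:00:38    10.0.1.4        GigabitEthernet0/1
5.5.5.5           0   INIT/DROTHER    00:00:31    10.0.2.5        GigabitEthernet0/2
"""


def suspicious_neighbors(output):
    """Return (router_id, state, interface) for neighbors worth a closer look."""
    flagged = []
    for line in output.splitlines():
        fields = line.split()
        # Data rows have 6 columns (7 on point-to-point links, where the
        # state is printed as "FULL/  -"); the header row has 8 words.
        if len(fields) not in (6, 7):
            continue
        router_id, state_role, interface = fields[0], fields[2], fields[-1]
        state, _, role = state_role.partition("/")
        if state == "FULL":
            continue
        if state == "2WAY" and role == "DROTHER":
            continue  # normal between two non-DR routers on a broadcast segment
        flagged.append((router_id, state, interface))
    return flagged


for neighbor in suspicious_neighbors(SAMPLE_OUTPUT):
    print(neighbor)
```

A neighbor that is briefly in EXSTART or INIT is not a problem by itself; it only becomes one when the state does not change over repeated checks.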

1. OSPF neighbor table does not display adjoining router :

This is a major problem in an OSPF network. If the neighbor table does not display the adjoining router, it means that Hello packets are either not being exchanged or are being blocked or dropped between the two routers. There could be various reasons behind this behavior.
It could be a Layer 1/2 problem or a configuration mistake.

Some reasons why an OSPF neighbor table does not display the adjoining router as its neighbor are:

  1. OSPF is not enabled on the router
  2. OSPF is not enabled on the interface
  3. OSPF interface is down. Layer 1 /2 problem
  4. Area ID mismatch between the interfaces
  5. Subnet mask mismatch between the interfaces
  6. Hello and Dead timer configured on the routers do not match
  7. OSPF authentication is enabled on one router and disabled on another
  8. OSPF authentication-mode configured on both routers do not match
  9. OSPF authentication-key configured on both routers do not match
  10. OSPF interface is configured as silent-interface
  11. ACL is blocking OSPF traffic
  12. Stub/NSSA flag is set on one router and not set on another router
  13. Same Router ID configured on both routers
  14. Different network-type configured under interfaces
  15. Neighbor command is not configured on the remote router on a non-broadcast (NBMA) network
  16. OSPF is enabled only on a secondary address; the neighborship is built only on the primary address
  17. Corrupted OSPF packets are being received
  18. The interface is configured as a passive interface under “router ospf”
  19. A virtual link is configured across a stub area, which is not supported

2. OSPF stuck in INIT (one way hello) 

  1. Multicast is broken, or there is a Layer 2 problem.
  2. An access list is blocking the OSPF multicast addresses.
  3. OSPF hello packets are being NAT-translated.
  4. Layer 1/2 is broken:
    1. Unplugged cable
    2. Loose cable
    3. Bad cable
    4. Bad transceiver
    5. Bad port
    6. Bad interface card
    7. Layer 2 problem at telco in case of a WAN link
    8. Missing clock statement in case of back-to-back serial connection

3. OSPF stuck in 2-WAY

  1. This is normal between two non-DR/BDR routers on an Ethernet broadcast segment.
  2. Layer 2 is broken.
  3. All routers are configured with priority 0, so no DR/BDR election takes place.

4. OSPF stuck in EXSTART/EXCHANGE

  1. MTU mismatch between the neighbors.
  2. Duplicate router ID between the routers.
  3. Packet loss can also cause the adjacency to get stuck.
  4. An access list is blocking unicast communication between the routers.

5. OSPF stuck in LOADING

  1. The neighbor is sending bad or corrupted packets, for example because of a memory problem.
  2. The neighbor is ignoring the Link State Request packets instead of answering them.

OSPF Neighbor Issues:

OSPF calculates the shortest path for data from the information it receives from neighboring routers. The relationship with those neighbors is therefore essential: routes can only be calculated from information that one router has received from another.

However, if there is some problem with the connection between two devices, OSPF will not be able to identify the shortest path. This could lead to delays in the transfer of data and reduced speed of network.

In order to troubleshoot this problem, you need to make sure that all of the requirements which are responsible for the establishment of a connection between two routers are met.

Firstly, you must make sure that both of the OSPF devices such as routers are on the same subnet. This could be done by checking their subnet mask to see if it is same for both.

In addition to this, both devices must be in the same area to be able to form a neighbor relationship. Most importantly, you should check that each OSPF device has a unique Router ID. These IDs identify each router separately in the network. Once all of these conditions have been checked and any problems resolved, OSPF will start to function normally again.

OSPF Routing Table Issues

OSPF populates the routing table with the shortest paths it has calculated for data to travel. To do so, it relies on information about each router, the cost between routers, and how the routers are connected to one another. However, sometimes routes go missing from the routing table. The missing routes can be external as well as internal routes. Under such conditions, OSPF is not able to function properly.

In order to eliminate this issue, first identify which routes are missing from the routing table. If all of the OSPF routes are missing, the problem is critical, and you will have to carry out a full adjacency check. If only the external routes are missing (routes that were redistributed from another routing process), you need to carry out an external LSA check. If the summary routes (routes that originate from another area) or the NSSA routes are missing, check the corresponding summary or NSSA LSAs in the same way. It is crucial that the routing table is repaired in the shortest possible time so that the network can get back to normal and OSPF can start working properly again.

OSPF INIT state issue

INIT state means that hellos are getting through in one direction only: a router receives OSPF “hellos” from its neighbor, but does not see itself listed in them, so two-way communication is never established and OSPF cannot do its job. If this problem arises, first check whether OSPF authentication is in use on both devices with the “show ip ospf interface” command. If both sides use the same authentication type, check that they also use the same authentication key. If the authentication type differs, correct it so that both sides match. If authentication is not the cause, check that the physical cabling is sound and that the switch settings are correct. Resolve any issue you find at each step before moving on to the next one.

ACL Issues

The primary purpose of an Access Control List (ACL) is to filter traffic as it passes through a router. However, an ACL that does not permit OSPF packets will interfere with OSPF and prevent it from working properly. Hence, you need to check whether any of the routers are configured with ACLs. You can check this with the command “show ip interface”. If an ACL is applied on any of the interfaces involved, temporarily remove it to see whether OSPF starts working again. If it does, rewrite the ACL so that it permits OSPF traffic, then apply it again.

OSPF PACKET TYPES

 There are five different types of OSPF packet:

  1. Hello
  2. Database Description
  3. Link State Request
  4. Link State Update
  5. Link State Acknowledgment

Hello

I will start with the OSPF hello packet, as this is the first packet that is sent as soon as we enable OSPF on an interface.

The Hello protocol serves several purposes:

  • It is the means by which neighbors are discovered.
  • It advertises several parameters on which two routers must agree before they can become neighbors.
  • Hello packets act as keepalives between neighbors.
  • It ensures bi-directional communication between neighbors.
  • It elects Designated Routers (DRs) and Backup Designated Routers (BDRs) on Broadcast and Nonbroadcast Multiaccess (NBMA) networks.

OSPF-speaking routers periodically send a Hello packet out each OSPF-enabled interface. This period is known as the HelloInterval and is configured on a per-interface basis. The commonly used defaults are a hello every 10 seconds on broadcast and point-to-point networks and every 30 seconds on nonbroadcast multiaccess (NBMA) networks, with a dead interval of four times the hello interval. These are defaults rather than fixed values, so always confirm the timers that are actually configured on both neighbors.

Packet structure of hello packet


Network Mask: is the address mask of the interface from which the packet was sent. If this mask does not match the mask of the interface on which the packet is received, the packet will be dropped. This technique ensures that routers will become neighbors only if they agree on the exact address of their shared network.

Hello Interval: as discussed earlier, is the period, in seconds, between transmissions of Hello packets on the interface. If the sending and receiving routers don’t have the same value for this parameter, they will not establish a neighbor relationship.

Options: This field is included in the Hello packet to ensure that neighbors have compatible capabilities. A router may reject a neighbor because of a capabilities mismatch.

Router Priority: is used in the election of the DR and BDR. If set to zero, the originating router is ineligible to become the DR or BDR.

Dead Interval: is the number of seconds the originating router will wait for a Hello from a neighbor before declaring the neighbor dead. If a Hello is received in which this number does not match the RouterDeadInterval of the receiving interface, the packet will be dropped. This technique ensures that neighbors agree on this parameter.

Designated Router: is the IP address of the interface of the DR on the network (not its Router ID). During the DR election process, this may only be the originating router’s idea of the DR, not the finally elected DR. If there is no DR (because one has not been elected or because the network type does not require DRs), this field will be set to 0.0.0.0.

Backup DR: is the IP address of the interface of the BDR on the network. Again, during the DR election process, this may only be the originating router’s idea of the BDR. If there is no BDR, this field is set to 0.0.0.0.

Neighbor: is a recurring field that lists all neighbors on the network from which the originating router has received a valid Hello in the past RouterDeadInterval.
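To make the field layout concrete, here is a small Python sketch that builds and then decodes a Hello packet using the standard OSPFv2 layout: a 24-byte common header followed by the Hello fields described above. The addresses are sample values, the neighbor 1.0.0.1 is hypothetical, and the checksum is left at zero rather than calculated.

```python
import socket
import struct


def build_sample_hello():
    """Build an illustrative Hello packet (checksum left as 0, no authentication)."""
    aton = socket.inet_aton
    body = struct.pack("!4sHBBI4s4s",
                       aton("255.255.255.0"),  # network mask
                       10,                     # hello interval
                       0x02,                   # options (E bit set)
                       1,                      # router priority
                       40,                     # dead interval
                       aton("10.0.0.2"),       # designated router
                       aton("0.0.0.0"))        # backup designated router
    body += aton("1.0.0.1")                    # one neighbor (hypothetical)
    header = struct.pack("!BBH4s4sHH8s",
                         2, 1, 24 + len(body),  # version, type 1 = Hello, length
                         aton("1.0.0.2"), aton("0.0.0.0"),
                         0, 0, b"\x00" * 8)     # checksum, autype, auth data
    return header + body


def parse_hello(packet):
    """Decode the OSPFv2 header and the Hello body that follows it."""
    ntoa = socket.inet_ntoa
    version, ptype, length, rid, aid, _chk, autype = struct.unpack(
        "!BBH4s4sHH", packet[:16])
    # Bytes 16-23 hold the 8-byte authentication field; skipped here.
    mask, hello, options, priority, dead, dr, bdr = struct.unpack(
        "!4sHBBI4s4s", packet[24:44])
    neighbors = [ntoa(packet[i:i + 4]) for i in range(44, length, 4)]
    return {
        "version": version, "type": ptype, "length": length,
        "router_id": ntoa(rid), "area_id": ntoa(aid), "autype": autype,
        "network_mask": ntoa(mask), "hello_interval": hello,
        "options": options, "priority": priority, "dead_interval": dead,
        "dr": ntoa(dr), "bdr": ntoa(bdr), "neighbors": neighbors,
    }


for name, value in parse_hello(build_sample_hello()).items():
    print(f"{name}: {value}")
```

Note how the length works out: 24 bytes of header, 20 bytes of fixed Hello fields and 4 bytes per neighbor, so a Hello listing one neighbor is 48 bytes long.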

 Packet capture of HELLO packet


Let’s quickly analyze the packet capture of a hello packet received from a neighbor. OSPF hellos are sent to the multicast address 224.0.0.5, which maps to the multicast MAC address 01:00:5e:00:00:05 at Layer 2.

Moving into the OSPF header, the version is 2, which means it is running natively over IPv4. The message type is Hello Packet, the packet length is 48, the router ID is 1.0.0.2 and the area ID is 0. These are followed by the packet checksum and the authentication details. The password is visible in the capture because I am using cleartext authentication. If I had used MD5 authentication, the password would not be visible, because only a message digest is sent. The Hello body then carries the network mask, the hello and dead intervals (10 and 40), the DR and BDR, and the active neighbor.
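The multicast MAC address in the capture is not OSPF-specific. It follows the standard IPv4 multicast mapping, where the low 23 bits of the group address are copied behind the fixed prefix 01:00:5e. A short Python sketch of that mapping (the function name is my own):

```python
import ipaddress


def multicast_mac(group):
    """Map an IPv4 multicast group address to its Ethernet MAC address."""
    addr = ipaddress.IPv4Address(group)
    if not addr.is_multicast:
        raise ValueError(f"{group} is not a multicast address")
    low23 = int(addr) & 0x7FFFFF          # keep the low 23 bits of the group
    mac = (0x01005E << 24) | low23        # prepend the fixed 01:00:5e prefix
    return ":".join(f"{(mac >> shift) & 0xFF:02x}" for shift in range(40, -8, -8))


print(multicast_mac("224.0.0.5"))   # AllSPFRouters
print(multicast_mac("224.0.0.6"))   # AllDRouters
```

This is handy when writing capture filters on a switch or in a packet analyzer that only sees Layer 2.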

It makes more sense to show you a capture where the DR/BDR election is in progress.


Here you can see that the BDR field is empty and the DR field is 10.0.0.2. If there is no DR or BDR, an election is held in which the router with the highest priority becomes the BDR. If more than one router has the same priority, the one with the numerically highest Router ID wins. If there is no active DR, the BDR is promoted to DR and a new election is held for the BDR.
It should be noted that the priority can influence an election, but will not override an active DR or BDR. That is, if a router with a higher priority becomes active after a DR and BDR have been elected, the new router will not replace either of them.
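The election rules described above, including the lack of preemption, can be sketched in Python as follows. This is a simplified model of the behavior as explained in this post, not the full election algorithm from the OSPF specification, and the router IDs and priorities are sample values.

```python
import ipaddress


def rank(router):
    """Sort key: priority first, then numerically highest Router ID."""
    priority, router_id = router
    return (priority, int(ipaddress.IPv4Address(router_id)))


def elect_dr_bdr(routers, dr=None, bdr=None):
    """Return (dr, bdr) for a segment.

    routers: list of (priority, router_id) for the routers currently active.
    dr, bdr: Router IDs of the currently elected DR and BDR, if any.
    """
    active_ids = {router_id for _, router_id in routers}
    eligible = [r for r in routers if r[0] > 0]   # priority 0 never takes part
    if dr not in active_ids:
        dr = None                                  # the DR has gone away
    if bdr not in active_ids:
        bdr = None

    def best(exclude):
        candidates = [r for r in eligible if r[1] not in exclude]
        return max(candidates, key=rank)[1] if candidates else None

    if bdr is None:
        bdr = best({dr})
    if dr is None:
        dr = bdr                                   # the BDR is promoted to DR
        bdr = best({dr})                           # new election for the BDR
    return dr, bdr


segment = [(1, "1.1.1.1"), (1, "2.2.2.2"), (0, "3.3.3.3")]
print(elect_dr_bdr(segment))
# A high-priority router joins later: no preemption, nothing changes.
print(elect_dr_bdr(segment + [(255, "9.9.9.9")], dr="2.2.2.2", bdr="1.1.1.1"))
# The DR fails: the BDR takes over and the newcomer becomes BDR.
remaining = [(1, "1.1.1.1"), (0, "3.3.3.3"), (255, "9.9.9.9")]
print(elect_dr_bdr(remaining, dr="2.2.2.2", bdr="1.1.1.1"))
```

The second call is the one to remember when troubleshooting: setting a high priority on a router does nothing until the current DR or BDR goes away.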

Database Description

 The Database Description (DD) packet is used when an adjacency is being established. The primary purpose of the DD packet is to describe some or all of the LSAs in the originator’s database so that the receiver can determine whether it has a matching LSA in its own database. This is done by listing only the headers of the LSAs. Because multiple DD packets may be exchanged during this process, flags are included for managing the exchange via a master/slave polling relationship.


Interface MTU: Interface MTU is the size, in octets, of the largest IP packet that can be sent out the originator’s interface without fragmentation. This field will be set to 0x0000 when the packet is sent over virtual links.

Options: The field is included in the Database Description packet so that a router may choose not to forward certain LSAs to a neighbor that doesn’t support the necessary capabilities.

I-bit: Initial bit, is set to 1 when the packet is the initial packet in a series of DD packets. Subsequent DD packets will have I-bit = 0.

M-bit: More bit, is set to 1 to indicate that the packet is not the last in a series of DD packets. The last DD packet will have M-bit = 0.

MS-bit: Master/Slave bit, is set to 1 to indicate that the originator is the master (that is, is in control of the polling process) during a database synchronization. The slave will have MS-bit = 0.

DD Sequence Number: It ensures that the full sequence of DD packets are received in the database synchronization process. The sequence number will be set by the master to some unique value in the first DD packet, and the sequence will be incremented in subsequent packets.

LSA Headers: lists some or all of the headers of the LSAs in the originator’s link state database. The header contains enough information to uniquely identify the LSA and the particular instance of the LSA.

Packet capture of Database Description packet
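The I, M and MS bits described above share a single flags octet in the DD packet: in OSPFv2 they are the three low-order bits, with I = 0x04, M = 0x02 and MS = 0x01. A minimal Python sketch for decoding the value you see in a capture (the function name is my own):

```python
def decode_dd_flags(flags):
    """Decode the flags octet of a Database Description packet."""
    return {
        "I": bool(flags & 0x04),    # initial packet of the series
        "M": bool(flags & 0x02),    # more DD packets follow
        "MS": bool(flags & 0x01),   # sender is the master
    }


# 0x07 is what a router sends in its first DD packet in Exstart,
# where it starts out claiming to be the master.
print(decode_dd_flags(0x07))
# 0x00 is the last DD packet from the slave.
print(decode_dd_flags(0x00))
```

If two neighbors keep sending each other DD packets with all three bits set, they are stuck negotiating in Exstart, which is the situation covered in the EXSTART/EXCHANGE section earlier.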

Link State Request

 As Database Description packets are received during the database synchronization process, a router will take note of any listed LSAs that are not in its database or are more recent than its own LSA. These LSAs are recorded in the Link State Request list. The router will then send one or more Link State Request packets asking the neighbor for its copy of the LSA.


Link State Type is the LS type number, which identifies the LSA as a router LSA, network LSA, and so on.
Link State ID is a type-dependent field of the LSA header.
Advertising Router is the Router ID of the router which originated the LSA.
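The way a router builds its Link State Request list can be sketched in Python. Each LSA is identified by the three fields above, and the sketch decides "more recent" by sequence number only. That is a simplification on my part: a real implementation also compares checksum and age when the sequence numbers are equal. The class and function names are illustrative.

```python
from typing import NamedTuple


class LsaHeader(NamedTuple):
    ls_type: int
    link_state_id: str
    advertising_router: str
    sequence: int


def signed32(value):
    """LS sequence numbers are signed 32-bit values starting at 0x80000001."""
    return value - (1 << 32) if value & 0x80000000 else value


def build_request_list(local_db, received_headers):
    """Return the LSA keys to ask the neighbor for.

    local_db maps (ls_type, link_state_id, advertising_router) to the
    sequence number of the local copy.
    """
    requests = []
    for hdr in received_headers:
        key = (hdr.ls_type, hdr.link_state_id, hdr.advertising_router)
        local_seq = local_db.get(key)
        if local_seq is None or signed32(hdr.sequence) > signed32(local_seq):
            requests.append(key)
    return requests


local_db = {
    (1, "1.1.1.1", "1.1.1.1"): 0x80000003,
    (1, "2.2.2.2", "2.2.2.2"): 0x80000002,
}
from_neighbor = [
    LsaHeader(1, "1.1.1.1", "1.1.1.1", 0x80000003),   # same instance
    LsaHeader(1, "2.2.2.2", "2.2.2.2", 0x80000005),   # neighbor has a newer copy
    LsaHeader(2, "10.0.0.2", "2.2.2.2", 0x80000001),  # not in the local database
]
print(build_request_list(local_db, from_neighbor))
```

Only the second and third LSAs end up on the request list; the first one is already up to date locally.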

Packet capture of Link State Request:


Link State Update

 The Link State Update packet is used in the flooding of LSAs and to send LSAs in response to Link State Requests. Recall that OSPF packets do not leave the network on which they were originated. Consequently, a Link State Update packet, carrying one or many LSAs, carries those LSAs only one hop further from their originating router. The receiving neighbor is responsible for re-encapsulating the appropriate LSAs in new LS Update packets for further flooding.


Number of LSAs specifies the number of LSAs included in this packet.

LSAs are the full LSAs as described in OSPF LSA formats. Each update may carry multiple LSAs, up to the maximum packet size allowed on the link.

Packet capture of Link State Update packet


Link State Update is a very important packet type when troubleshooting. A couple of things to remember: flooded LSUs normally arrive on the multicast address 224.0.0.5, but because this LSU answers a Link State Request asking for an LSA, R1 receives it as a unicast. In the capture you will also see the number of LSAs and the links described in each LSA.

Link State Acknowledgment

 Link State Acknowledgment packets are used to make the flooding of LSAs reliable. Each LSA received by a router from a neighbor must be explicitly acknowledged in a Link State Acknowledgment packet.


Packet capture of Link State Acknowledgment packet


  • OSPF uses IP protocol ID 89 for all its packets.
  • If we use debug ip ospf packet we can look at the OSPF packets on our router. Let’s look at the different fields we have:
  • V:2 stands for OSPF version 2. If you are running IPv6 you’ll see version 3.
  • T:1 stands for OSPF packet type 1, which is a hello packet. The five packet types are listed earlier in this post.
  • L:48 is the packet length in bytes. This hello packet is 48 bytes long.
  • RID 1.1.1.1 is the Router ID.
  • AID is the area ID in dotted decimal. You can write the area in decimal (area 0) or dotted decimal (area 0.0.0.0).
  • CHK 4D40 is the checksum of this OSPF packet so we can check if the packet is corrupt or not.
  • AUT:0 is the authentication type. You have 3 options:
    • 0 = no authentication
    • 1 = clear text
    • 2 = MD5
  • AUK: If you enable authentication you’ll see some information here.
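If you collect a lot of this debug output, a few lines of Python can turn each line into named fields. The sample line below is only modelled on the fields described above; the exact formatting of the debug output may differ between IOS versions, so adjust the pattern to what your router actually prints.

```python
import re

PACKET_TYPES = {
    1: "Hello",
    2: "Database Description",
    3: "Link State Request",
    4: "Link State Update",
    5: "Link State Acknowledgment",
}
AUTH_TYPES = {0: "none", 1: "clear text", 2: "MD5"}

# Matches "name:value" pairs such as "v:2" or "rid:1.1.1.1".
FIELD_RE = re.compile(r"\b(v|t|l|rid|aid|chk|aut|auk):(\S*)")

# Illustrative line, not copied from a real router.
SAMPLE_LINE = ("OSPF: rcv. v:2 t:1 l:48 rid:1.1.1.1 aid:0.0.0.0 "
               "chk:4D40 aut:0 auk: from FastEthernet0/0")


def decode_debug_line(line):
    """Turn one debug line into a dictionary of readable fields."""
    fields = dict(FIELD_RE.findall(line))
    return {
        "version": int(fields["v"]),
        "packet_type": PACKET_TYPES.get(int(fields["t"]), "unknown"),
        "length": int(fields["l"]),
        "router_id": fields["rid"],
        "area_id": fields["aid"],
        "checksum": fields["chk"],
        "authentication": AUTH_TYPES.get(int(fields["aut"]), "unknown"),
    }


print(decode_debug_line(SAMPLE_LINE))
```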

 Cisco Commands to use for troubleshooting:

Each entry below gives the purpose, or the suspected reason for the neighbor adjacency problem, followed by the command used to diagnose it.

Purpose: To view OSPF information including:
  • The process ID
  • The local router ID and its role (such as ABR or ASBR)
  • Configured areas
Command: show ip ospf

Purpose: To view interfaces that are running OSPF, including the following information:
  • Interface status and IP address assigned to the interface
  • Area number
  • Process ID
  • Router ID
  • The router ID and IP address of the DR and BDR on the network
  • Hello and dead timer settings
  • Adjacent routers
Command: show ip ospf interface

Purpose: To view information about neighbor OSPF routers including:
  • Router ID of the neighbor router
  • Neighbor state or status (the Full state indicates that the link-state databases are synchronized and the adjacency is complete)
  • The role of the neighbor (DR, BDR, DROTHER)
  • Time remaining before the neighbor is declared dead if a hello packet is not received
  • The IP address of the neighbor
  • The local interface used to reach the neighbor
Command: show ip ospf neighbor

Purpose: To view OSPF configuration information such as:
  • The OSPF process ID
  • The OSPF router ID for the current router
  • Configured networks and areas for the process
  • IP addresses of neighbor routers
Command: show ip protocols

Problem: MTU mismatch between neighboring interfaces.
Command: show interface <int-type><int-num>

Problem: OSPF area type is stub on one neighbor, but the adjoining neighbor in the same area is not configured for stub.
Command: show ip ospf interface

Problem: OSPF neighbors have duplicate Router IDs.
Command: show ip ospf interface

Problem: OSPF is configured on the secondary network of the neighbor, but not on the primary network. This is an illegal configuration which prevents OSPF from being enabled on the interface.
Command: show ip ospf interface

Problem: OSPF hellos are not processed due to a lack of resources, such as high CPU utilization or not enough memory.
Commands: show memory summary, show memory processor

Problem: An underlying Layer 1/2 problem prevents OSPF hellos from being received.
Command: show interface

Purpose: To view debugging information about hello exchanges, DR selection, SPF calculation, and errors related to negotiating adjacency.
  • Use debug ip ospf hello to view only hello packet information.
  • Use debug ip ospf adj to view adjacency information.
Command: debug ip ospf events

Purpose: Displays information contained in each OSPF packet, such as area ID and router ID.
Command: debug ip ospf packet

Purpose: Shows the routing table entries for Area Border Routers (ABRs) and Autonomous System Boundary Routers (ASBRs).
Command: show ip ospf border-routers

Purpose: Shows the state of each adjacency and the neighbor router’s ID.
Command: show ip ospf neighbor

Purpose: Displays information for the given OSPF process, including the areas it is assigned to. Can also be used to see whether the router is an Area Border Router or Autonomous System Boundary Router.
Command: show ip ospf <process-id>

Purpose: Shows the router link states and network link states as maintained in the router’s database.
Command: show ip ospf database

 

Juniper Commands:
show ospf neighbor
show ospf neighbor extensive
clear ospf neighbor all
show ospf statistics
show ospf interface
show ospf interface extensive
show route protocol ospf
show ospf database
show ospf database router advertising-router <x.x.x.x>

 OSPF Facts:

  1. When an NSSA has more than one ABR, the ABR with the highest router ID translates Type 7 LSAs into Type 5 LSAs.
  2. A default route is not generated by default into an NSSA area unless “area <area-id> nssa default-information-originate” is configured.
  3. A totally stubby NSSA area generates the default route by default.
  4. DR/BDR election does not support preemption. If the DR fails, the BDR becomes the DR and a new BDR is elected. The old DR does not take the DR role back when it returns, even if it has the highest priority.
  5. With “ip ospf priority 0” a router does not participate in the DR/BDR election.
  6. OSPF behaves like a distance vector protocol between areas when multiple areas are in use.
  7. The router with the highest priority becomes the DR/BDR; the highest router ID breaks a tie.
  8. OSPF hellos are always sent from the primary IP address of the interface.

Most error messages shown in the debug output adequately describe the nature of the problem.

Shown below are some errors that display with the debug ip ospf events command:

Error:
  OSPF: mismatched hello parameters from 10.0.0.1
  OSPF: Dead R 20 C 40, Hello R 5 C 5
  Mask R 255.255.255.0 C 255.255.255.0
Meaning: Hello timer, dead timer, or subnet mask mismatch detected. In this example, the dead timer intervals do not match: R (received) = 20, C (configured) = 40.

Error:
  OSPF: hello packet with mismatched E bit
Meaning: Area types (not area numbers) configured on each router do not match. The E bit is also called the stub area flag.

Error:
  Neighbor Down: Dead timer expired
Meaning: An expected hello packet has not been received. When the dead timer reaches 0, it is assumed that the neighbor router has gone down. The dead timer resets itself each time a hello packet is received.
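The R/C pairs in the first error are easy to misread when you are scrolling through a busy debug log. A small Python sketch, using the exact text from the example above, picks out only the parameters that differ. The function name is my own.

```python
import re

# Matches pairs such as "Dead R 20 C 40" or "Mask R 255.255.255.0 C 255.255.255.0".
PARAM_RE = re.compile(r"(Dead|Hello|Mask) R ([\d.]+) C ([\d.]+)")

DEBUG_TEXT = """OSPF: mismatched hello parameters from 10.0.0.1
OSPF: Dead R 20 C 40, Hello R 5 C 5
Mask R 255.255.255.0 C 255.255.255.0"""


def hello_mismatches(debug_text):
    """Return {parameter: (received, configured)} for values that differ."""
    return {
        name: (received, configured)
        for name, received, configured in PARAM_RE.findall(debug_text)
        if received != configured
    }


print(hello_mismatches(DEBUG_TEXT))
```

For the example above only the dead timer is reported, with 20 received from the neighbor and 40 configured locally.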


 
