OSPF convergence - WAN links

rganascim · December 17, 2010, 1:33pm

Hi all,

I have a network with a lot of FastEthernet WAN connections (some
metro-ethernet), using OSPF as the IGP. Today, the OSPF timers are at their
defaults (hello 10s, dead 40s, SPF initial timer 5s, etc). When a link goes
down, convergence takes ~45s (which is expected with these timers).
There are a lot of documents explaining how to tune OSPF convergence time,
but for LAN environments. I didn't find any references about OSPF tuning
on WAN Ethernet links (just serial, frame-relay, etc) or the things related
to it (such as packet loss, RTT, links that never lose carrier, etc).

I think that, if the timers are aggressive, any flap on the ISP network can
cause a re-convergence... but if the timers are high, the convergence time
after a link failure is high too.

What factors are you considering when tuning these OSPF timers on this type
of link? What technologies are you using (such as Fast Hellos, incremental
SPF, etc)?

Thanks,

Rafael

Jeff_Saxe · December 17, 2010, 1:59pm

If your routers support Bidirectional Forwarding Detection (BFD), then I would suggest using that. It doesn't actually modify the hello timers or any other timers of any protocol; it merely acts as a supplementary protocol running under (or alongside, I guess) the main routing protocol, and its specialty is detecting failure along MAN circuits, virtual circuits, Ethernet VLANs, and other kinds of circuits in which you don't actually see a Link Down when the circuit is interrupted.

http://en.wikipedia.org/wiki/Bidirectional_Forwarding_Detection
http://www.cisco.com/en/US/docs/ios/12_0s/feature/guide/fs_bfd.html

The nice thing about BFD is that it's optional on a per-interface and per-peer basis, so you don't have to tune the hellos aggressively for the other 10 peers who are on simple, local, or serial links just to get quicker failure detection for that one guy who's off on a virtual circuit. So in your case, on a Cisco for example, it might be as simple as

interface Gig0/0/2
 bfd interval 800 min_rx 50 multiplier 7
 ip ospf bfd

(You'd have to do this on both sides of the link.) Now when the OSPF neighbor is established, the router will also try to establish a back-and-forth-packet BFD session to the same OSPF neighbor, and if it does successfully establish it, it will keep sending packets every 800 milliseconds. If the BFD packets stop coming for 800ms*7=5.6 seconds, BFD will inform OSPF that the neighbor is down, even if OSPF hasn't "naturally" discovered that yet. Use "show bfd neighbors" to see whether it's working. You can tune BFD very aggressively if you want... 50ms intervals and a multiplier of maybe 5 or 3 I think, so you can detect a failure in 250ms or less, if the application is that sensitive.
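
To make the timer arithmetic concrete, here is a small Python sketch of how the detection time works out. It assumes BFD asynchronous mode as described in RFC 5880, where the local detection time is the peer's multiplier times the larger of the local min_rx and the peer's transmit interval. The function name bfd_detection_time_ms is made up for this illustration and is not part of any router or library API.

```python
def bfd_detection_time_ms(local_min_rx_ms, remote_tx_ms, remote_multiplier):
    """Approximate BFD detection time seen by the local router (async mode).

    The peer sends at the slower of its own transmit interval and the
    rate we say we can receive at, so the effective interval is the max.
    """
    effective_interval_ms = max(local_min_rx_ms, remote_tx_ms)
    return remote_multiplier * effective_interval_ms


# Both sides configured as above: interval 800, min_rx 50, multiplier 7
print(bfd_detection_time_ms(50, 800, 7))   # 5600 ms = 5.6 s

# Aggressive tuning: 50 ms intervals with a multiplier of 3 or 5
print(bfd_detection_time_ms(50, 50, 3))    # 150 ms
print(bfd_detection_time_ms(50, 50, 5))    # 250 ms
```

Note that with interval 800 and min_rx 50 on both sides, the 800 ms transmit interval is what governs, since each side only sends as fast as its own interval allows.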

In my experience it works great and does exactly what it's designed for. I use it for BGP peers within my AS and with some customers.

-- Jeff Saxe
Blue Ridge InternetWorks
Charlottesville, VA