We're having a discussion about changing BGP timers rather than using BFD, and I'd like to ask for your operational experiences with this.
We have downstream BGP customers physically attaching to an L2/L3 switch that doesn't do BGP, so we logically pipe them through MPLS to a router that can terminate the BGP session. The logical pipe never goes down, so in the event of a physical layer problem the only thing that would bring the customer's session down is the BGP hold timer expiring.
This is not acceptable, so I have been using BFD to time out the BGP session. However, we have limitations on the BFD pps, and folks here want to change the BGP timers instead.
What're your experiences regarding this?
I've had no problems with it. We also have routers attached to Ethernet (both our own switches and external Layer 1 or Layer 2 Ethernet private circuits), and we had similar problems with uncomfortably long time-to-detection. Our routers were too old to run BFD, and I'm not sure how likely an outside provider would be to agree to run BFD with us, so I just configured the BGP timers to be much smaller. I chose what I believe was the minimum on our Cisco equipment at the time: keepalives every 10 seconds and die after 30 seconds. I have had no ill effects at all (no spurious BGP down/ups in the middle of the night), and it has actually shortened the detection time in one or maybe two unexpected failures, so I'd call it a success.
router bgp 22070
 timers bgp 10 30
This is global to the BGP process (i.e., it becomes the default for all neighbors), but there also appears to be a "neighbor x.x.x.x timers" command that can tweak it per neighbor. Note that you have to make the timer change in a maintenance window: BGP timers are negotiated between peering routers at the start of the BGP session, so changing the values might result in closing and reestablishing all of those sessions. Also note that a peer can set a minimum hold time that it will accept from you, so if you would prefer the session to die after 30 seconds but one of your peers says that's too short, I guess it's possible that the BGP session would try to come up and fail, over and over. None of our external peers objected when we set ourselves to 10 and 30.
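For reference, a per-neighbor version might look something like the sketch below. The neighbor address 192.0.2.1 and remote AS 64500 are placeholders taken from the documentation ranges, not real peers of ours, and I have not checked the exact syntax against every IOS release, so treat it as an illustration rather than a tested config.

```
router bgp 22070
 ! Hypothetical peer: 192.0.2.1 and AS 64500 are documentation placeholders
 neighbor 192.0.2.1 remote-as 64500
 ! Keepalive every 10 seconds, hold time 30 seconds, for this neighbor only
 neighbor 192.0.2.1 timers 10 30
```

As far as I understand it, the hold time actually used on a session is the lower of the two values the peers advertise in their OPEN messages, and the new values apply only the next time that session is established.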
We do have more modern routers now, so maybe I should get off my behind and try BFD. I'm probably behind the curve here.
-- Jeff Saxe, Network Engineer
Blue Ridge InternetWorks, Charlottesville, VA
CCIE # 9376
434-817-0707 ext. 2024 (work) / 434-882-3508 (cell) / JSaxe@briworks.com