Route table growth and hardware limits...talk to the filter

Date: Tue, 11 Sep 2007 10:34:20 -0500
From: "Church, Charles" <cchurc05@harris.com>
Sender: owner-nanog@merit.edu

I'm willing to bet that Cisco's sales team is pushing back a little on
this enhancement. Given the choice between prolonging the usefulness of
Sup2 for another year or two versus selling a massive amount of new
Sup720-3BXLs, I'm betting they'll do the latter.

Yes, but higher management should understand that when customers feel
that they are being forced to buy expensive hardware because a software
"fix" is deemed bad for sales, they are likely to buy some other
company's products.

I was playing with a sup2 adding in extra routes to the point that it ran out of memory. Unfortunately, it didn't just drop routes like I thought it would. CEF disabled itself as well, which on a busy box would be a disaster.

Is this what people expect will happen in a few months to people using sup2s? Or am I missing something else?

-Matt

That's not good. What software version was it running?

While it is not good, the alternative approach would leave an indeterminate
routing table in hardware. Would you like the packets to go to randomized
directions?

The box is trying to do the right thing by turning off CEF and switching
everything in software since in this case software is the only entity in the
system with a consistent FIB.

An alternate would be to use the hardware forwarding tables as a limited
size cache (similar but not exactly as in the 7000 router). I am sure that
this is a large software effort and whether the hardware can support this is
questionable.
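
The cache idea above can be sketched roughly as follows. This is only an illustration, not how any Cisco platform implements it: the class name `ForwardingCache`, the per-destination keying, and the LRU eviction policy are all assumptions made for the example. The full FIB lives in "software" and a small fixed-size table stands in for the hardware.

```python
from collections import OrderedDict
import ipaddress


class ForwardingCache:
    """Small LRU cache in front of a full software FIB (illustrative only)."""

    def __init__(self, full_fib, capacity):
        # full_fib: dict of prefix string -> next hop (the complete table)
        self.full_fib = {ipaddress.ip_network(p): nh for p, nh in full_fib.items()}
        self.capacity = capacity
        self.cache = OrderedDict()  # destination address -> next hop
        self.misses = 0

    def _slow_lookup(self, addr):
        # Longest-prefix match against the full table (the "software" path).
        matches = [n for n in self.full_fib if addr in n]
        if not matches:
            return None
        return self.full_fib[max(matches, key=lambda n: n.prefixlen)]

    def forward(self, dst):
        addr = ipaddress.ip_address(dst)
        if addr in self.cache:
            self.cache.move_to_end(addr)  # mark as most recently used
            return self.cache[addr]
        self.misses += 1
        next_hop = self._slow_lookup(addr)
        self.cache[addr] = next_hop
        if len(self.cache) > self.capacity:
            self.cache.popitem(last=False)  # evict least recently used entry
        return next_hop


# Hypothetical two-route table and a two-entry cache.
fc = ForwardingCache({"10.0.0.0/8": "A", "10.1.0.0/16": "B"}, capacity=2)
results = [fc.forward(d) for d in
           ["10.1.0.1", "10.1.0.1", "10.2.0.1", "10.3.0.1", "10.1.0.1"]]
```

Every miss costs a software lookup, so a cache like this only helps when traffic concentrates on a small set of destinations.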

SUP2 was a great RP with a really long life, but maybe it is time to move on
to a SUP720 with the large table option and then grab a cold one ;-)

Cheers,

Bora

I was playing with a sup2 adding in extra routes to the point that it ran out
of memory. Unfortunately, it didn't just drop routes like I thought it would.
CEF disabled itself as well, which on a busy box would be a disaster.

Is this what people expect will happen in a few months to people using sup2s?
Or am I missing something else?

That's not good. What software version was it running?

While it is not good, the alternative approach would leave an indeterminate
routing table in hardware. Would you like the packets to go to randomized
directions?

No, but someone previously posted that with later software versions, when TCAM runs out, packets for those routes that fit in TCAM are hardware switched, and only traffic for the remaining routes that didn't fit is software switched. That could potentially go unnoticed for some time, while software switching all traffic is likely to be impossible on many installations. I kind of doubt the MSFC2 can software switch gigabits/s of traffic (or anything close to gigabits/s).
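
To make the difference between the two overflow behaviours concrete, here is a toy model. The function name `switching_paths` and the rule that routes are installed in list order are assumptions for illustration; the thread does not say which routes actually end up in TCAM on a real box.

```python
def switching_paths(routes, tcam_capacity, partial_fallback):
    """Map each route to 'hardware' or 'software' for a given TCAM size.

    partial_fallback=False models the behaviour seen in the old test
    (overflow pushes everything to software). True models the behaviour
    described for later releases. Install order is an assumption.
    """
    if len(routes) <= tcam_capacity:
        return {r: "hardware" for r in routes}
    if not partial_fallback:
        return {r: "software" for r in routes}
    return {
        r: "hardware" if i < tcam_capacity else "software"
        for i, r in enumerate(routes)
    }


# Hypothetical four-route table on a box with room for three.
routes = ["10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16", "198.51.100.0/24"]
old = switching_paths(routes, tcam_capacity=3, partial_fallback=False)
new = switching_paths(routes, tcam_capacity=3, partial_fallback=True)
```

In the first case all four routes are software switched. In the second only the one that did not fit is.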

SUP2 was a great RP with a really long life, but maybe it is time to move on
to a SUP720 with the large table option and then grab a cold one ;-)

Or start filtering some of the twit networks that totally deagg their CIDRs. I see a game of internet chicken in the near future...only some of the players don't realize they're playing.
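
As a rough sketch of how one might spot deaggregation candidates, the snippet below lists prefixes that are already covered by a less specific prefix in the same table. The helper name `covered_more_specifics` is made up for this example, and a real policy would presumably also look at origin AS and traffic engineering intent, which this ignores.

```python
import ipaddress


def covered_more_specifics(prefixes):
    """Return prefixes that sit inside a shorter prefix from the same list."""
    nets = sorted((ipaddress.ip_network(p) for p in prefixes),
                  key=lambda n: n.prefixlen)
    covered = []
    for i, net in enumerate(nets):
        # Only prefixes earlier in the list (shorter or equal length) can cover net.
        if any(net != other and net.subnet_of(other) for other in nets[:i]):
            covered.append(net)
    return covered


# Hypothetical table: one aggregate, two more-specifics, one unrelated /24.
table = ["10.0.0.0/16", "10.0.0.0/24", "10.0.1.0/24", "192.0.2.0/24"]
candidates = [str(n) for n in covered_more_specifics(table)]
```

Dropping the covered prefixes changes no reachability here, because the /16 still carries the traffic.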

That is what I thought as well, but I'm afraid the only MSFC2s I have are
attached to my SUP32s in production. And so destructive testing like this
would be, you know, bad. But it would be really good to know what to
expect. One possibility means that we might see occasional CPU spikes. The
other possibility means the box will start sucking more than this year's
factory Honda team. I suppose I could ask Cisco, but if anyone else has
done any testing it'd be good to hear about it...

Meanwhile, I have brought myself to three options:

1. Upgrade to RSP720-3CXL (same price, more memory, faster CPU compared to
SUP720-3BXL) + 6148s
2. Cisco 7304 + pair of 3750s
3. Juniper M7i + pair of 3750s.

Even with need for the 6148s it's still cheaper for me to keep my 7604s
although not by too much. Now to get in to the nitty gritty. Imagine my
shock when my Cisco rep said he's been having a lot of these conversations
lately...

No, but someone previously posted that with later software versions, when
TCAM runs out, packets for those routes that fit in TCAM are hardware
switched, and only traffic for the remaining routes that didn't fit is
software switched.

that would have been me. and the comments on the logic still stand (as
correct) provided you are running the appropriately non-ancient release of
code.

cheers,

lincoln (ltd@cisco.com)

Do you know in what version this behavior was changed? At the very least, people are going to want to upgrade IOS, as it'll likely mean the difference between slightly increased MSFC CPU and a switch that can't cope.

..

Has the option of using default route(s) occurred to you?

Jon Lewis wrote:

Do you know in what version this behavior was changed? At the very least, people are going to want to upgrade IOS, as it'll likely mean the difference between slightly increased MSFC CPU and a switch that can't cope.

I looked at the IOS version I ran the test on; it was quite old (12.1). I am happy to rerun the test with other versions of IOS. Any suggestions?

-Matt

Jon Lewis wrote:
> Do you know in what version this behavior was changed? At the very
> least, people are going to want to upgrade IOS, as it'll likely mean the
> difference between slightly increased MSFC CPU and a switch that can't
> cope.

I looked at the IOS version I ran the test on; it was quite old (12.1). I
am happy to rerun the test with other versions of IOS. Any suggestions?

i'd suggest the most recent 12.2(18)SXF (it's 12.2(18)SXF11) or 12.2(33)SXH.
i think you'll find you get different results than your original test.

cheers,

lincoln. (ltd@cisco.com)

Meanwhile, I have brought myself to three options:

Has the option of using default route(s) occurred to you?

welcome to v6. we forgot to sort out routing, so just don't do it.
you're kidding, right?

randy

No, I'm not kidding but maybe we're talking about a different thing (you may have a more generalized network in mind).

The way I see it, a network which is considering "Juniper M7i or Cisco 7300 plus a couple of switches" as an option does not _need_ 220K IPv4 routes in its routing table. Whether it has 150K, 40K (Hi Simon!) or 5K shouldn't matter that much from the functionality perspective.

If we still disagree, it might be interesting to hear why filtered BGP feeds from upstream and appropriately placed default routes to cover the holes wouldn't provide a functionally and operationally equivalent solution?

Pekka Savola wrote:

it might be interesting to hear why filtered BGP feeds from upstream
and appropriately placed default routes to cover the holes wouldn't
provide a functionally and operationally equivalent solution?

as we scale, if the vendors can't maintain simple/promised functionality
they seem to ask the customers to add complexity and hence unreliability
and ops cost or upgrade and upgrade and upgrade on the "we're working on
that for next year" path.

randy

Hello Pekka:

Meanwhile, I have brought myself to three options:

Has the option of using default route(s) occurred to you?

welcome to v6. we forgot to sort out routing, so just don't do it.
you're kidding, right?

No, I'm not kidding but maybe we're talking about a different thing (you may have a more generalized network in mind).

The way I see it, a network which is considering "Juniper M7i or Cisco 7300 plus a couple of switches" as an option does not _need_ 220K IPv4 routes in its routing table. Whether it has 150K, 40K (Hi Simon!) or 5K shouldn't matter that much from the functionality perspective.

If we still disagree, it might be interesting to hear why filtered BGP feeds from upstream and appropriately placed default routes to cover the holes wouldn't provide a functionally and operationally equivalent solution?

Well, how do you determine which routes to select from each provider and what to cover with defaults? How do you modify those settings once they're in place, particularly when you find exceptions in your design? I know the answers, but these are not easy questions to answer if you are a small provider that is smart enough to have multiple transit providers and enough clue to configure .* and ^$, but not enough clue to filter based upon upstream provider communities, flows and/or other dynamic means.

The whole point of BGP, to my mind, is so that I *can* accept full routes from multiple providers and *may* elect to change that behavior for other reasons. I shouldn't have to modify my BGP configuration to support my vendors' inability to provide a device that can scale to the present demands of the global routing table. Last time I checked, they are here to support me, not the other way around.

Regards,

Mike

Well, how do you determine which routes to select from each provider and what to cover with defaults? How do you modify those settings once they're in place, particularly when you find exceptions in your design? I know the answers, but these are not easy questions to answer if you are a small provider that is smart enough to have multiple transit providers and enough clue to configure .* and ^$, but not enough clue to filter based upon upstream provider communities, flows and/or other dynamic means.

One approach is to accept everything up to some prefix length, e.g., /16, /20, /21 or whatever, and filter the rest -- and point the primary default route to a non-tier1 so that it should _always_ have connectivity to all the world.
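
A minimal sketch of the prefix-length cut described above, using Python's standard `ipaddress` module. The function name `filter_by_prefix_length`, the /21 cutoff, and the sample feed are illustrative assumptions, not anyone's production policy.

```python
import ipaddress


def filter_by_prefix_length(prefixes, max_len=21):
    """Keep only prefixes whose mask length is max_len or shorter."""
    kept = []
    for p in prefixes:
        net = ipaddress.ip_network(p, strict=True)
        if net.prefixlen <= max_len:
            kept.append(net)
    return kept


# Hypothetical sample feed (private and documentation prefixes, not real routes).
feed = ["10.0.0.0/8", "172.16.0.0/20", "192.0.2.0/24",
        "198.51.100.0/24", "10.1.0.0/21"]
accepted = [str(n) for n in filter_by_prefix_length(feed, max_len=21)]
```

The two /24s are dropped, so traffic to them would follow the default route instead.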

Now, I guess one question is, what do you do when the tierN upstream your default route points to has routing or forwarding broken in such a way that your packets get dropped? Answer: you fix it manually or just ignore it and get SLA credits. However, very probably the same problem (e.g., BGP works but forwarding is broken) would happen with full BGP feeds as well, so it's not like you're losing much.

(FWIW, I'm not sure such a small provider needs anything other than a default route and potentially the routes of networks directly attached to its upstreams, but filtered full feeds may be a more politically correct approach for network administrators.)

The whole point of BGP, to my mind, is so that I *can* accept full routes from multiple providers and *may* elect to change that behavior for other reasons. I shouldn't have to modify my BGP configuration to support my vendors' inability to provide a device that can scale to the present demands of the global routing table. Last time I checked, they are here to support me, not the other way around.

But the vendors aren't unable -- AFAIK, such devices have been available on the market for, what, 7-8 years now? It's just your wallet that's unable to get equipment that's needed to face the network that's getting more complex.

In this case your choices seem to be a) dig out more money and get a better router, b) complain to vendor so that they make their implementations better (e.g., better memory or FIB utilization, transparently) so that you can continue doing exactly as before, at least for a while, c) change the configuration in such a manner that your gear remains viable for a longer while, or d) complain to IETF, ITU-T, ... or whoever to create a new protocol that would accomplish the same thing as b).

I don't oppose b) but I fail to see how that could provide more than a short-term fix as the number of routes is climbing and the mountaintop is nowhere in sight. Similarly, d) would take so long that it won't help you here. So your real options are either a) or c). Whether the drawbacks of letting go of full, unfiltered BGP feeds are worth the cost of a) is up to you.

But the vendors aren't unable -- AFAIK, such devices have been available on the market for, what, 7-8 years now? It's just your wallet that's unable to get equipment that's needed to face the network that's getting more complex.

Cisco reps are still selling SUP32s to people despite being told the customer wants to take full routes. That's either incompetence or dishonesty.

It has nothing to do with budget. If you are told Product A will do the job and costs $100k, and product B will also do the job but costs only $50k, you'd be an idiot to go with product A.

Furthermore, Cisco doesn't have a product to meet their needs. A Sup32 with a 3bxl is what a lot of people need, but Cisco seems intent on forcing people to upgrade to a Sup720. That's just overkill for most people.

-Don

There are a couple of reasons:

1. The "captain obvious" suggestion of a default means that now I'm paying
for multiple links but can only use one. That's not cost effective and will
provide lower performance for some destinations. I have done defaults in
the past where appropriate but it's not appropriate in this application.

2. The idea of a complex filtering strategy is, from my perspective, an
even worse idea. You get all of the downsides of a default with increased
operational complexity that may not scale across multiple sites depending on
the size of your ops team. Oh, and don't forget, for testing and validation
you'd need to buy a router that can take these multiple feeds to test the
results of the filtering policy.

Both of those options are viable (#1 obviously over #2) if just basic
connectivity is required. However I find myself not really wanting to have
to continually support solutions with such limitations when there are other
options.

1. The "captain obvious" suggestion of a default means that now I'm paying
for multiple links but can only use one. That's not cost effective and will
provide lower performance for some destinations. I have done defaults in
the past where appropriate but it's not appropriate in this application.

That's not the case at all. If you use only defaults, you could do load balancing but in a very crude fashion. If you use a default route and filtered version of BGP feed (e.g., accept everything up to /21) probably up to 90-95% of traffic would go over that link, or multiple ones if you have multiple BGP sessions.

If you want more control than _only_ a default route or two (and many do), the default route would in principle be just a safeguard for more specifics (or other routes, based on a metric of your choosing) you filter out.
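
The "default as safeguard" idea can be illustrated with a small longest-prefix-match lookup. The `lookup` helper, the upstream names, and the sample table are assumptions for the example. The point is only that a destination whose specific route was filtered out still resolves via 0.0.0.0/0.

```python
import ipaddress


def lookup(table, dst):
    """Return the next hop of the longest matching prefix for dst, or None."""
    addr = ipaddress.ip_address(dst)
    best = None
    for prefix, next_hop in table.items():
        net = ipaddress.ip_network(prefix)
        if addr in net and (best is None or net.prefixlen > best[0].prefixlen):
            best = (net, next_hop)
    return best[1] if best else None


# Hypothetical filtered table: a default plus the routes that survived filtering.
table = {
    "0.0.0.0/0": "upstream-A",     # default route, the safeguard
    "10.0.0.0/8": "upstream-B",
    "10.20.0.0/16": "upstream-A",
}

via_specific = lookup(table, "10.20.1.1")    # matches the /16
via_aggregate = lookup(table, "10.99.1.1")   # matches only the /8
via_default = lookup(table, "192.0.2.55")    # its /24 was filtered out
```

Only the last lookup falls through to the default. Everything with a surviving route still follows BGP's choice.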

2. The idea of a complex filtering strategy is, from my perspective, an
even worse idea. You get all of the downsides of a default with increased
operational complexity that may not scale across multiple sites depending on
the size of your ops team.

I'd probably agree if you used complex filtering without a default route. Having a default route, as long as it points to a sufficiently good (non-tier1, not cogent) upstream, allows you not to care so much about how you filter the BGP feed.

But as should be obvious, you don't need to worry about this problem if you're willing to put money into router upgrades. However, I'm just suggesting there is an alternative to router upgrades if you're comfortable with the somewhat different tradeoffs that brings with it.

1. The "captain obvious" suggestion of a default means that now I'm paying
for multiple links but can only use one. That's not cost effective and will
provide lower performance for some destinations. I have done defaults in
the past where appropriate but it's not appropriate in this application.

That's not the case at all. If you use only defaults, you could do load balancing but in a very crude fashion.
If you use a default route and filtered version of BGP feed (e.g., accept everything up to /21) probably up to 90-95% of traffic would go over that link, or multiple ones if you have multiple BGP sessions.

Sure, but you do still run the (not insignificant) risk of following the default to the "sufficiently good (non-tier1, not cogent) upstream", only to discover that, for whatever reason, it has no reachability to the prefix. If I have spent the time and effort to get multiple providers, presumably I believe that my bits are important enough to not trust to "this will probably work most of the time..."

W