Regarding global BGP community values

To move this forward, I'd like to note here that a new draft concerning
the global communities is now prepared.

While it's clear that a considerable amount of disagreement exists regarding
transitive communities dynamically doing things, it's extremely simple for
providers to just not pay attention to them.

Another potential application for global transitive communities, which is likely even more debatable than path selection issues, is using them in conjunction with MEDs and "more specifics" of provider aggregates (to fix some of the brokenness of aggregates and MEDs) in order to provide a safety net for potential route leaking.

This could be advantageous for several reasons. I think most of us agree that a mechanism to re-introduce intelligence to "best-exit" type routing configurations is a good idea. It's a good idea not only because some providers want to perform "best-exit" as a value-add to their services, but also because it makes sense in order to provide the ability to compensate a peer (who's fussing about settlements and traffic asymmetries) by carrying the traffic longer on your network. It could also assist in more optimally regionalizing traffic exchange between networks, especially with the ever-growing geographically distributed inter-connectivity provided by direct interconnections.

The drawbacks of providing more specifics to peers are obvious, I believe.

One problem is potentially significant growth in routing and forwarding table sizes, which was one of the primary drivers for aggregation techniques in the first place. If this is a problem, the provider can always opt to not accept the more specifics from the peer.

The other problem I can think of at the moment, which is likely more of a concern for most folks, is with respect to providers leaking more specifics, either via BGP customers or directly. This could be a concern because of the perceived "clue" of a peer, as well as simple configuration errors, etc. This can be addressed to some extent by providing drafts that discuss these issues.

Then, there are the problems those accepting MEDs have regarding a network's ability to associate intelligent values with MEDs, or to provide *only* a reasonable number of prefixes (versus "more specifics" expanding to thousands of /24s and longer). Of course, a large piece of this would be reliant upon a decent IP allocation plan that at worst provides router-based aggregates for more specifics, and preferably PoP-based ones. This is difficult, of course, with all the older networks, acquisitions, etc.

Anyway, if a set of transitive communities were defined to provide a safety net that could catch the more specifics, or some other mechanism were created to provide the same capabilities, I'd be interested.
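
One way to picture such a safety net is the following minimal Python sketch. Everything in it is an assumption made for illustration: the community value 65535:100 is hypothetical (nothing in this message defines it), and the Route class and function names are invented. The idea is that more specifics handed to a peer carry the tag, a network accepts tagged routes only from neighbors it has agreed to take more specifics from, and nobody passes a tagged route any further.

```python
from dataclasses import dataclass

# Hypothetical tag, chosen only for illustration; not a defined community.
MORE_SPECIFIC = (65535, 100)


@dataclass(frozen=True)
class Route:
    prefix: str
    as_path: tuple
    communities: frozenset = frozenset()


def accept(route, neighbor_as, more_specific_peers):
    """Import policy: tagged routes are taken only from agreed peers."""
    if MORE_SPECIFIC in route.communities:
        return neighbor_as in more_specific_peers
    return True


def may_re_export(route):
    """Export policy for learned routes: a tagged route goes no further."""
    return MORE_SPECIFIC not in route.communities


# A more specific received directly from the agreed peer AS 64500.
direct = Route("192.0.2.0/25", (64500,), frozenset({MORE_SPECIFIC}))
# The same route after AS 64501 leaked it to a third party.
leaked = Route("192.0.2.0/25", (64501, 64500), frozenset({MORE_SPECIFIC}))

print(accept(direct, 64500, {64500}))  # accepted from the agreed peer
print(may_re_export(direct))           # but never passed on
print(accept(leaked, 64501, set()))    # caught by the safety net
```

The second check is what makes this a safety net: even if one network forgets the export rule, the next network's import rule still catches the leak.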

I believe AboveNet and a few others actually have experience with accepting more specifics, and since I missed the BOF in Montreal (and no information is available on the web server as of yet?), I'd be interested in hearing what folks' opinions are regarding this.

-danny

> One problem is potentially significant growth in routing and forwarding
> table sizes, which was one of the primary drivers for aggregation
> techniques in the first place. If this is a problem, the provider can
> always opt to not accept the more specifics from the peer.

The growth itself does not cause the problems; it is growth in conjunction
with poor router implementations (which cause 60,000 routes to use 30 MB of
RAM - that means 500 bytes for every prefix -:)) and numerous memory leaks
in those implementations that causes the problem. If we look around, we'll
see that existing computers (including embedded ones) do not have CPU and
memory problems, and all the problems we see with routers are mainly caused
by badly implemented code.
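
The per-prefix figure above is easy to check. A quick sketch in Python; the message does not say whether "MB" means 10^6 or 2^20 bytes, so both readings are shown, and the function name is just for illustration.

```python
def bytes_per_prefix(total_bytes, routes):
    """Average memory consumed per route in the table."""
    return total_bytes / routes


ROUTES = 60_000

for label, megabyte in (("decimal MB", 10**6), ("binary MB", 2**20)):
    per_prefix = bytes_per_prefix(30 * megabyte, ROUTES)
    print(f"{label}: about {per_prefix:.0f} bytes per prefix")
```

Either way the result is on the order of 500 bytes for each prefix.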

On the other hand, you are right if you are speaking about stability or
loop-free routing - extra specifics cause a lot of instability.

But that is slightly outside this issue.

Talking about the global transitive communities, we should mention one
existing problem. Communities are now used for both internal and global
control (for example, I mark my peering announcements as 2118:11, and I
mark those announcements which should not be advertised to other peers as
2118:12; TELIA uses communities widely, and so on). On the other hand, we
do not have any mechanism to filter communities out when we advertise
prefixes - in the case of CISCO, I can:
- not send communities at all
- set new communities instead of the existing ones
- add new communities to the existing ones

So there is a risk of seeing a growing number of useless communities if we
introduce a set of transitive ones.
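
The gap can be shown with a small Python model. This is only a sketch: it does not reproduce any router's configuration syntax, the function names are invented, and 64500:7 stands in for some hypothetical global community. The first three functions correspond to the three options listed above; the fourth is the selective filter that is missing.

```python
def send_none(communities):
    """Option 1: strip every community on export."""
    return set()


def replace_all(communities, new):
    """Option 2: discard the existing communities and set new ones."""
    return set(new)


def add_to_existing(communities, new):
    """Option 3: keep the existing communities and append new ones."""
    return set(communities) | set(new)


def filter_out(communities, is_internal):
    """The missing option: drop only communities matching a predicate."""
    return {c for c in communities if not is_internal(c)}


# 2118:11 and 2118:12 are the internal markers from the message;
# 64500:7 is a made-up global community that should survive export.
attached = {(2118, 11), (2118, 12), (64500, 7)}


def internal_to_2118(community):
    return community[0] == 2118


print(filter_out(attached, internal_to_2118))
```

With only the first three options, the internal 2118:x markers either leak out together with the global community or the global one is lost along with them.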

And this makes some mechanism of _community filtering_ very desirable.
Note that these days we see a turn from AS-based routing control to
community-based and MED-based control, because:
(1) ASes themselves do not provide any protection against mistakes -
they are not used in the routing; this makes prefix-based filtering very
desirable (at least on the downstream links);
(2) the AS list grows quickly, and (even if we build access lists from the
RIPE or RA-DB or <ANY>...-DB database) we can't maintain such big pieces of
configuration;
(3) AS-based control violates the main principle of effective routing
control - _analyze everything carefully, but ONCE; then add your labels and
use these labels_. Communities are one type of such _labels_.
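
The _analyze once, then use labels_ principle can be sketched as follows. This is an illustration under assumptions, not anyone's production policy: the function names are invented, the registered prefix list is a stand-in for whatever database the operator trusts, and 2118:11 is reused here as the "announce to peers" label only because the message gives it as an example.

```python
import ipaddress

# Label reused from the example in the message; its exact meaning is assumed.
ANNOUNCE_TO_PEERS = (2118, 11)


def label_on_ingress(prefix, registered):
    """The careful (expensive) check, done once where the route enters."""
    net = ipaddress.ip_network(prefix)
    if any(net.subnet_of(block) for block in registered):
        return {ANNOUNCE_TO_PEERS}
    return set()


def export_to_peer(communities):
    """Every later decision looks only at the label, not at prefix or AS."""
    return ANNOUNCE_TO_PEERS in communities


registered = [ipaddress.ip_network("144.206.0.0/16")]

good = label_on_ingress("144.206.1.0/24", registered)
bad = label_on_ingress("192.0.2.0/24", registered)

print(export_to_peer(good), export_to_peer(bad))
```

Only the ingress router needs the big prefix list; every other router carries a one-line match on the community.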

And, if we are facing some future BGP-5 protocol, and keeping
compatibility in mind, the new terms _local community, global community,
transitive community_ (replace _community_ with any other word, if you
want) become very desirable. Note - we already have _PRIVATE-AS_ and some
ways to filter them out; now it's time to have PRIVATE_COMMUNITIES as
well.

> The other problem I can think of at the moment, which is likely more
> of a concern for most folks, is with respect to providers leaking more
> specifics, either via BGP customers, or directly. This could be a concern

Note - we often leak such specifics _on purpose_. For example, see
144.206/16 - we have to leak some specifics from this block to make
routing _correct_ (some branches have commercial-quality access, some
branches do not).

On the other hand, the more you restrict the prefixes allowed in the
Internet, the less effectively you use address space. This is a
double-edged sword. I believe we will see more and more /20, /21 and even
/24 prefixes in the network in the next few years - because CPU and memory
can be increased easily, but the address space cannot.
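
To put numbers on that trade-off, here is a short Python sketch that counts how many prefixes of each length fit into one /16. The 144.206/16 block is used purely as an example; how it is actually subdivided is not stated here, and the function name is invented.

```python
import ipaddress


def count_more_specifics(block, new_prefix):
    """Number of prefixes of length new_prefix that cover the block."""
    net = ipaddress.ip_network(block)
    return sum(1 for _ in net.subnets(new_prefix=new_prefix))


for length in (20, 21, 24):
    count = count_more_specifics("144.206.0.0/16", length)
    print(f"/{length}: {count} prefixes")
```

Each extra bit of prefix length doubles the potential number of table entries, which is the cost side of using the address space more tightly.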

Alex (Roudnev).

"Alex P. Rudnev" wrote:

> The growth itself does not cause the problems; it is growth in conjunction
> with poor router implementations (which cause 60,000 routes to use 30 MB of
> RAM - that means 500 bytes for every prefix -:)) and numerous memory leaks
> in those implementations that causes the problem. If we look around, we'll
> see that existing computers (including embedded ones) do not have CPU and
> memory problems, and all the problems we see with routers are mainly caused
> by badly implemented code.

I, and the rest of the Internet community, would like to invite you to start a
router company and show us how it can be done with far less memory.

;-)

More seriously, you might take a look around and note that there is not a
great deal of difference in the amount of memory needed to support a prefix
across the various well-known implementations. Which is not to say that we're
blameless, just that a lot of good people have worked hard and are all equally
incompetent at conserving memory while simultaneously producing a scalable,
stable, feature-rich implementation.

Regards,
Tony

> see that existing computers (including embedded ones) do not have CPU and
> memory problems, and all the problems we see with routers are mainly caused
> by badly implemented code.

> I, and the rest of the Internet community, would like to invite you to start a
> router company and show us how it can be done with far less memory.

Sorry, I forgot -:); on the other hand, if you wanted to build a router
spending 8 bytes on every BGP prefix, you could no doubt do it (don't answer
_buy more memory instead, it's cheaper_ - no one objects to that).

Speaking about the CISCOs, no one thought about memory when BGP was
implemented there; the worst failures in CISCO history were caused by some
_temporary_ prefix leaks which caused routers to eat memory _permanently_
(the last case was in our network 1 week ago, when we leaked an extra
20,000 prefixes to our access routers; it was fixed in 5 minutes, but more
than half of them got a stomachache and refused to work even after the leak
disappeared... I don't blame the software designers, they must find a
compromise between stability, time_to_implement, cost and memory, but I'd
like to highlight that they really were not concerned about such a _cheap_
thing as memory at all). (Let me put -:) here.)

But you cut my point that the fewer prefixes we allow in the global
Internet, the less effectively we use address space; memory can be
upgraded (though not easily, due to bad router design; compare with the
PC, and you should agree), while the address space cannot be at all. This
means we are facing growing routing tables whether we like it or not.

Alex.

> Regards,
> Tony

Aleksei Roudnev, the head of Network Operations Center, Relcom, Moscow
(+7 095) 194-19-95 (Network Operations Center Hot Line),(+7 095) 230-41-41, N 13729 (pager)
(+7 095) 196-72-12 (Support), (+7 095) 194-33-28 (Fax)

> Speaking about the CISCOs, no one thought about memory when BGP was
> implemented there; the worst failures in CISCO history were caused by some
> _temporary_ prefix leaks which caused routers to eat memory _permanently_
> (the last case was in our network 1 week ago, when we leaked an extra
> 20,000 prefixes to our access routers; it was fixed in 5 minutes, but more
> than half of them got a stomachache and refused to work even after the leak
> disappeared... I don't blame the software designers, they must find a
> compromise between stability, time_to_implement, cost and memory, but I'd
> like to highlight that they really were not concerned about such a _cheap_
> thing as memory at all). (Let me put -:) here.)

On behalf of {myself, Paul, Ravi, Enke}, I assure you that Cisco's BGP has _always_ been
worried about conserving memory.

Tony

> work even after the leak disappeared... I don't blame the software
> designers, they must find a compromise between stability,
> time_to_implement, cost and memory, but I'd like to highlight that they
> really were not concerned about such a _cheap_ thing as memory at all).
> (Let me put -:) here.)

> On behalf of {myself, Paul, Ravi, Enke}, I assure you that Cisco's BGP has _always_ been
> worried about conserving memory.

BGP - yes; the total architecture - not at all. Even very simple safeguards
such as _don't allow a process already eating 90% of the memory to eat the
last 10%_ and _defragment the garbage_ were not implemented, and if some
process (BGP, for example) goes crazy and over-eats something, no one can
even log in and say _reload_ -:).

> Tony

Aleksei Roudnev, Network Operations Center, Relcom, Moscow
(+7 095) 194-19-95 (Network Operations Center Hot Line),(+7 095) 230-41-41, N 13729 (pager)
(+7 095) 196-72-12 (Support), (+7 095) 194-33-28 (Fax)