RE: Verio Peering Question

While I certainly support the idea of usable micro allocations, and have
voiced my support on various ARIN mailing lists for it, it should be
remembered that the same folks who generally espouse restrictive filtering
policies are also those who voice the greatest opposition to a realistic
micro allocation policy. Their argument normally underscores the somewhat
facetious issue of routing table size.

In hopes of correcting this somewhat, let me say that not only
am I a strong supporter of filtering, I have also suggested
fairly seriously to some registry-types that it is fair to allocate
individual /32s as necessary to contain address consumption.

That is, the registries are correct to focus on that resource management,
and should spend their energies on reclaiming wasted space (hello MIT!)
rather than on managing multiple scarce resources. ISPs can and
will filter or otherwise penalize users of long and/or flappy prefixes
as dynamicism forces them to do so. Since filtering can become REALLY
aggressive if and as necessary, nobody should worry that ISPs will
be so overwhelmed that the RIRs have to help out with this problem.

Ad hominem attacks against individuals [are bad]

But it's open season on "folks" who make "facetious" arguments?

  Sean.

Also sprach Sean M. Doran

I have also suggested fairly seriously to some registry-types that it
is fair to allocate individual /32s as necessary to contain address
consumption.

Praise the Lord! I'm not alone in this. I've always thought
prefix-length filtering was incredibly inane.

Since the length of the prefix bears little correlation to the
"importance" of the network being advertised, there is little reason to
filter based on the length of the prefix.

Filter flapping routes? Sure. Filter RFC1918 space? No brainer.
Filtering on prefix length? There's just no solidly backed up reasoning
to support it that I've heard. A network that has the operational need
to be multi-homed will add a route to the default-free zone; it's as
simple as that (barring some major architectural changes to the
protocols in use on the Internet). If we're going to have the routes in
the default-free zone, let's at least try to minimize their number. Encourage
re-numbering into fewer blocks and returning those blocks to ARIN?

Here's a thought. When you give out a new block of IP addresses to an
organization, require that, within a certain time period (a year?) they
relinquish two blocks back to ARIN. Obviously there is going to be a
limit to how far you can go with this...if the organization only has a
single block, then it can't turn two in when it's allocated a new block,
since that would leave it with no space. A policy like this (and this
is obviously *extremely* rough) would encourage re-numbering, and
encourage designing networks in ways that make renumbering less
onerous; it would also reduce the number of blocks being
advertised by the organization. IgLou, for example, has 6 blocks that
we advertise...we're pretty much small fry, but it would not be all that
difficult for me to free up several of my blocks for relinquishing back
to ARIN. Now, IgLou obviously isn't going to have a huge effect on
reducing routing table size, but over time, this would reduce the number
of routes in the default-free zone...or at least keep the growth in
check. The trick is that a network needs to be able to obtain network
blocks that are reasonably sized for its needs with little to no
hassle. There are all sorts of variations on this policy theme that
could be used to balance the needs of the Internet as a whole.

The bottom line, though, is that the current policies don't address the
problems they were purportedly instituted to address...primarily
address depletion and routing table size.

In addition to that, the prefix filtering mechanisms that are being
discussed don't apply at all to providers who are allocated much larger
blocks (/16+) yet feel the need to slice and dice it into individual /20's
to steer traffic.

Can someone who is in favor of implementing filters explain to me why
slicing a /16 into 16 /20's is any different than slicing a /20 into 16
/24's? If the thought is "you're given a /20, i only want to see a
/20.." then why doesn't "you're given a /18, I only want to see a
/18" also apply?

I don't have any hard evidence to know how much of an impact this
actually has, but I would be very interested to see how many more specific
/19's and /20's exist in a "verio-filtered" table that were allocated as
/16's and shorter.

Paul

Can someone who is in favor of implementing filters explain to me why
slicing a /16 into 16 /20's is any different than slicing a /20 into 16
/24's?

with the understanding that i'm explaining why MIBH did it rather than
explaining why you should do it (if indeed you should, which is not even
a conversation i could find myself in), the reasoning is/was as follows.

there has to be a limit.

that's all. that's the full extent of the reasoning. some limit has to
be chosen, because without limits, human nature kicks in and "the people"
start voting themselves bread and circuses. i'm not interested in going
down that path because it's a recipe for _finding_ the limits of the
global routing system, both in table size and in delta volume. (we'd find
it when the 200,000th /32 was injected, and by that time it would be hard
to reverse course.)

so there's going to be a limit. the number of routes allowed into a router
shall be non-infinite. that limit can be enforced in a number of ways:

  1. total number of routes
  2. total number of routes per peer
  3. prefix length

now, #1 would just be too unpredictable. which routes were present would
depend on what order the peers came up. so, one day, some routes, the next
day, some different routes.

#2 is in fact in wide use, but as an error ceiling to keep a peer from
accidentally sending you a full table rather than as a way to apportion
a router's resources. in a world which (clearly!) wants me to store 200,000
/32's, though, it's difficult to imagine how #2 helps prevent this.

#3 is a gross hack which happens to have the nice properties of "predictable"
(one day's routes will be much like another day's routes) and "minimal impact"
(the long prefixes i'm throwing away almost always have shorter "covering"
routes). and, #3 is negotiable per peer. abovenet allows its peers to send
long prefixes because this makes "longest exit" possible. (this might or might
not cause abovenet some scalability problems, but it's their decision and will
impact no one except them and their customers.)

some of you know that i had some involvement in both MIBH and abovenet and
may be thinking of asking why these networks had different prefix filtering
policies. two answers: MIBH wasn't a global network so "longest exit" wasn't
going to be possible in any case; and MIBH didn't own any GSR's.

If the thought is "you're given a /20, i only want to see a /20.." then
why doesn't "you're given a /18, I only want to see a /18" also apply?

because there's got to be some limit. if you want to set that limit at
"you were given a /24 so i don't want to see any /28's from you", then fine.
but obviously some trial and error has gone on here, and the folks who have
prefix filters have set them so that (a) any longer and there'd be too many
routes, and (b) any shorter and there'd be too much nonreachability. this
is one of nature's equilibriums. it has shifted back and forth over time.
the filter lengths verio is using are obviously in verio's best interests or
they would have different ones.

I don't have any hard evidence to know how much of an impact this
actually has, but I would be very interested to see how many more specific
/19's and /20's exist in a "verio-filtered" table that were allocated as
/16's and shorter.

i'm pretty sure verio has a looking-glass instance, so you can find out.

<http://psg.com/~randy/010521.nanog>

any plans to follow up with long term data? of particular interest would be
rate of growth: routing table as a whole vs prefixes allowed by the filters
vs prefixes blocked by the filters. also interesting would be a more
thorough data analysis (e.g. what portion of prefixes blocked by the filters
are part of some aggregate already in the table vs. what portion actually
represent loss of reachability). i'm sure the ubiquitous NANOG cheap peecee
hardware(TM)[1] with reasonable code could blow through the data in a few
seconds.
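
here's roughly what i mean, as a sketch in python (made-up file names, one
prefix per line, IPv4 only; illustrative, not measured data):

  import ipaddress

  def load(path):
      with open(path) as f:
          return [ipaddress.ip_network(s.strip(), strict=False)
                  for s in f if s.strip()]

  table = load("full_table.txt")       # assumed dump of the unfiltered table
  blocked = load("filtered_out.txt")   # assumed dump of what the filter drops

  blocked_set = set(blocked)
  kept = [p for p in table if p not in blocked_set]

  # a blocked prefix with a covering aggregate still in the table is
  # redundant; one without represents real loss of reachability
  covered = sum(1 for b in blocked if any(b.subnet_of(k) for k in kept))
  print(covered, "blocked but covered;", len(blocked) - covered, "unreachable")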

we know filtering == smaller table, but what i (and maybe somebody else)
really want to know is does filtering (in reality, not theory) == slower
table growth? imho, the latter is of considerably higher strategic usefulness.

forgive me if these questions have been asked/answered, but i missed it in
the mire.

1. that hardware which we so often like to compare our routers to in terms
   of memory/processor power. (not an actual product of NANOG or lart.net or
   any other particular entity that i may or may not be associated with)

party on,
sam

A limit is needed, but to me the filtering method in question essentially
says this:

if you have 64.x.x.x/15, slice it into as many /20's as you can and bloat
as much as you want.. we feel this is an acceptable practice.

Yet, if a legitimately multihomed customer wants to push out a
single /24 (1 AS, 1 prefix), that is not considered acceptable.

The only kind of prefix filtering I would want to implement is something
that can accomplish:

1. Define threshold, say /20 or /18 or hell even /8.
2. all prefixes longer than threshold get held until entire tables are
loaded
3. start looking at the longer prefixes across the entire ipv4 space
starting with the longest and finishing at threshold+1
4. if prefixes longer than threshold appear as part of a larger aggregate
block that *originates* from the same AS, drop.
5. if prefixes longer than threshold originate from a different AS than
the aggregate, accept.

This way I could get rid of redundant information yet at the same time not
cause any trouble to smaller multihomed customers. I'm not saying that we
should allow /32's to be pushed everywhere either. As you said there has
to be a limit, and /24 seems to be a pretty good one if something along
the lines of the above mentioned filtering algorithm could be used.
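
To make the steps concrete, here is a rough sketch in Python (offline and
illustrative only, IPv4 assumed; the data structures are my assumptions,
and I'm not claiming a router could do this at BGP speed):

  import ipaddress

  THRESHOLD = 20  # step 1: accept /20 and shorter unconditionally

  def filter_routes(routes):
      # routes: list of (ipaddress.IPv4Network, origin_asn) pairs,
      # gathered only after the entire tables have loaded (step 2)
      accepted = [(p, a) for p, a in routes if p.prefixlen <= THRESHOLD]
      held = [(p, a) for p, a in routes if p.prefixlen > THRESHOLD]
      # step 3: walk held prefixes longest-first, down to threshold+1
      for prefix, asn in sorted(held, key=lambda r: -r[0].prefixlen):
          covering = {a for p, a in accepted
                      if p != prefix and prefix.subnet_of(p)}
          if asn in covering:
              continue                    # step 4: same-AS aggregate, drop
          accepted.append((prefix, asn))  # step 5: different AS (or no
                                          # covering aggregate at all), keep
      return accepted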

I'm sure in reality there are many reasons this could not be
implemented (CPU load, perhaps), but it would at least do something more
than a "gross hack" that nails some offenders (not all, by any means) and
impacts multihomed customers, who are only a portion of a problem that
the current prefix filtering solution does not solve.

paul

> there has to be a limit.

A limit is needed, but to me the filtering method in question essentially
says this:

if you have 64.x.x.x/15, slice it into as many /20's as you can and bloat
as much as you want.. we feel this is an acceptable practice.

i strongly doubt that the policy was formulated on that basis. it may or
may not be equivalent to what you said, but you have not described anyone's
(that i'm aware of) actual motivations with the above formulation.

Yet, if a legitimately multihomed customer wants to push out a
single /24 (1 AS, 1 prefix), that is not considered acceptable.

actually there's a loophole. nobody filters swamp /24's that i know of, since
so much of the oldest salty crusty layers of the internet are built on those.

The only kind of prefix filtering I would want to implement is something
that can accomplish:

1. Define threshold, say /20 or /18 or hell even /8.
2. all prefixes longer than threshold get held until entire tables are
loaded
3. start looking at the longer prefixes across the entire ipv4 space
starting with the longest and finishing at threshold+1
4. if prefixes longer than threshold appear as part of a larger aggregate
block that *originates* from the same AS, drop.
5. if prefixes longer than threshold originate from a different AS than
the aggregate, accept.

i wish you luck in implementing this proposal. i think that folks with
multivendor global networks will find it completely impractical, but you
can probably pull it off in a regional zebra-based network with no problem.

This way I could get rid of redundant information yet at the same time not
cause any trouble to smaller multihomed customers. I'm not saying that we
should allow /32's to be pushed everywhere either. As you said there has
to be a limit, and /24 seems to be a pretty good one if something along
the lines of the above mentioned filtering algorithm could be used.

let's do some math on this. swamp space is more or less 192/8 and 193/8
(though parts of other /8's were also cut up with a pretty fine-bladed knife).
if every 192.*.*/24 and 193.*.*/24 were advertised, that would be more prefixes
than the entire current table shown in tony bates' reports (128K of them vs.
~100K in the table today).

that is of course just the existing swamp. and it would be hard to handle
but even harder to prevent since there's no real way using today's routers to
say "accept the current /24's in 192/8 and 193/8 but don't allow new ones".
this is the bogey man that gives people like smd nightmares.

then there's everything else. if 20 /8's were cut up into /24's then tony
bates' report would have 1.3M more things in it than are there today. if
the whole IPv4 space were cut up that way then we'd see 16M routes globally.
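
the arithmetic, for anyone checking:

  per_slash8 = 2 ** (24 - 8)   # 65,536 possible /24's per /8
  print(2 * per_slash8)        # 131,072 -- the two swamp /8's, i.e. "128K"
  print(20 * per_slash8)       # 1,310,720 -- twenty /8's, the 1.3M figure
  print(2 ** 24)               # 16,777,216 -- all of IPv4 as /24's, ~16M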

those numbers may seem unreasonable, either because current routers can't
hold them, or because current routing protocols would never be able to
converge, or because you just can't imagine humanity generating even 1.3M
/24's let alone 16M of them.

multihoming is a necessary property of a scalable IP economy. actually,
provider independence is necessary, multihoming is just a means to that end.
if you don't think there are more than 1.3M entities worldwide who would pay
a little extra for provider independence, then you don't understand what's
happened to *.COM over the last 10 years. in that case i'll simply ask you
to take my word for it -- you make 1.3M slots available, they'll fill up.

i do not know the actual limit -- that is, where it ends. i know it's going
to be higher than 1.3M though. i also know that the limit of humanity's
desire for "provider independence without renumbering" (or "multihoming") is
currently higher than what the internet's capital plant, including budgeted
expansions, can support. and i strongly suspect that this will remain true
for the next 5..10 years.

I'm sure in reality there are many reasons this could not be
implemented (CPU load, perhaps), but it would at least do something more
than a "gross hack" that nails some offenders (not all, by any means) and
impacts multihomed customers, who are only a portion of a problem that
the current prefix filtering solution does not solve.

people are out there building networks using available technology. forget
about CPU load and look at delta volume and convergence. the "internet
backbone" is not fat enough to carry the amount of BGP traffic that it would
take to represent the comings and goings of 16M prefixes. 1.3M is probably
achievable by the time it comes due for natural causes. do any of our local
theorists have an estimate of how much BGP traffic two adjacent core nodes
will be exchanging with 1.3M prefixes? is it a full DS3's worth? more? less?
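
for want of a theorist, a toy calculator -- every input below is a guess to
be replaced with measurement, and the steady-state average it spits out says
nothing about the bursts (a session reset replaying all 1.3M prefixes at once
is what actually sizes the link):

  PREFIXES = 1_300_000
  UPDATES_PER_PREFIX_PER_DAY = 2   # assumed average churn -- pure guess
  BYTES_PER_UPDATE = 60            # assumed average UPDATE size -- pure guess

  bps = PREFIXES * UPDATES_PER_PREFIX_PER_DAY * BYTES_PER_UPDATE * 8 / 86_400
  print(round(bps / 1_000), "kb/s steady-state under these guesses")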

every time you change out the capital plant on a single global AS core in
order to support some sea change like 10Gb/s sonet or 200K routes, it costs
that AS's owner between US$200M and US$1B depending on the density and
capacity. bean counters for old line telcos used to want a 20 year payback
(depreciation schedule) on investments of that order of magnitude. today
a provider is lucky to get five years between core transplants. bringing
the period down to two to three years would cause "the internet" to cost more
to produce than its "customers" are willing to pay.

so in the meanwhile, verio (and others who aren't mentioned in this thread)
are using the technology they have in order to maximize the period before
their capital plant becomes obsolete. as i said in a previous note, they are
certainly balancing their filters so that filtering more would result in too
many customer complaints due to unreachability, but filtering less would
result in too many customer complaints due to instability.

anyone who wants the point of equilibrium to move in the direction of "more
routes" should be attacking the economies which give rise to the problem
rather than attacking the engineering solutions which are the best current
known answer to the problem. in other words go tell cisco/juniper/whomever
your cool idea for a new routing protocol / route processing engine / cheap
OC768-capable backplane and maybe they'll hire you to build it for them.

Date: Sat, 29 Sep 2001 14:58:09 -0400 (EDT)
From: Paul Schultz <pschultz@pschultz.com>

if you have 64.x.x.x/15, slice it into as many /20's as you can
and bloat as much as you want.. we feel this is an acceptable
practice. Yet, if a legitimately multihomed customer
wants to push out a single /24 (1 AS, 1 prefix), that is not
considered acceptable.

Right.

The only kind of prefix filtering I would want to implement is
something that can accomplish:

[ snip ]

An interesting thought. Group BGP adverts / table updates by
prefix length... get connectivity up and going, then chew on
the smaller details as needed. Sort of like real-time process
priorities; if you can get there, queue longer prefixes until
_after_ all others have been processed.
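
A toy of that scheduling idea in Python (hypothetical adverts; real BGP
implementations do nothing of the sort):

  import heapq
  import ipaddress

  adverts = [ipaddress.ip_network(p) for p in
             ["10.1.2.0/24", "10.0.0.0/8", "10.1.2.128/25", "10.1.0.0/16"]]

  # shortest prefix = highest priority, so covering routes install first
  queue = [(p.prefixlen, p) for p in adverts]
  heapq.heapify(queue)
  while queue:
      _, prefix = heapq.heappop(queue)
      print("install", prefix)   # /8, /16, /24, /25 in that order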

This way I could get rid of redundant information yet at the
same time not cause any trouble to smaller multihomed
customers. I'm not saying that we should allow /32's to be
pushed everywhere either. As you said there has to be a limit,
and /24 seems to be a pretty good one if something along
the lines of the above mentioned filtering algorithm could be
used.

Seems to me that "saving the Internet" means strict ingress
filtering[1] of downstreams and strict egress filtering[2] to
peers and upstreams... which is pretty much the opposite of what
Verio does.

[1] Providers SHOULD filter/aggregate downstream routes, unless
    there's some overriding reason not to. There's enough bad
    BGP that trusting Joe Provider to do things right scares me.
    (I'm no <insert favorite NANOG routing superhuman guru>
    myself, but at least I know enough to speak decent BGP and to
    "tune" things.)

[2] Want to tune inbound traffic? Fine... advertise those longer
    prefixes to your upstreams/peers. But don't make the rest of
    the Internet suffer. Communities good. Extra routes bad.

I'm sure in reality there are many reasons this could not be
implemented (CPU load, perhaps), but it would at least do
something more than a "gross hack" that nails some offenders
(not all, by any means) and impacts multihomed customers, who
are only a portion of a problem that the current prefix
filtering solution does not solve.

Filter/aggregate as close to origination as possible.

"Be conservative in what you send, and liberal in what you
receive." Haven't I heard that somewhere before? (Bonus points
for anyone who can name the RFC without wimping out and using a
search like yours truly alas had to do.)

Eddy

Date: 29 Sep 2001 12:39:27 -0700
From: Paul Vixie <vixie@vix.com>

[ snip ]

anyone who wants the point of equilibrium to move in the
direction of "more routes" should be attacking the economies

"More routes" is too simplistic, at least for the "near
future". "A greater number of useful routes" is what I think
people are supporting.

Given your point about many companies wanting to multihome, I
agree that we can easily exceed 1M routes. See suggestion #3
below.

Of course, there are screwballs such as someone who comes to mind
who _claims_ OC-48 connectivity (not colo's bandwidth, but their
own OC-48 line)... yet is single-homed. Supposedly they are so
happy with their upstream that they have no desire to multihome.
Frankly, I'd rather have tons of OC-3 to diverse backbones, but
my point is that not everyone wants to multihome.

How many _should_ want to? Most everyone. How _many_ do? I
don't have the answer.

which give rise to the problem rather than attacking the
engineering solutions which are the best current known answer
to the problem. in other words go tell cisco/juniper/whomever
your cool idea for a new routing protocol / route processing
engine / cheap OC768-capable backplane and maybe they'll hire
you to build it for them.

1. PI microallocations (e.g. /24) aligned on /19 (for example)
   boundaries. Need more space? Grow the subnet. One advert
   because IP space is contiguous.

   Cost: Change of policy at RIRs.

2. Responsibility for spam finds its way to the originating
   network. Why not filtering and aggregation? (No flame wars
   please... mention of spam is an analogy, not a desire to
   bring back certain flame wars after such a short while.)

   Cost: Individual responsibility and interacting with adjacent
   ASNs.

3. I'd suggest merging "best" routes according to next-hop, but
   the CPU load would probably be a snag. Flapping would
   definitely be a PITA, as it would involve agg/de-agg of
   netblocks. Maybe have a waiting period before agg/de-agg when
   a route changes... after said wait (which should be longer
   than the amount of time required to damp said route), proceed
   with netblock consolidation.

   I'm mulling some refinements to this, which I'll bring up if
   the discussion takes off. (Good idea, bad idea, flame war, I
   really don't care... if we eventually make progress, that's
   what counts.)

   Cost: Anyone care to estimate the resources required? Any
   good algorithms for merging subnets?
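
   (On that last question: for the merging itself, modern Python's standard
   library ships a ready-made routine; the router-side resource estimate is
   the open part. The blocks below are illustrative.)

     import ipaddress

     blocks = [ipaddress.ip_network(p) for p in
               ["192.0.2.0/25", "192.0.2.128/25",  # adjacent halves
                "198.51.100.0/24"]]                # unrelated, left alone
     print(list(ipaddress.collapse_addresses(blocks)))
     # -> [IPv4Network('192.0.2.0/24'), IPv4Network('198.51.100.0/24')]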

Feel free to flame me for any oversights. <excuse>I'm attempting
to multitask</excuse> and am well aware that I may have omitted
something.

Eddy

> The only kind of prefix filtering I would want to implement is
> something that can accomplish:

[ snip ]

An interesting thought. Group BGP adverts / table updates by
prefix length... get connectivity up and going, then chew on
the smaller details as needed. Sort of like real-time process
priorities; if you can get there, queue longer prefixes until
_after_ all others have been processed.

assuming your router can store, process and switch against n x million
routes in real time.. i suspect technology will take us there, but you want
to try and influence people to hold back, else we'll all suffer on the day
the internet reaches critical mass, routes overtake technology, and all
providers' routers give up!

Seems to me that "saving the Internet" means strict ingress
filtering[1] of downstreams and strict egress filtering[2] to
peers and upstreams... which is pretty much the opposite of what
Verio does.

[1] Providers SHOULD filter/aggregate downstream routes, unless

Two different subjects? Filter definitely, you want to ensure quality and
sanity. But aggregate... hmm, don't think that'll work with commercial
people. A customer multihomes and you aggregate whilst a.n.other
doesn't.. a.n.other gets all the traffic and you become the secondary
provider and let a.n.other get all the new business as primary!

[2] Want to tune inbound traffic? Fine... advertise those longer
    prefixes to your upstreams/peers. But don't make the rest of
    the Internet suffer. Communities good. Extra routes bad.

but people don't advertise long prefixes in order to simply make use of two
providers for the sake of it, they do it in order to create their own
unique routing policies which by definition need to be internet-wide

i would envisage all kinds of problems too where the aggregating upstream
accepts your specific routes via another isp by mistake and then your
transit traffic ends up going all round the place.. you'd be advertising
/24s to peers and all but one transit, with primary transit aggregating up
to /16 or whatever, feels bad..

> I'm sure in reality there are many reasons this could not be
> implemented (CPU load, perhaps), but it would at least do
> something more than a "gross hack" that nails some offenders
> (not all, by any means) and impacts multihomed customers, who
> are only a portion of a problem that the current prefix
> filtering solution does not solve.

as i say, only one transit can aggregate, the other one can't unless they
both assign blocks and one uses nat.. but then it gets ugly and eats up
IPs

Steve

Date: Sat, 29 Sep 2001 21:09:49 +0100 (BST)
From: Stephen J. Wilcox <steve@opaltelecom.co.uk>

> [1] Providers SHOULD filter/aggregate downstream routes, unless

Two different subjects? Filter definitely, you want to ensure
quality and sanity. But aggregate... hmm, don't think that'll
work with commercial people. A customer multihomes and you
aggregate whilst a.n.other doesn't.. a.n.other gets all the
traffic and you become the secondary provider and let a.n.other
get all the new business as primary!

Punch holes in aggs for multihoming, same as now. Maybe I should
clarify... I was referring to splitting netblocks for the purpose
of tuning traffic.

> [2] Want to tune inbound traffic? Fine... advertise those longer
> prefixes to your upstreams/peers. But don't make the rest of
> the Internet suffer. Communities good. Extra routes bad.

but people don't advertise long prefixes in order to simply make
use of two providers for the sake of it, they do it in order to

IGP-into-BGP causes this, and is hardly for preferring traffic
from one upstream.

create their own unique routing policies which by definition
need to be internet-wide

Tag a single netblock with a community or MED. Don't split it
into two longer prefixes. Of course, that might require inter-AS
cooperation.

i would envisage all kinds of problems too where the aggregating
upstream accepts your specific routes via another isp by
mistake and then your transit traffic ends up going all round
the place.. you'd be advertising /24s to peers and all but one
transit, with primary transit aggregating up to /16 or
whatever, feels bad..

Hmmmm. So I have customer X, who also connects to backbone B.
They advert several blocks, which I agg to 192.168.0.0/19. B
does not agg the blocks... but I'd also agg what I hear from
B, into the same /19. No problem here.

Or perhaps a hack... match ^[0-9]*_ and prefer it over longer
prefixes. i.e., if you can get there directly, it's better than
going through another AS.

Your point definitely merits thought... but I'm not sure that
it's insurmountable.

Eddy

Frankly, I'd rather have tons of OC-3 to diverse backbones, but
my point is that not everyone wants to multihome.

That's what the people who killed A6 said, too. However, there's this
other question: how many more people would multihome if it weren't so
difficult to get portable address space? (Hint: see *.COM.)

<sarcasm>
#4 most-used/least-complaining

Analyze NetFlow to determine where 80-90% of your traffic
(by volume or flow-count) originates/terminates, build a
filter to only accept those routes (or ASN's). Your route
table would probably be 20% what it is now. Customers would
fill out a Web form to request routing to filtered blocks,
and the most popular requests would be added.

You could do this for customer routes as well, so you only
advertise the top 80-90% of customer routes, and tell the
rest of your customers they don't use enough bandwidth to
justify a route table entry.
</sarcasm>

Companies are paying money to get more, reliable bandwidth.
There is value-add created by multi-homing that people are
willing to pay for. Those who stand to most benefit from
this demand are not the ma-and-pa ISPs, it's the same big
companies who already own most of the market. Anything that
stops the demand hurts them more than anyone else.

Asymmetric filtering ("I won't accept yours, but expect you
to accept mine") certainly get's people's attention, and if
everyone filtered it would definitely make the problem go
away, at least temporarily.

But the market /will/ be satisfied; you can't permanently
deny the demand for more, reliable bandwidth. Maybe before
you could have stopped or slowed down the train; now I think
it's likely to run you over. As long as there isn't a solution
that solves the problem, we'll keep having these same
discussions, with different 'stupid' work-arounds.
Asymmetric filtering does not solve the problem the market
is willing to pay to solve. Progressive dampening and Multi6
look more promising.

Consider where we would be now if the solution to
address-space depletion was 'don't assign any more
addresses'. Thank goodness for a solution (CIDR) that
accomodated the demand in a way that was tolerable for most
people. It wasn't exactly what the customer wanted (their
own portable Class A), but it solved the problem well enough
for most people.

What solution will be analogous to CIDR in this situation?

Pete.

<http://psg.com/~randy/010521.nanog>

any plans to follow up with long term data? of particular interest would be
rate of growth: routing table as a whole vs prefixes allowed by the filters
vs prefixes blocked by the filters.

<http://psg.com/~randy/010809.ptomaine.pdf>

randy

my point is that not everyone wants to multihome.

That's what the people who killed A6 said, too.

no. they said

  o it is not clear that a6 helps stable scalable multi-homing

  o no one could show a plan for stable scalable multi-homing other than
    current v4 style, which is weak on the scalable

  o and it would be a damned shame if a mediocre dns hack limited the
    design space for stable scalable multi-homing solutions

randy

> > [2] Want to tune inbound traffic? Fine... advertise those longer
> > prefixes to your upstreams/peers. But don't make the rest of
> > the Internet suffer. Communities good. Extra routes bad.
>
> but people don't advertise long prefixes in order to simply make
> use of two providers for the sake of it, they do it in order to

IGP-into-BGP causes this, and is hardly for preferring traffic
from one upstream.

> create their own unique routing policies which by definition
> need to be internet-wide

Tag a single netblock with a community or MED. Don't split it
into two longer prefixes. Of course, that might require inter-AS
cooperation.

but if only one provider is agg'ing, the most specific route will always be
preferred regardless of tags

> i would envisage all kinds of problems too where the aggregating
> upstream accepts your specific routes via another isp by
> mistake and then your transit traffic ends up going all round
> the place.. you'd be advertising /24s to peers and all but one
> transit, with primary transit aggregating up to /16 or
> whatever, feels bad..

Hmmmm. So I have customer X, who also connects to backbone B.
They advert several blocks, which I agg to 192.168.0.0/19. B
does not agg the blocks... but I'd also agg what I hear from
B, into the same /19. No problem here.

no problem for you, but B's other peers won't agg and they will only see
B's more specific path as valid

Or perhaps a hack... match ^[0-9]*_ and prefer it over longer
prefixes. i.e., if you can get there directly, it's better than
going through another AS.

Your point definitely merits thought... but I'm not sure that
it's insurmountable.

i see where you're coming from but it's not a part of bgp4. you could write
it into bgp5; trouble is there are a lot of less specific routes out there,
and when you turn on this feature it could be very unpredictable.

the other issue is you don't really want to prefer any route, you want them
equally valid. and how can you do that unless all the upstreams apply the
same tags/rules?

there's only 20000 AS's, and it's the AS that is defined as having a single
routing policy, not the prefixes. why can't internet routing be based on AS
announcements? yeah, i know you'd need to rewrite everything, but i couldn't
think of a better idea. :-)

Steve

Smaller isps tend not to have consistent route announcements
to all their upstreams. one path gets prepended while the other gets
a different set of prefixes advertised due to the routes being more
'regionally close' within their own network, or for other policy
reasons. i.e.: a prefix with 5Mb/s of traffic gets adverted out a link where
they want to use a fixed-price transit circuit instead of a
burst-measured circuit that has a higher per-Mb/s cost.

let me say that not only
am I a strong supporter of filtering, I have also suggested
fairly seriously to some registry-types that it is fair to allocate
individual /32s as necessary to contain address consumption.

So how is this supposed to work? For instance, I get a /27 and an AS
number, and I want to multihome. But nobody will listen to my
announcements. This is not a workable solution.

It is possible to take the position that the responsibility of the ISPs to
filter and the responsibility of the RIRs to assign are completely
unrelated, but that only holds in theory. In practice, people want to get
addresses they can use and use the addresses they can get. So there should
be a reasonable overlap.

Multihomers generally announce just a single route, and there are fewer than
25k AS numbers, so the majority of routes are NOT from multihomers. It
seems somewhat harsh, then, to effectively forbid multihoming.

But while we're all discussing drafts on multi6, the routing table is
still growing, so some filtering should be expected. Is there really no way
we can all agree on a filtering policy that keeps the routing table in
check but still leaves some room for responsible multihoming?

For instance: each AS gets to announce either a single route (regardless
of prefix size) or only RIR-allocation-sized blocks.

(The problem with this is that you can't make a reasonably sized filter
that enforces this policy, so you would have to trust your peers to some
degree.)
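
As a sketch, though, the check is easy off-router, given per-prefix
allocation data from the RIRs (the allocation_prefixlen lookup below is an
assumption; this is illustrative Python, not a router filter):

  from collections import defaultdict
  import ipaddress

  def violators(routes, allocation_prefixlen):
      # routes: iterable of (origin_asn, prefix_string) pairs;
      # allocation_prefixlen: assumed callable giving the RIR allocation
      # length for a prefix
      by_asn = defaultdict(list)
      for asn, prefix in routes:
          by_asn[asn].append(ipaddress.ip_network(prefix, strict=False))
      bad = {}
      for asn, prefixes in by_asn.items():
          if len(prefixes) == 1:
              continue  # a single route of any size is allowed
          wrong = [p for p in prefixes
                   if p.prefixlen != allocation_prefixlen(p)]
          if wrong:
              bad[asn] = wrong
      return bad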

That is, the registries are correctly focusing on that resource-management,
and should spend energies on reclaiming wasted space (hello MIT!)
rather than on managing multiple scarce resources.

IP address space is only a scarce resource because it is allocated in huge
chunks. If we were able to re-allocate individual unused IP
addresses, we wouldn't run out for a _very_ long time.

Iljitsch van Beijnum

Iljitsch van Beijnum wrote:

> let me say that not only
> am I a strong supporter of filtering, I have also suggested
> fairly seriously to some registry-types that it is fair to allocate
> individual /32s as necessary to contain address consumption.

So how is this supposed to work? For instance, I get a /27 and an AS
number, and I want to multihome. But nobody will listen to my
announcements. This is not a workable solution.

Hello;

   I actually have some information on this - look at
http://www.multicasttech.com/status/cidr.html and
http://www.multicasttech.com/status/histogram.cidr.bgp

Of the announcements we receive (we are multihomed to 3 ISP's, but not to Verio),
57.6% of the prefixes are /24's, and about 1/2 of these are holes in another address block.
Presumably, the other 1/2 are mostly /24's from the swamp. So, a Verio-like filtering policy would
filter out about 1/2 of the /24's (a little more, actually) and leave the rest.

This seems somewhat arbitrary to me. By contrast, only ~0.1% of the announcements are /25's or longer.
So, in the spirit of setting the speed limit at the speed people actually drive, I would suggest
that a reasonable solution would be to admit up to /24's.

I know that this will not be an entirely popular opinion.
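
(For anyone who wants to reproduce the prefix-length histogram from their
own table: a one-pass count in Python, with a made-up input file name.)

  from collections import Counter
  import ipaddress

  lengths = Counter()
  with open("bgp_prefixes.txt") as f:   # hypothetical: one prefix per line
      for line in f:
          if line.strip():
              lengths[ipaddress.ip_network(line.strip(),
                                           strict=False).prefixlen] += 1

  total = sum(lengths.values())
  for plen in sorted(lengths):
      print(f"/{plen}: {lengths[plen]} ({100 * lengths[plen] / total:.1f}%)")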