Tier 2 ingress filtering

1112 · March 29, 2013, 1:48am

An economic factor will be required for BCP38 to be effective.

It will have to cost more money to not implement BCP38 than it will to implement it, in order to get widespread adoption.

-Dan

Tore_Anderson1 · March 29, 2013, 12:31pm

* Saku Ytti

The question is whether it is reasonable to expect customers to know
what networks they have. If yes, then you can ask them to create route
objects, and then you can BGP prefix-filter and ACL on them. I do
both, and it has never been a problem for my customers (enterprises,
CDNs, eyeballs).

I've had some problems with my upstream providers' ingress filtering,
for example:

- Traffic sourced from a prefix announced as a more-specific route at
transit connection in location A got filtered on a transit connection in
location B, where only a greater aggregate was announced.

- A GRE tunnel anchored in my routers' addresses in the eBGP link
network (part of my provider's address space) stopped working, as my
outbound packets were dropped by the provider's ingress filtering.

- Traceroutes that reach my network through provider A show one
missing hop if my best return path back to the traceroute source is
through provider B, and provider B is doing ingress filtering. This is
because the ICMP TTL/HL exceeded packet is sourced from provider A's
address space (my router's interface address in the eBGP link net).
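
One plausible mechanism for the first case is a strict reverse-path check (this is an assumption, since the provider's actual filter is not described). The sketch below contrasts such a check with a plain ACL built from the customer's prefix list. The prefixes, interface names, and function names are all made up for illustration.

```python
import ipaddress

# Provider's view of the customer's routes: (prefix, interface of the best path).
# The more-specific /25 is announced only at location A.
RIB = [
    (ipaddress.ip_network("198.51.100.0/24"), "B"),
    (ipaddress.ip_network("198.51.100.0/25"), "A"),
]

# Prefixes the customer is known to hold (for the ACL approach).
CUSTOMER_PREFIXES = [ipaddress.ip_network("198.51.100.0/24")]


def best_route_interface(src):
    """Longest-prefix match on the source address; returns the interface or None."""
    addr = ipaddress.ip_address(src)
    matches = [(net, ifname) for net, ifname in RIB if addr in net]
    if not matches:
        return None
    return max(matches, key=lambda m: m[0].prefixlen)[1]


def strict_rpf_accepts(src, arriving_interface):
    """Accept only if the best route back to the source uses the arrival interface."""
    return best_route_interface(src) == arriving_interface


def prefix_acl_accepts(src):
    """Accept any source inside the customer's prefixes, on any interface."""
    addr = ipaddress.ip_address(src)
    return any(addr in net for net in CUSTOMER_PREFIXES)


# A host inside the /25 sending through location B fails the strict check
# but passes the ACL.
print(strict_rpf_accepts("198.51.100.10", "B"), prefix_acl_accepts("198.51.100.10"))
```

With the ACL, the announcement pattern at each location does not matter, only whether the source belongs to the customer.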

AFAIK, you represent one of my upstream providers, so sorry, but saying
your customers have never had problems with your ingress filtering isn't
entirely accurate. Everything works fine now, though.

Best regards,

William_Herrin · March 29, 2013, 6:49pm

I've had some problems with my upstream providers' ingress filtering,
for example:

- Traffic sourced from a prefix announced as a more-specific route at
transit connection in location A got filtered on a transit connection in
location B, where only a greater aggregate was announced.

Yep, I've heard of that. This is very bad behavior on your ISP's part.
Spank them and if need be, name and shame.

- A GRE tunnel anchored in my routers' addresses in the eBGP link
network (part of my provider's address space) stopped working, as my
outbound packets were dropped by the provider's ingress filtering.

Yep, I've encountered that. One of my providers decided that the IP on
the exterior address of my router should not reach the Internet. Bad
behavior. Spank, then name and shame.

- Traceroutes that reach my network through provider A show one
missing hop if my best return path back to the traceroute source is
through provider B, and provider B is doing ingress filtering. This is
because the ICMP TTL/HL exceeded packet is sourced from provider A's
address space (my router's interface address in the eBGP link net).

This is a bug, if you will, in router design. It isn't just traceroute
that's missing a hop; if that router needed to send an ICMP
destination unreachable in support of path MTU detection for some pair
of hosts' TCP, the impacted TCP session would collapse. It gets even
worse if you want to configure a particular router link with RFC1918
addresses.

I've long thought router vendors should introduce a configuration
option to specify the IP address from which ICMP errors are emitted
rather than taking the interface address from which the packet causing
the error was received.
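
A minimal sketch of how such an option might behave, assuming a hypothetical `icmp_error_source` setting (not a real vendor knob) and documentation addresses throughout:

```python
import ipaddress
from dataclasses import dataclass
from typing import Dict, Optional

# Assumed for the example: the customer's own address space.
CUSTOMER_SPACE = ipaddress.ip_network("198.51.100.0/24")


@dataclass
class Router:
    interfaces: Dict[str, str]               # interface name -> address
    icmp_error_source: Optional[str] = None  # hypothetical configuration option

    def icmp_source(self, ingress_interface: str) -> str:
        """Source address for an ICMP error caused by a packet received on ingress_interface."""
        if self.icmp_error_source is not None:
            return self.icmp_error_source
        # Behavior described in the thread: use the receiving interface's address.
        return self.interfaces[ingress_interface]


def provider_b_ingress_accepts(src: str) -> bool:
    """Provider B accepts only sources inside the customer's own space."""
    return ipaddress.ip_address(src) in CUSTOMER_SPACE


# The link to provider A is numbered from provider A's space (192.0.2.0/30 here).
default = Router(interfaces={"to_provider_a": "192.0.2.2", "loop0": "198.51.100.1"})
patched = Router(interfaces=default.interfaces, icmp_error_source="198.51.100.1")

print(provider_b_ingress_accepts(default.icmp_source("to_provider_a")))
print(provider_b_ingress_accepts(patched.icmp_source("to_provider_a")))
```

In this model the default router's ICMP error is dropped by provider B, while the router with the option set emits it from an address that passes the filter.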

Regards,
Bill Herrin

Patrick · March 30, 2013, 3:04am

Concur. An 'ip(v6)? icmp source-interface loop0' sure beats running 'ip
unnumbered loop0' everywhere.

Alejandro_Acosta · March 30, 2013, 3:21am

Hi,

I've long thought router vendors should introduce a configuration
option to specify the IP address from which ICMP errors are emitted
rather than taking the interface address from which the packet causing
the error was received.

Concur. An 'ip(v6)? icmp source-interface loop0' sure beats running 'ip
unnumbered loop0' everywhere.

Why do you think it will be better? Can you explain?
So far I can only see troubleshooting becoming more difficult if this
idea/feature becomes widespread.

I guess it sounds like a practical solution in the scenario where the
output interface cannot reach the Internet; however, the output
interface is surely still reachable inside the provider network.

Thanks,

Alejandro,

William_Herrin · March 30, 2013, 5:07am

Hi Alejandro,

Consider the alternatives:

1. Provide a router configuration option (per router and/or per
interface) to emit ICMP error messages from a specified IP address
rather than the interface address.

2. At every border, kick packets without an Internet-legitimate source
address up to the slow path for network address translation to a
source address which is valid.

3. Design your network so that any router with at least one network
interface whose IP address is not valid on the Internet has exactly
the same MTU on every interface, and at least an MTU of 1500 on all of
them, guaranteeing that the router will never emit a
fragmentation-needed message. And do this consistently. Every time.

4. Redesign TCP so it doesn't rely on ICMP destination unreachable
messages to determine path MTU and get your new design deployed into
every piece of software on the Internet.

5. Accept that TCP will break unexpectedly due to lost
fragmentation-needed messages, presenting as a particularly nasty and
intermittent failure that's hard to track and harder to fix.

Which do you find least offensive?

Regards,
Bill Herrin

Saku_Ytti1 · March 30, 2013, 1:32pm

That sounds like uRPF, which you should not run towards your transit
customers.

I'm talking only about using ACLs, and I stand by my claim that I've never
had to fix anything they broke.

Now naturally it has happened that my customer has gotten a new prefix, and
things have been wonky because they forgot to make a route object, which
meant we accepted the prefix neither in the BGP filter nor in the ACL.
However, I think my customers prefer this. The alternative is that
everything works fine for six months, until the other transit, which does
not BGP filter, goes down, after which the network stops propagating and
everything is down. At least with an ACL you notice the problem immediately.
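
A rough sketch of that workflow, assuming exact-match BGP filtering (the real policy may allow more-specifics) and hypothetical route-object data using documentation prefixes:

```python
import ipaddress


def build_filter(route_objects):
    """Turn the customer's registered route objects into a list of networks."""
    return [ipaddress.ip_network(r) for r in route_objects]


def bgp_accepts(announced, nets):
    """Exact-match prefix filter (an assumption; policies differ)."""
    return ipaddress.ip_network(announced) in nets


def acl_accepts(src, nets):
    """Packet filter: the source must fall inside a registered prefix."""
    addr = ipaddress.ip_address(src)
    return any(addr in net for net in nets)


# The customer starts using 203.0.113.0/24 but forgets the route object.
nets = build_filter(["198.51.100.0/24"])
print(bgp_accepts("203.0.113.0/24", nets), acl_accepts("203.0.113.5", nets))

# Once the route object exists, both filters are regenerated from the same data.
nets = build_filter(["198.51.100.0/24", "203.0.113.0/24"])
print(bgp_accepts("203.0.113.0/24", nets), acl_accepts("203.0.113.5", nets))
```

Because both filters come from the same list, a missing route object shows up as dropped traffic right away rather than as a routing failure months later.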

Jay_Ashworth · March 30, 2013, 3:39pm

Quite a number of people have responded to this post.

But no one's actually addressed my key question:

From: "Jay Ashworth" <jra@baylink.com>

In the current BCP38/DDoS discussions, I've seen a lot of people suggesting
that it's practical to do ingress filtering at places other than the
edge.

My understanding has always been different from that, based on the idea
that the carrier to which a customer connects is the only one with which
that end-site has a business relationship, and therefore (frex), the
only one whom that end-site could advise that they believe they have a
valid reason to originate traffic from address space not otherwise known to
the carrier; jack-leg dual-homing, for example, as was discussed in
still a third thread this week.

Here's the important part:

The edge carrier's *upstream* is not going to know that it's
reasonable for their customer -- the end-site's carrier -- to be originating
traffic with those source addresses, and if they ingress filter based on the
prefixes they route down to that carrier, they'll drop that traffic...

which is not fraudulent, and has a valid engineering reason to exist
and appear on their incoming interface.

An edge carrier, to whom an end-site connects, *can know* that that particular
end-site will need an exemption, for reasons like non-BGP dual-homing or load-
balancing.

But there's no way for an upstream transit carrier to know that *at the present
time*.

So, short of building another big RPKI infrastructure to authenticate it (which
isn't going to happen this decade) or getting lots of carriers to cooperate in
some other information exchange so they know where to poke extra holes (which
isn't going to happen this decade), is there any way at all to actually do
ingress filtering at the transit level -- as many here advocate -- without
throwing out the baby of *valid* unexpected source addressed packets with the
bathwater of *actual* fraud?

IP packets with source addresses that don't match the address space of the interface
you got them on are *not* a 100.0% accurate proxy for fraud and attacks.
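
To make the concern concrete, here is a small sketch under assumed data: the edge carrier holds an exemption it agreed with its customer, while the transit carrier filters only on what it routes down to the edge carrier. All prefixes and names are hypothetical.

```python
import ipaddress


def in_any(src, prefixes):
    """True if src falls inside any of the given prefixes."""
    addr = ipaddress.ip_address(src)
    return any(addr in ipaddress.ip_network(p) for p in prefixes)


# Edge carrier: what it routes to the end-site, plus exemptions the site asked for.
EDGE_ROUTED_TO_SITE = ["203.0.113.0/24"]
EDGE_EXEMPTIONS = ["198.51.100.0/24"]  # e.g. space from the site's other provider

# Transit carrier: knows only what it routes down to the edge carrier.
TRANSIT_ROUTED_TO_EDGE = ["203.0.113.0/24"]


def edge_accepts(src):
    return in_any(src, EDGE_ROUTED_TO_SITE + EDGE_EXEMPTIONS)


def transit_accepts(src):
    return in_any(src, TRANSIT_ROUTED_TO_EDGE)


# Legitimate dual-homed traffic passes the edge but is dropped one level up.
src = "198.51.100.7"
print(edge_accepts(src), transit_accepts(src))
```

The exemption exists only where the business relationship exists, so the same packet is valid at the edge and looks spoofed at the transit.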

Cheers,
-- jra

Saku_Ytti1 · March 30, 2013, 4:34pm

We expect our customers to mark any customers they have in their AS-SET.
And we filter BGP announcements and we ACL traffic based on that.

I know mandating strict IRR is not practical for everyone today. But for me,
it's practical. Sometimes I need to teach customers how to create a route
object or AS-SET.
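
As an illustration only, expanding an AS-SET into the prefixes used for both the BGP filter and the ACL could look like the sketch below. The object names and contents are invented, not real registry data, and real tooling would query an IRR instead of dictionaries.

```python
# Hypothetical IRR contents keyed by object name.
AS_SETS = {
    "AS-CUSTOMER": ["AS64500", "AS-CUSTOMER-DOWNSTREAM"],
    "AS-CUSTOMER-DOWNSTREAM": ["AS64501", "AS-CUSTOMER"],  # deliberate loop
}
ROUTE_OBJECTS = {
    "AS64500": ["198.51.100.0/24"],
    "AS64501": ["203.0.113.0/24"],
}


def expand_as_set(name, seen=None):
    """Recursively expand an AS-SET into member ASNs, guarding against loops."""
    seen = set() if seen is None else seen
    if name in seen:
        return set()
    seen.add(name)
    if name not in AS_SETS:
        return {name}  # a plain ASN
    asns = set()
    for member in AS_SETS[name]:
        asns |= expand_as_set(member, seen)
    return asns


def allowed_prefixes(as_set):
    """Prefixes from the route objects of every ASN reachable from the AS-SET."""
    return sorted(
        prefix
        for asn in expand_as_set(as_set)
        for prefix in ROUTE_OBJECTS.get(asn, [])
    )


print(allowed_prefixes("AS-CUSTOMER"))
```

A customer of the customer that is missing from the AS-SET simply never makes it into either filter.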

At least every non-stubby ASN facing a stubby ASN should be able to do strict
IRR. This is about 6000 networks. Compared to other options:

1) close recursive name servers
  - even if all are closed, the attack vector is virtually the same, as large
    RRs can be found on arbitrary authoritative servers due to DNSSEC
  - snmpbulkwalk
  - UDP du jour

2) implement uRPF at last mile
  - hundreds of millions of ports, many of them running on autopilot, good
    chunk of them will never ever support uRPF

Obviously if we could choose 2) it would be best, but we can't choose it.

Alejandro_Acosta · March 31, 2013, 2:17am

Hi William,
Thanks for your response, my comments below:

I've long thought router vendors should introduce a configuration
option to specify the IP address from which ICMP errors are emitted
rather than taking the interface address from which the packet causing
the error was received.

Concur. An 'ip(v6)? icmp source-interface loop0' sure beats running 'ip
unnumbered loop0' everywhere.

Why do you think it will be better? Can you explain?

Hi Alejandro,

Consider the alternatives:

1. Provide a router configuration option (per router and/or per
interface) to emit ICMP error messages from a specified IP address
rather than the interface address.

I can imagine that, and it sounds terrific. I guess this option
should at least come disabled by default.

2. At every border, kick packets without an Internet-legitimate source
address up to the slow path for network address translation to a
source address which is valid.

IMHO this can be achieved with the current behaviour.

3. Design your network so that any router with at least one network
interface whose IP address is not valid on the Internet has exactly
the same MTU on every interface, and at least an MTU of 1500 on all of
them, guaranteeing that the router will never emit a
fragmentation-needed message. And do this consistently. Every time.

If you have PMTUD enabled, you won't need this every time.

4. Redesign TCP so it doesn't rely on ICMP destination unreachable
messages to determine path MTU and get your new design deployed into
every piece of software on the Internet.

You will have the same problem using only one output interface for
ICMP errors/messages. Of course, based on your comments, you mean you
will need to troubleshoot this interface only once.

5. Accept that TCP will break unexpectedly due to lost
fragmentation-needed messages, presenting as a particularly nasty and
intermittent failure that's hard to track and harder to fix.

Same answer as in 3.

Which do you find least offensive?

None of them is offensive. I think this could be a nice feature to
have, but I hope it's disabled by default.

Regards,
Bill Herrin

Thanks,

Regards,
Alejandro Acosta,

William_Herrin · April 1, 2013, 2:19am

Hi Alejandro,

Also inline.

Hi William,
Thanks for your response, my comments below:

>>>> I've long thought router vendors should introduce a configuration
>>>> option to specify the IP address from which ICMP errors are emitted
>>>> rather than taking the interface address from which the packet causing
>>>> the error was received.
>>>
>>> Concur. An 'ip(v6)? icmp source-interface loop0' sure beats running 'ip
>>> unnumbered loop0' everywhere.
>>
>> Why do you think it will be better? Can you explain?
>
> Hi Alejandro,
>
> Consider the alternatives:
>
> 1. Provide a router configuration option (per router and/or per
> interface) to emit ICMP error messages from a specified IP address
> rather than the interface address.

I can imagine that, and it sounds terrific. I guess this option
should at least come disabled by default.

> 2. At every border, kick packets without an Internet-legitimate source
> address up to the slow path for network address translation to a
> source address which is valid.

IMHO this can be achieved with the current behaviour.

If you don't mind the router crashing. There's too much trash traffic
with bad source addresses, even when no one is under attack. You kick
it up to the main CPU, you overwhelm the router.

> 3. Design your network so that any router with at least one network
> interface whose IP address is not valid on the Internet has exactly
> the same MTU on every interface, and at least an MTU of 1500 on all of
> them, guaranteeing that the router will never emit a
> fragmentation-needed message. And do this consistently. Every time.

If you have PMTUD enabled, you won't need this every time.

Clients effectively always enable path MTU discovery. If the ICMP
error message your router generates when it discovers the MTU problem
comes from an IP address that can't leave your system without being
filtered then path MTU discovery fails absolutely. That path is a
black hole that mysteriously swallows TCP connections.

If you can't prevent mis-addressed ICMP error messages from leaving
your system then you must prevent the conditions under which path MTU
discovery would cause an ICMP error message to be generated.
Practically speaking, that means you guarantee an MTU of at least 1500
bytes on every link and no endpoint MTU over 1500 bytes.
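
The failure mode can be modeled with a toy path, sketched here under simplifying assumptions: every packet has DF set, each hop is described by the address its ICMP errors would come from and the MTU of its outgoing link, and the border drops RFC1918 sources. The addresses and function names are made up.

```python
import ipaddress

RFC1918 = [
    ipaddress.ip_network(n)
    for n in ("10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16")
]


def escapes_border(src):
    """The border filter drops anything sourced from RFC1918 space."""
    addr = ipaddress.ip_address(src)
    return not any(addr in net for net in RFC1918)


def send_df_packet(size, hops):
    """hops: list of (icmp_source_address, outgoing_link_mtu) in path order."""
    for icmp_source, link_mtu in hops:
        if size > link_mtu:
            if escapes_border(icmp_source):
                return ("frag-needed", link_mtu)  # sender learns the MTU
            return ("black-hole", None)           # ICMP error filtered; sender learns nothing
    return ("delivered", None)


# Second router is numbered from RFC1918 space and fronts a 1400-byte link.
PATH = [("198.51.100.1", 1500), ("10.0.0.1", 1400)]

print(send_df_packet(1500, PATH))
print(send_df_packet(1400, PATH))
```

Small packets get through and full-size ones vanish silently, which matches the symptom of connections that open normally and then stall.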

> 4. Redesign TCP so it doesn't rely on ICMP destination unreachable
> messages to determine path MTU and get your new design deployed into
> every piece of software on the Internet.

You will have the same problem using only one output interface for
ICMP errors/messages. Of course, based on your comments, you mean you
will need to troubleshoot this interface only once.

With #4, path MTU discovery no longer relies on ICMP error messages.
The endpoint client would have to reduce the MSS on retransmission and
use the pattern of lost packets and acknowledgements to find the path MTU
on paths with an ICMP black hole.
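
A minimal sketch of the probing idea, loosely in the spirit of packetization-layer path MTU discovery (RFC 4821) but not an implementation of it. The `delivers` callback is a stand-in for "a segment of this size was acknowledged", and the 576 and 1500 bounds are assumptions for the example.

```python
def probe_path_mtu(delivers, low=576, high=1500):
    """Find the largest packet size that gets through, using only loss/ack feedback.

    delivers(size) -> bool stands in for sending a probe and seeing it acknowledged.
    Assumes delivers(low) is True and that delivery is monotonic in size.
    """
    if delivers(high):
        return high
    # Invariant: delivers(low) is True, delivers(high) is False.
    while high - low > 1:
        mid = (low + high) // 2
        if delivers(mid):
            low = mid
        else:
            high = mid
    return low


# A path with a 1400-byte bottleneck that never returns ICMP errors.
print(probe_path_mtu(lambda size: size <= 1400))
```

A real stack would also have to tell congestion loss apart from size-related loss, which is the hard part this sketch leaves out.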

For the sake of robustness, this is something we probably should think
seriously about adding to the TCP protocol. However, that's a long
plan, two decades at least. It isn't something that could deliver a
complete mitigation in the next release of the software, the way
option 1 does.

> 5. Accept that TCP will break unexpectedly due to lost
> fragmentation-needed messages, presenting as a particularly nasty and
> intermittent failure that's hard to track and harder to fix.

Same answer as in 3.

See my response to #3.

Regards,
Bill Herrin