icmp rpf

A smaller North American network provider, with a modest North
American backbone, numbers their internal routers on public IP space
that they do not announce to the world.

One of the largest North American network providers filters/drops
ICMP messages so that they only pass those with a source IP
address that appears in their routing table.

As a result, traceroutes from big.net into small.net have numerous
hops that time out.

Traceroutes from elsewhere that go into small.net but return on
big.net also have numerous hops that time out.
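The interaction described above can be sketched in a few lines. This is only an illustration under stated assumptions: the prefixes are documentation ranges, and the names `BIG_NET_ROUTES`, `passes_loose_rpf` and `traceroute_view` are hypothetical, not anything either network actually runs.

```python
import ipaddress

# Hypothetical routing table for big.net (documentation prefixes only).
BIG_NET_ROUTES = [
    ipaddress.ip_network("192.0.2.0/24"),    # announced customer space in small.net
    ipaddress.ip_network("203.0.113.0/24"),  # some other announced prefix
]

def passes_loose_rpf(source, routes):
    """Loose RPF: accept if any route covers the source address."""
    addr = ipaddress.ip_address(source)
    return any(addr in net for net in routes)

def traceroute_view(hop_sources, routes):
    """What the traceroute user sees for each hop's Time Exceeded reply."""
    return [src if passes_loose_rpf(src, routes) else "*" for src in hop_sources]

# Assumption: small.net numbers its routers out of 198.51.100.0/24,
# which it does not announce, so big.net has no route covering it.
hops = ["203.0.113.1", "198.51.100.5", "198.51.100.9", "192.0.2.80"]
print(traceroute_view(hops, BIG_NET_ROUTES))
```

The two middle hops show up as timeouts even though the routers answered, because their replies are sourced from space that big.net cannot find in its table.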

We do all still think that traceroute is important, don't we?

If so, which of these two nets is unreasonable in their actions/policies?

Please note that we're not talking about RFC1918 space, or reserved IP
space of any kind. Also, think about the scenario where some failure
happens leaving big.net with an incomplete routing table, thus breaking
traceroute when it is perhaps most needed.

Thanks,
-mark

The non-announcers, because they're also breaking PMTUD.

Regards,
Mark.


Mark Kent wrote:

A smaller North American network provider, with a modest North
American backbone, numbers their internal routers on public IP space
that they do not announce to the world.

One of the largest North American network providers filters/drops
ICMP messages so that they only pass those with a source IP
address that appears in their routing table.

As a result, traceroutes from big.net into small.net have numerous
hops that time out.

Traceroutes from elsewhere that go into small.net but return on
big.net also have numerous hops that time out.

We do all still think that traceroute is important, don't we?

If so, which of these two nets is unreasonable in their actions/policies?

Please note that we're not talking about RFC1918 space, or reserved IP
space of any kind. Also, think about the scenario where some failure
happens leaving big.net with an incomplete routing table, thus breaking
traceroute when it is perhaps most needed.

Thanks,
-mark

--------------------------
This is yet another reason one shouldn't rely on pings & traceroutes to
perform reachability analysis.

regards,
/virendra

virendra rode wrote:

This is yet another reason one shouldn't rely on pings & traceroutes to
perform reachability analysis.

So, you're in the "traceroute is not important" camp?
(you'll note that in my email I did ask whether we think
traceroute is important)

Mark Smith wrote:

The non-announcers, because they're also breaking PMTUD.

Really? How? Remember, we're not talking about RFC1918 space,
where there is a BCP that says we should filter it at the edge.
We're talking about public IP space, that just doesn't happen to be
announced outside of a particular AS.

Thanks,
-mark

If the intent is to prevent folks from reaching out and touching random network infrastructure devices directly whilst still allowing traceroute to work, iACLs and/or using IS-IS as one's IGP and null-routing the infrastructure blocks at one's various edges achieves the same effect with less potential for breakage:

http://www.nanog.org/mtg-0405/mcdowell.html

Note that a good infrastructure addressing plan is a prerequisite for both of these methods.
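As a rough sketch of the iACL idea (not the configuration from the linked talk): the infrastructure block stays announced, so replies sourced from it pass a loose RPF check elsewhere, while the edge drops packets addressed to it. The block `198.51.100.0/24` and the function name are assumptions made for illustration.

```python
import ipaddress

# Hypothetical infrastructure block; a single contiguous block is what
# makes a one-line rule like this possible.
INFRA_BLOCK = ipaddress.ip_network("198.51.100.0/24")

def inbound_edge_permits(dst):
    """iACL-style inbound rule at the edge: drop traffic addressed to routers."""
    return ipaddress.ip_address(dst) not in INFRA_BLOCK

# An outside host pinging a router interface directly is dropped at the edge.
print(inbound_edge_permits("198.51.100.5"))
# Transit traffic to a customer address is forwarded as usual.
print(inbound_edge_permits("192.0.2.80"))
```

The rule matches on destination only, so ICMP errors that the routers themselves send outward are not affected by it.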


Mark Kent wrote:

virendra rode wrote:

This is yet another reason one shouldn't rely on pings & traceroutes to
perform reachability analysis.

So, you're in the "traceroute is not important" camp?
(you'll note that in my email I did ask whether we think
traceroute is important)

----------------------------
I'm sure it's important. All I'm saying is that ICMP can get rate-limited
(many times it does), which can lead to packet loss and even
drops while traversing hops.

regards,
/virendra

[Can we all have a moment of silence for a useful, interesting, and on-topic post?]

A smaller North American network provider, with a modest North
American backbone, numbers their internal routers on public IP space
that they do not announce to the world.

One of the largest North American network providers filters/drops
ICMP messages so that they only pass those with a source IP
address that appears in their routing table.

As a result, traceroutes from big.net into small.net have numerous
hops that time out.

Traceroutes from elsewhere that go into small.net but return on
big.net also have numerous hops that time out.

We do all still think that traceroute is important, don't we?

If so, which of these two nets is unreasonable in their actions/policies?

Who said either was?

First: Your network, your rules. Don't expect others to play by your rules.

But more importantly, there is nothing that says two perfectly reasonable, rational "rules" cannot create a problem when intersecting in interesting ways.

But if forced, I'd say Small.Net gets my vote for needing correction. I see less "wrongness" in a network running what is essentially loose RPF than a network who expects supposedly bogon-sourced packets to be forwarded. (One could argue that non-announced space is bogus.)

Just remember, I would only say that if pushed. Normally I would say neither is wrong.

Please note that we're not talking about RFC1918 space, or reserved IP
space of any kind. Also, think about the scenario where some failure
happens leaving big.net with an incomplete routing table, thus breaking
traceroute when it is perhaps most needed.

In such an instance, I would suggest Big.Net will have far, far larger problems than whether pings get returned from prefixes it can't reach anyway.

Hi Mark,

Mark Smith wrote:
>> The non-announcers, because they're also breaking PMTUD.

Really? How? Remember, we're not talking about RFC1918 space,
where there is a BCP that says we should filter it at the edge.
We're talking about public IP space, that just doesn't happen to be
announced outside of a particular AS.

When a router that can't shove a DF'd packet down a link because the
MTU is too small needs to create an ICMP Destination Unreachable, Packet
Too Big, Fragmentation Required, it needs to pick a source IP address
to use for that ICMP packet, which will be one of those assigned to the
router with the MTU problem (I'm fairly sure it's the IP
address assigned to the outgoing interface for this ICMP packet,
although I don't think it probably matters much). If an upstream
router, i.e. on the way back to the sender who needs to resend with a
smaller packet, is dropping these packets because they fail RPF, then
PMTUD breaks. The result might be connection timeouts at the sender, or
possibly after quite a while the sender might try smaller packets and
eventually they'll get through (I think Windows might do this). Either
way, bad end-user experience.

PMTUD as it currently works isn't ideal, as of course there isn't any
guarantee that these ICMP Dest Unreachables will get there even in a
"good" network. However, most of the time it works, whereas in the
scenario you're presenting, it definitely won't.
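A minimal sketch of that failure, assuming a single narrow link and a sender that does no black-hole detection. The function name, the prefixes and the sizes are all made up for illustration.

```python
import ipaddress

def pmtud_outcome(packet_size, link_mtu, icmp_source, upstream_routes):
    """Outcome for one DF-marked packet crossing one narrow link."""
    if packet_size <= link_mtu:
        return "delivered"
    # The router drops the packet and sends an ICMP error back to the sender,
    # sourced from one of its own interface addresses.
    src = ipaddress.ip_address(icmp_source)
    if not any(src in net for net in upstream_routes):
        # An upstream loose RPF check discards the ICMP error, so the
        # sender never learns that it must use a smaller packet.
        return "black hole"
    return f"resent at {link_mtu}"

# Hypothetical upstream routing table (documentation prefixes).
upstream = [ipaddress.ip_network("192.0.2.0/24")]

print(pmtud_outcome(1500, 1400, "192.0.2.1", upstream))     # router in announced space
print(pmtud_outcome(1500, 1400, "198.51.100.1", upstream))  # router in unannounced space
```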

Regards,
Mark.

The non-announcers, because they're also breaking PMTUD.

If you're not sure what benefits PMTUD gives,
you might want to review this page:
http://www.psc.edu/~mathis/MTU/index.html

--Michael Dillon

[ Quotations have been reordered for clarity in the reply ]

If so, which of these two nets is unreasonable in their actions/policies?

I don't think either is *unreasonable* in what they've done. Both actions are prima facie reasonable but have an unforeseen synthesis that is undesirable.

One of the largest North American network providers filters/drops
ICMP messages so that they only pass those with a source IP
address that appears in their routing table.

This is clearly reasonable as part of an effort to mitigate ICMP based network abuse. In fact, I'd argue that it would probably be reasonable to drop any packet, not just ICMP, based on its absence from the routing table - conditioned on having a full, stable routing table.

A smaller North American network provider, with a modest North
American backbone, numbers their internal routers on public IP space
that they do not announce to the world.

Several people have mooted this as good practice on the basis that routers do not need to be reachable (as an end system) except by legitimate managers of those routers (i.e. within the AS in question).

As a result, traceroutes from big.net into small.net have numerous
hops that time out.

Traceroutes from elsewhere that go into small.net but return on
big.net also have numerous hops that time out.

We do all still think that traceroute is important, don't we?

On balance, it's small.net that will have to change to rectify this. My argument would be:

1) Big.net's approach is wholly legitimate - deny spoofed packets transit.

2) Small.net's intentions are good, but they are pseudo-spoofing packets.

ICMP packets will, by design, originate from the incoming interface used by the packet that triggers the ICMP packet. Thus giving an interface an address is implicitly giving that interface the ability to source packets with that address to potentially anywhere in the Internet. If you don't legitimately announce address space then sourcing packets with addresses in that space is (one definition of) spoofing.

On balance both are acting with good intent, but small.net haven't fully seen the consequences for ICMP in their scheme.

Please note that we're not talking about RFC1918 space, or reserved IP
space of any kind. Also, think about the scenario where some failure
happens leaving big.net with an incomplete routing table, thus breaking
traceroute when it is perhaps most needed.

Filtering ICMP is always dangerous. If you are going to do it you *must* understand the consequences both to yourself and to others, and also understand the consequences in both normal situations and all possible failure modes. (If I had a penny for every broken PMTU detection I'd seen because of someone's overeager filtering of ICMP...)

Who thinks it would be a "good idea" to have a knob such that ICMP error messages are always sourced from a certain IP address on a router?

For instance, you could have a "loopback99" which is in an announced block, but filtered at all your borders. Then set "ip icmp error source-interface loopback99" or something. All error messages from a router would come from this address, regardless of the incoming or outgoing interface. Things like PMTUD would still work, and your /30s could be in private space or non-announced space or even imaginary^Wv6 space. :-)

Note I said "error messages", so things like TTL Expired, Port Unreachable, and Can't Fragment would come from here, but things like ICMP Echo Request / Reply pairs would not. Perhaps that should be considered as well, but it is not what I am suggesting here.
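The proposed behaviour might look roughly like the sketch below. The knob does not exist as far as this thread says, so everything here is hypothetical: the function name, the addresses, and the choice to treat ICMP types 3, 11 and 12 as the "error messages" in question.

```python
# ICMP types treated as errors in this sketch (an assumption, not a full list):
# 3 = Destination Unreachable (covers Port Unreachable and Can't Fragment),
# 11 = Time Exceeded, 12 = Parameter Problem.
ICMP_ERROR_TYPES = {3, 11, 12}

def pick_icmp_source(icmp_type, interface_addr, error_source_addr=None):
    """Choose the source address for an ICMP message a router originates."""
    if error_source_addr is not None and icmp_type in ICMP_ERROR_TYPES:
        # Knob set: every error message comes from the one announced address.
        return error_source_addr
    # Echo Reply and everything else keep the usual interface address.
    return interface_addr

loopback99 = "192.0.2.99"   # hypothetical address in an announced block
link_addr = "198.51.100.5"  # hypothetical address in unannounced space

print(pick_icmp_source(11, link_addr, loopback99))  # Time Exceeded
print(pick_icmp_source(0, link_addr, loopback99))   # Echo Reply
```

One visible side effect: every hop inside the network would answer traceroute from the same kind of address, so per-interface detail in the path is lost.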

Obviously there are lots of side effects, and probably unintended consequences I have not considered, but I think the good might outweigh the bad. Or not. Which is why I'm offering it up for suggestion.

(Unless, of course, I get 726384 "you are off-topic" replies, in which case I withdraw the suggestion.)

Is there a BCP for "handling ICMP?"

I'm walking the Cisco certification path and they're quite vocal about
ICMP rate limiting over any kind of filtering on routers/switches.
I haven't read their firewall documentation so I'm not sure what they're
preaching for PIX/ASA.

(Yup, if I had a penny for every PMTU fix-by-unbreaking-ICMP-filtering
I've repaired over the last 10 years..)

Adrian

A smaller North American network provider, with a modest North
American backbone, numbers their internal routers on public IP space
that they do not announce to the world.

One of the largest North American network providers filters/drops
ICMP messages so that they only pass those with a source IP
address that appears in their routing table.

  I would hope they're doing it for more than just ICMP packets.
There are numerous nefarious uses of the network with unrouted/spoofed
address space. Various hosts have done bad things (in the past)
if they get something like a SYN that appears to be from themselves.
Protecting one's customers from spoofed address DoS attacks and leaking
of unrouted IP space (1918 or otherwise) that isn't globally reachable
I would argue should be, or is a current best practice.

  The "good" packets that are dropped in this scenario are
sufficiently limited (yes, pmtu and these cases of traceroutes, etc..)
but there are also well known solutions and workarounds to this as well.
It's still hard to get people to fix their "deny all icmp" policies that
some companies have that create troubles for others. I've had issues
accessing my own bank website in the past due to p-mtu issues. These
aren't places that are easily approachable to resolve the problem in
most cases.

As a result, traceroutes from big.net into small.net have numerous
hops that time out.

  Others have pointed out how this can be resolved by
using different techniques and still protect the infrastructure. It
may be of value for small.net to look at it and see what applies
to them.

Traceroutes from elsewhere that go into small.net but return on
big.net also have numerous hops that time out.

We do all still think that traceroute is important, don't we?

  I agree traceroute is important and valuable. It's one
of the things I have asked people to send me in the past for debugging,
but isn't the sole source of debugging available. Other techniques
can be applied.

  Did big.net just turn this on, or has it been on for months/years
now?

  - jared

Patrick W. Gilmore wrote:

ICMP packets will, by design, originate from the incoming interface used by the packet that triggers the ICMP packet. Thus giving an interface an address is implicitly giving that interface the ability to source packets with that address to potentially anywhere in the Internet. If you don't legitimately announce address space then sourcing packets with addresses in that space is (one definition of) spoofing.

Who thinks it would be a "good idea" to have a knob such that ICMP error messages are always sourced from a certain IP address on a router?

I do. I have suggested much the same in the past.

Jared Mauch wrote:

I would hope they're doing it for more than just ICMP packets.

yes, loose RPF, but I just care about ICMP.

I would argue should be, or is a current best practice.

OK, so I must have missed the memo :-)

Who among AS1239, AS701, AS3356, AS7018, AS209 does loose RPF
(not just strict RPF on single-homed customers)?

Did big.net just turn this on, or has it been on for
months/years now?

I'm pretty sure it's months and not years.
I've noticed it for a while, but it just recently drove me to the
point where I'd complain about it.

Thanks,
-mark

In response to this:

Mark Smith wrote:
>> The non-announcers, because they're also breaking PMTUD.

Really? How?

Mark Smith replied with two paragraphs, but it's not 100% clear to me
that he got the reason why I asked. I asked because his initial statement
boiled down to "numbering on un-announced space breaks PMTUD"...
but it doesn't, not by itself (which he later expanded).

It only does so in the presence of filtering.

I think this is an important point to make because of my interaction
with small.net. When I pointed out the timeouts they said that it was
because they don't announce the router IP addresses, which is true but
not the whole story. I mentioned that some providers in the past
numbered on rfc1918 space and traceroute still worked, so that alone
was not enough.

Then they said "it's because of the asymmetric path," and that also is
true, but again not enough. A large proportion of traffic is
asymmetric, but traceroute still works.

Then they gave me an explanation that rested on the fact that the
routers will not respond to pings because they are unannounced outside
of their world. That too is true, but irrelevant and I told them how
Jacobson's traceroute works and told them that *someone* was
dropping/filtering the return packets and I'd like to know who/why.

They somewhat implied that it was my fault, and this situation was
unique to my net, so I used the big.net looking glass to show how the
same thing happens from space not associated with my network.
(Yes, I should have done this from the outset.)

With that they asked big.net, and big.net said they filtered,
and that's where we are.

My point here is that it took me ten (10) emails with small.net to get
this information partly because the small.net support staff had notions
in their head premised on too simplistic statements like "numbering on
un-announced space breaks PMTUD."

I wanted to clear this up because this list is likely read by support
people at various networks, and it's pretty clear that not all of them
are well versed even on something as thoroughly discussed over the ages
as traceroute.

Thanks,
-mark

Once upon a time, Mark Kent <mark@noc.mainstreet.net> said:

I think this is an important point to make because of my interaction
with small.net. When I pointed out the timeouts they said that it was
because they don't announce the router IP addresses, which is true but
not the whole story. I mentioned that some providers in the past
numbered on rfc1918 space and traceroute still worked, so that alone
was not enough.

Not announcing their router interface IP space is not any type of
security. Anyone directly connected to them (customer or peer) could if
they wish statically route that IP space, and any such security would be
gone. Unless it is otherwise filtered, any customer with a default
route can reach their routers.

Nevertheless, putting the router interface IP addresses for your network
in one specific block is a very effective way to quickly get rid
of a DoS attack on the routers: you simply stop announcing that block, and
everything else on the network still works. Tricks like having a primary
IP address that is not important at all (except for logging traffic
actually destined to it), while a secondary IP on the same interface is
the one really used for interconnection, also work quite well.

Jared Mauch wrote:

I would hope they're doing it for more than just ICMP packets.

yes, loose RPF, but I just care about ICMP.

I would argue should be, or is a current best practice.

OK, so I must have missed the memo :-)

It's been all the rage. :-)

Who among AS1239, AS701, AS3356, AS7018, AS209 does loose RPF
(not just strict RPF on single-homed customers)?

I'm wondering why that is relevant.

If all those ASNs only filtered on ASN instead of prefix for customer announcements (for instance), would that mean no one should?

As a matter of fact, most ICMP-based attacks don't require spoofing of the source IP address. You do have to spoof the addresses in the "original datagram" included in the ICMP payload, though.
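To make the "original datagram" point concrete, here is a small sketch of where those embedded addresses sit in an IPv4 ICMP error message: an 8-byte ICMP header, then the offending packet's IP header and the first 8 bytes of its payload. The helper names and sample addresses are hypothetical, and the checksums are left at zero for brevity.

```python
import socket
import struct

def build_ipv4_header(src, dst, proto=17, total_len=28):
    """Minimal 20-byte IPv4 header (no options, checksum left at zero)."""
    return struct.pack("!BBHHHBBH4s4s",
                       0x45, 0, total_len, 0, 0, 64, proto, 0,
                       socket.inet_aton(src), socket.inet_aton(dst))

def embedded_addresses(icmp_message):
    """Return (icmp_type, original_src, original_dst) from an ICMP error."""
    icmp_type, _code, _checksum, _rest = struct.unpack("!BBHI", icmp_message[:8])
    # Skip the first 12 bytes of the embedded IP header, then read src and dst.
    src, dst = struct.unpack("!12x4s4s", icmp_message[8:28])
    return icmp_type, socket.inet_ntoa(src), socket.inet_ntoa(dst)

# Destination Unreachable (type 3), code 4, quoting a hypothetical datagram.
message = (struct.pack("!BBHI", 3, 4, 0, 0)
           + build_ipv4_header("192.0.2.10", "203.0.113.20")
           + b"\x00" * 8)
print(embedded_addresses(message))
```

A receiving host matches the error to a connection using these embedded fields, not the outer source address, which is why a source-based check like RPF does little against such attacks.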

Kindest regards,