Who does source address validation? (was Re: what's that smell?)

Ok, but real world calling. I have tried this, and when customers find
something doesn't work on your network but it does on your competitor's,
you make it work even if that means breaking rules.

You've snipped the other comments from my email, which go on to say: take
any RFC for a protocol, e.g. POP, SMTP, etc., and look at what's actually
being done with it. Most commonly, look at how Microsoft has implemented it
or what the big ISPs are doing on their servers, and you either toe the
line or your service suffers.

Steve

What services require transport of packets with RFC1918 source addresses across the public network?

I can think of esoteric examples of things it would be possible to do, but nothing that a real-world user might need (or have occasion to complain about).

Do you have experience of such breakage from your own customers? It would be interesting to hear details.

Joe

> Ok, but real world calling. I have tried this, and when customers find
> something doesn't work on your network but it does on your competitor's,
> you make it work even if that means breaking rules.

> What services require transport of packets with RFC1918 source
> addresses across the public network?
>
> I can think of esoteric examples of things it would be possible to do,
> but nothing that a real-world user might need (or have occasion to
> complain about).
>
> Do you have experience of such breakage from your own customers? It
> would be interesting to hear details.

  Loss of ICMP packets generated by links with endpoints numbered in RFC1918
space. Holes in traceroutes, broken PMTU detection.

  DS

Check the archives; it's been covered every time this issue has come up...

   a. Intra-provider links using RFC1918 addresses, and MTU changes/PMTU
discovery
   b. Traceroute TTL-exceeded packets across RFC1918 intra-provider links

People used to have lots of problems with @Home customers trying to access
their websites if they filtered RFC1918 addresses and had servers connected
via large-MTU (i.e. non-Ethernet) links. Ok, so @Home is out of business,
but I'm sure there are other similar cases which would break.

>> Do you have experience of such breakage from your own customers? It
>> would be interesting to hear details.
>
> Loss of ICMP packets generated by links with endpoints numbered in
> RFC1918 space. Holes in traceroutes, broken PMTU detection.

Why do those links have endpoints in RFC1918 space to begin with?

Alex

>>> Such things REALLY _NEEED_ to be broken, and the sooner the better as
>>> then perhaps the offenders will fix such things sooner too, because
>>> they are by definition already broken and in violation of RFC 1918 and
>>> good common sense.
>>
>> Ok, but real world calling. I have tried this, and when customers find
>> something doesn't work on your network but it does on your competitor's,
>> you make it work even if that means breaking rules.
>
> What services require transport of packets with RFC1918 source
> addresses across the public network?

None AFAIK, which is why they should be blocked - on ingress from customer
links. Don't get me wrong: I'm just sharing experience, not ethics, and
saying we should all adhere to the RFC, but if you apply filters that
assume others are also doing so, you may be surprised..
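
(For reference, the customer-edge version of that filter on Cisco gear is
roughly the following; a minimal sketch, with the interface name
illustrative, and strict uRPF assumed available and only sane where the
customer's routing is symmetric:)

interface Serial1/0
 description customer link
 ! drop any packet whose source address fails the reverse-path check,
 ! which catches RFC1918 and other non-routed sources on ingress
 ip verify unicast reverse-path

(Where uRPF isn't available, an explicit ACL denying 10/8, 172.16/12 and
192.168/16 inbound does the same job more verbosely.)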

Without repeating myself or the list archives: it's all very well strictly
following all the RFC guidelines and telling the planet it's Microsoft's or
@Home's fault it's not working, but the customers really don't buy it and
they will go elsewhere. It might not be about corporate $$$s, but those
same $$$s pay your wages, and then it starts to hurt!

> I can think of esoteric examples of things it would be possible to do,
> but nothing that a real-world user might need (or have occasion to
> complain about).

On a related issue (pMTU), I recently discovered that using a link with
MTU < 1500 breaks a massive chunk of the net - specifically mail and web
servers that block all inbound ICMP. The servers assume 1500 and send out
packets with DF set; the packets hit the link and generate an ICMP frag
needed; the ICMP is filtered, and data stops. Culprits included several
major ISPs/telcos... I'd love to tell the customer the link is fine and
it's the rest of the Internet at fault, but in the end I just forced the
DF bit clear as a temporary workaround before finally swapping out to
MTU 1500!
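
(For the curious: on IOS, forcing DF clear is typically done with policy
routing; a rough sketch, interface name illustrative:)

route-map CLEAR-DF permit 10
 ! strip the Don't Fragment bit so the router may fragment to the link MTU
 set ip df 0
!
interface Ethernet0
 ! applies to packets arriving on this interface
 ip policy route-map CLEAR-DF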

> Do you have experience of such breakage from your own customers? It
> would be interesting to hear details.

I did attempt strict ingress filtering at the borders after a DoS some
time ago; I figured I'd disallow any non-public source addresses. I took
it off within a day after a number of customers found a whole bunch of
things had stopped working...

Unfortunately I can't give you an example, as this was a while back and I
don't have the details to hand.

But if anyone with an appreciably sized customer base wants to try
implementing such filters, feel free to forward the customer issues to the
list as references!

Steve

I'm not going to say what I think of these people in order to avoid
another semi-flame fest, but will limit my comments to:

You can also get around this by making the first hop the one with the
lowest MTU. This is no fun for Ethernet-connected stuff, but for dial-up
this is easy. Then this box will announce a smaller TCP MSS when the
connection is established, and there aren't any problems.
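
(On boxes that support it, the same trick can be done on the router
itself; a sketch in IOS terms, with the MSS value and interface name
purely illustrative:)

interface Dialer0
 ! rewrite the MSS option in TCP SYNs crossing this interface, so the
 ! endpoints never build segments bigger than the dial-up link carries
 ip tcp adjust-mss 1400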

Because some administrators are ignorant, clueless, or malicious. We don't
all have the luxury of saying, "It doesn't work on our network and it does on
our competitor's, and we could fix it if we wanted to at no significant harm
to us, but we won't because we are in the right."

  DS

>>>> Do you have experience of such breakage from your own customers? It
>>>> would be interesting to hear details.
>>>
>>> Loss of ICMP packets generated by links with endpoints numbered in
>>> RFC1918 space. Holes in traceroutes, broken PMTU detection.
>>
>> Why do those links have endpoints in RFC1918 space to begin with?
>>
>> Alex
>
> Because some administrators are ignorant, clueless, or malicious. We don't
> all have the luxury of saying, "It doesn't work on our network and it does
> on our competitor's, and we could fix it if we wanted to at no significant
> harm to us, but we won't because we are in the right."

In that case you should not complain about 1918 space being used for,
say, attacking you either. After all, it does work on the network of your
competitors.

Alex

My personal pet peeve is the opposite - we'll try to use pMTU, some provider
along the way sees fit to run it through a tunnel, so the MTU there is 1460
instead of 1500 - and the chuckleheads number the tunnel endpoints out of
1918 space - so the 'ICMP Frag Needed' gets tossed at our border routers,
because we do both ingress and egress filtering. It's bad enough when all
the interfaces on the offending unit are 1918-space, but it's really annoying
when the critter has perfectly good non-1918 addresses it could use as
the source... Argh...

Or equivalently, just nail the MSS for off-site connections down to 512,
and accept that you have to send three times as many packets as you
probably should. As far as I can tell, when pMTU *does* work because all
parties concerned actually use reasonable addresses and don't filter
'icmp frag needed', you end up with one of three results most of the time:

1) You get a clear 1500 end-to-end.
2) You get an MTU of 1460 because of tunneling.
3) You end up ratcheted down to 576 because of some ancient IP stack
someplace (older versions of end-user SLIP/PPP are famous for this).

That's not terribly hard to overcome: allow ICMP unreachables (from any
source) in your ACL, then deny all traffic from RFC 1918 addresses, then
the rest of the ACL.

Combined with CAR (or CatOS QoS rate limiting) on ICMP, you end up with
all the functionality and almost none of the bogus traffic.
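
Something like this, roughly (IOS; names, numbers and rates are purely
illustrative):

ip access-list extended BORDER-IN
 ! order matters: let the unreachables in first...
 permit icmp any any unreachable
 ! ...then drop everything sourced from RFC 1918 space...
 deny   ip 10.0.0.0 0.255.255.255 any
 deny   ip 172.16.0.0 0.15.255.255 any
 deny   ip 192.168.0.0 0.0.255.255 any
 ! ...then the rest of your policy
 permit ip any any
!
access-list 150 permit icmp any any
interface Serial0/0
 ip access-group BORDER-IN in
 ! CAR on ICMP so the unreachable exemption can't be abused
 rate-limit input access-group 150 256000 8000 8000 conform-action transmit exceed-action drop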

Amazingly enough, although there's a number of offenders in the 1918-numbered
tunnel category, we decided it was easier to just not worry about talking to
those providers' victi^H^H^H^H^Hcustomers(*). We got tired of watching all the
DDoS-backscatter ICMP that *also* shows up with 1918 addresses on it. When
those show up, it means that some provider didn't filter whoever was forging
our address *AND* some provider wasn't filtering the 1918-sourced ICMP. The
fact that it's probably two different providers is enough to make you give up
trying to do something nice for the net and just go have too many beers
instead. ;)

/Valdis

(*) The problem usually tends to be self-correcting - the host that got bit
the most was our Listserv machine - and if outbound mail got hosed up for
TOO long, it would bounce, the victim would get unsubscribed, and no more
problems - at least till they manage to resubscribe. Life got much nicer
once I made sure the "You must now confirm your subscription" message was
long enough to always trigger a 'frag needed'. ;)

Ah, but what if the traffic is coming into you, i.e. originating
elsewhere? It seems in that case the originator blocks the necessary
ICMPs, and they then fail to send data to you. My example where I saw this
recently was inbound SMTP traffic.

Steve

CAR should not be used for rate limiting; instead, use the MQC police
command, which basically does the same thing. CAR is not going to be
around much longer and is no longer being developed:

Have a look at:
http://www.cisco.com/warp/public/105/cbpcar.html
http://www.cisco.com/univercd/cc/td/doc/product/software/ios122/122cgcr/fqos_c/fqcprt8/qcfmcli2.htm
for more information.
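
In MQC terms, the CAR line sketched earlier would become something like
this (again, class names and rates illustrative):

class-map match-all ICMP
 match access-group 150
!
policy-map LIMIT-ICMP
 class ICMP
  ! police ICMP to 256 kbps, drop the excess
  police 256000 8000 8000 conform-action transmit exceed-action drop
!
interface Serial0/0
 service-policy input LIMIT-ICMP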

-Hank

Ok, I know how this manages to rile people up, but might I suggest that
you brought it upon yourself?

There is a time and a place for messages sourced from addresses to which
you cannot reply, and a time and place where those messages should not
exist. Obviously, a DNS *QUERY* is not the place for a message which
cannot be returned. But what about an ICMP *RESPONSE*? Nothing depends
upon the source address of the IP header for operation; the original
headers which caused the problem are encoded in the ICMP message.

And yet people are so busy concerning themselves with this mythical "thing
which might break from receiving ICMP overlapping existing internal 1918
space", the extra 0.4% of bandwidth which might be wasted, and the
righteous feeling that they have done something useful, that they don't
stop to realize *THEY* are the ones breaking PMTU-D.

I'm sure we can all agree on at least the concept that sourcing packets
from an address which cannot receive a reply is at least potentially
useful, for example to avoid DoS against a critical piece of
infrastructure. Would it make people feel better if there were a specific
separate non-routed address space reserved for "router-generated messages
which don't want replies"? Why?

Even Windows 2000+ includes blackhole detection which will eventually
remove the DF bit if packets aren't getting through and ICMP messages
aren't coming back, something many Unixes lack. But the heart of the
problem is that people still push packets as if every one must carry the
maximum data the MTU can support. Do we have any idea how much "network
suffering" is being caused by that damn 1500 number right now? Aside from
the fact that it is one of the worst numbers possible for the data, it
throws a major monkey wrench into the use of tunnels, PPPoE, etc. Eventually
we will realize the way to go is something like "4096 data octets, plus
some room for headers", on a 4470 MTU link. But if the best reason we can
come up with is ISIS, the IEEE will just keep laughing.

</rant>

> Even Windows 2000+ includes blackhole detection which will eventually
> remove the DF bit if packets aren't getting through and ICMP messages
> aren't coming back, something many Unixes lack.

Wow, now I'm impressed. And what about the 1999 other versions of Windows?
This is hardly a new problem. Still, it's good that some people at least
make progress, even if very slowly.

> But the heart of the problem is that people still push packets as if
> every one must carry the maximum data the MTU can support.

And why not?

> Do we have any idea how much "network suffering" is being caused by
> that damn 1500 number right now? Aside from the fact that it is one of
> the worst numbers possible for the data, it throws a major monkey
> wrench into the use of tunnels, PPPoE, etc.

So don't use those.

> Eventually we will realize the way to go is something like "4096 data
> octets, plus some room for headers", on a 4470 MTU link.

So what then if someone runs a secure tunnel over wireless over PPPoE
over ADSL using Mobile IPv6 that runs over a tunnel or two, ad nauseam,
until the headers get bigger than 374 bytes? Then you'll have your problem
right back. Might as well really solve it the first try.

One of the problems is that there is no generally agreed-on and widely
available set of rules for this stuff. Setting the DF bit on all packets
isn't good, but it works. Using RFC1918 space to number your tunnel
routers isn't good, but it works. Filtering invalid source addresses on
ingress is good, but hey, it doesn't work!

Making a good list of best practices (and then have people widely
implement them) might also go a long way towards showing concerned parties
such as the US administration that the network community consists of
responsible people that can work together for the common good.

> But if the best reason we can come up with is ISIS, the IEEE will just
> keep laughing.

Why is the IEEE laughing?

> So what then if someone runs a secure tunnel over wireless over PPPoE
> over ADSL using Mobile IPv6 that runs over a tunnel or two, ad nauseam,
> until the headers get bigger than 374 bytes? Then you'll have your
> problem right back. Might as well really solve it the first try.

  This is a problem that would be solved by everyone being
responsible and doing pmtud properly.

> One of the problems is that there is no generally agreed-on and widely
> available set of rules for this stuff. Setting the DF bit on all packets
> isn't good, but it works. Using RFC1918 space to number your tunnel
> routers isn't good, but it works. Filtering invalid source addresses on
> ingress is good, but hey, it doesn't work!

  I think we're starting to get at the heart of the problem,
but let me stick my neck out and say it:

  Registries (APNIC, ARIN, RIPE, etc.) charge for IP addresses.
Be it via a lease or registration fee, it's a per-IP charge that ISPs
must recover by some means from their subscribers (unless people
don't care about money, that is). Back in the "days", one could
obtain IP addresses from the InterNIC saying "I will not connect
to the Internet", "I intend to connect at some later date in a
year or two (or similar)", or "I intend to connect now".

  People number out of 1918 space primarily for a few
reasons, be they good or not:

  1) Internal use
  2) Cost involved: nobody else needs to telnet to my p2p
links but me, and I don't want to pay {regional_rir} for my
internal use, to reduce costs
  3) "security" of not being a "publicly" accessible
network.

  This can break many things: PMTU, multicast, and various
streaming (multi)media applications.

  With the past scare of "we'll be out of IP addresses by 199x"
still fresh in some people's memories, they in good conscience decided
to also conserve IPs via this method.

  The problem is that not everyone today who considers themselves
a network operator understands all the ramifications of their current
practices, be they good or bad.

  Going into fantasy-land mode: if IPv6 addresses were instantly
used by everyone, people could once again obtain IPs that could be
used for internal private use yet remain globally unique, therefore
allowing tracking back of who is leaking their own internal sources.

> Making a good list of best practices (and then have people widely
> implement them) might also go a long way towards showing concerned
> parties such as the US administration that the network community
> consists of responsible people that can work together for the common
> good.

  I agree here. I personally think that numbering your internal
links out of 1918 space is not an acceptable solution unless it's
behind your NATted network/firewall and does not leak out.

  Perhaps some of the better/brighter folks out there want
to start writing up a list of "networking best practices".

  Then test those "book smart" CCIE/CNE types on the information
to ensure they understand the ramifications. A few good whitepapers
about these might be good to include or quiz folks on. I suspect
there's only a handful of people who actually understand the complete
end-to-end problem and all the ramifications involved, as it is quite
complicated.

> > But if the best reason we can come up with is ISIS, the IEEE will
> > just keep laughing.
>
> Why is the IEEE laughing?

  The implication is that the IEEE will not change the 802.x specs
to allow a larger [default] link-local MTU due to legacy interop
issues. Imagine your circa-1989 NE2000 card attempting to process
a 4400-byte frame on your local LAN. A lot of the "cheap" Ethernet
cards don't include enough buffering to handle such a large frame,
let alone the legacy issues involved... and remember, the enterprise
networks have a far larger number of Ethernet interfaces deployed
than the entire Internet combined * 100, at least. Any change
to the spec would obviously affect them also.

  - jared

Traffic consists of more than TCP; setting your MTU low might get your TCP
traffic delivered, but it won't help inbound traffic using other protocols.

MTU discrepancies must be dealt with in at least one of the following ways
if you don't want them to lead to fatally dropped packets:

1. Fragmentation must work. This applies to systems that don't use PMTUD
or that use blackhole detection. (Some folks think it a good "security"
practice to drop fragments! Some NAT boxes don't know what to do with
fragments when they arrive out of order, especially a non-initial
fragment before the first.)

2. PMTUD must work. (The minimal filter exemption for this is sketched
after this list.)

3. PMTUD blackhole detection must be used with operable fragmentation. (If
you have to fall back to this you're likely to suffer significant
performance hits.)
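
For item 2 in particular, the minimum filter exemption is ICMP type 3
code 4 ("fragmentation needed and DF set"); in IOS ACL terms, something
like this sketch (name illustrative):

ip access-list extended FW-IN
 ! never drop frag-needed messages, or PMTUD dies silently
 permit icmp any any packet-too-big
 ! ...the rest of the policy follows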

Tony Rall

[People using RFC 1918 addresses for routers that terminate tunnels, which
breaks path MTU discovery when RFC 1918 source addresses are filtered
elsewhere.]

> People number out of 1918 space primarily for a few reasons, be they
> good or not:
>
> 1) Internal use
> 2) Cost involved: nobody else needs to telnet to my p2p links but me,
> and I don't want to pay {regional_rir} for my internal use, to reduce
> costs

So use IP unnumbered.
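
(I.e., borrow the address of a loopback instead of burning a subnet per
p2p link; a sketch, with addresses and interface names illustrative:)

interface Loopback0
 ip address 192.0.2.1 255.255.255.255
!
interface Serial0/0
 ! the p2p link borrows the loopback's public address, so any ICMP the
 ! router generates here is sourced from replyable space
 ip unnumbered Loopback0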

  3) "security" of not being a "publicly" accessible
network.

Well, then they get more security than they bargained for if their network
becomes inaccessible...

> With the past scare of "we'll be out of IP addresses by 199x" still
> fresh in some people's memories, they in good conscience decided to
> also conserve IPs via this method.

From where I'm sitting, getting IP addresses is largely a matter of
spending some time and energy, but after a while you get them. It seems
this is different for other people. For instance, an ISP here in NL gave
their premium ADSL customers a few addresses when they first started
offering the service, but later offered those customers a free ADSL
router if they returned the addresses. So obviously there must have been
a pretty big incentive for getting the address space back.

Another problem with numbering router links is that you need to break up
your address blocks. This is extremely annoying and wasteful.

> The problem is that not everyone today who considers themselves a
> network operator understands all the ramifications of their current
> practices, be they good or bad.

Very true.

> Going into fantasy-land mode: if IPv6 addresses were instantly used by
> everyone, people could once again obtain IPs that could be used for
> internal private use yet remain globally unique, therefore allowing
> tracking back of who is leaking their own internal sources.

Ok, quick question: how do I number my point-to-point links in IPv6:

1. /64
2. /126
3. /127
4. IP unnumbered
5. just link-local addresses

I hate to say it, but I don't think IPv6 is ready for prime time yet.
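
(To make the options concrete, here's roughly what each would look like in
IOS-style config; the documentation-prefix addresses are placeholders, and
whether a given release even accepts some of these is part of the problem:)

! 1. a whole /64 per link
interface Serial0/0
 ipv6 address 2001:DB8:0:1::1/64
! 2./3. a /126 or /127
interface Serial0/0
 ipv6 address 2001:DB8:0:1::1/126
! 4. unnumbered
interface Serial0/0
 ipv6 unnumbered Loopback0
! 5. link-local only
interface Serial0/0
 ipv6 enable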

> Making a good list of best practices (and then have people widely
> implement them) might also go a long way towards showing concerned parties
> such as the US administration that the network community consists of
> responsible people that can work together for the common good.

> I agree here. I personally think that numbering your internal links
> out of 1918 space is not an acceptable solution unless it's behind
> your NATted network/firewall and does not leak out.

Agree.

> Perhaps some of the better/brighter folks out there want to start
> writing up a list of "networking best practices".

I've started on a list of BGP best practices recently. When I think it's
ready I'll post a link. If anyone has anything to contribute before then
(even just (constructive) criticism), mail me off-list.

> > But if the best reason we can
> > come up with is ISIS, the IEEE will just keep laughing.

> Why is the IEEE laughing?

> The implication is that the IEEE will not change the 802.x specs to
> allow a larger [default] link-local MTU due to legacy interop issues.

So? We don't stick to IEEE 802.3 anyway...

> Imagine your circa-1989 NE2000 card attempting to process a 4400-byte
> frame on your local LAN. A lot of the "cheap" Ethernet cards don't
> include enough buffering to handle such a large frame, let alone the
> legacy issues involved...

4400 bytes on a 1989 card? You are being _very_ optimistic to even take
the trouble of saying that doesn't work. Many of today's 100 Mbit cards
(and that's not just the $10 ones) can't even handle the 1504 bytes
needed for 802.1q VLAN tags.

I have to side with the IEEE here: simply changing the spec isn't an
option, since none of the 10 Mbps stuff will handle it, very little of the
100 Mbps stuff and not even all of the 1000 Mbps stuff. (I once complained
to a vendor about this. They sent us new GE interfaces. Those did 64k
frames...)

Having a larger-than-1500-byte MTU in backbones would be very good,
because then you have some room to work with when adding extra headers. A
good solution for this would be a neighbor MTU discovery protocol. Maybe
ARPv2? Then boxes with different MTUs could live together on the same
wire, and doing more than 1500 bytes over an Ethernet-based public
exchange point wouldn't be a problem.