IPv6 day and tunnels

Well, IPv6 day isn't here yet, and my first casualty is the browser on the wife's machine: Firefox is now configured to not query AAAA.

Now www.facebook.com loads again.

Looks like a tunnel mtu issue. I have not as of yet traced the definitive culprit, who is (not) sending ICMP too big, who is (not) receiving them, etc.

www.arin.net works and worked for years. www.facebook.com stopped June 1.

So IPv6 fixes the fragmentation and MTU issues of IPv4 by how exactly?

Or was the fix incorporating the breakage into the basic design?

In IPv4 I can make tunneling just work nearly all of the time. So I have to munge a tcp mss header, or clear a df-bit, or fragment the encapsulated packet when all else fails, but at least the tools are there. And on the host, /proc/sys/net

In IPv6, it seems my options are a total throwback, with the best one turning the sucker off. Nobody (on that station) needs it anyways.

Joe

#1 don't tunnel unless you really need to.

#2 see #1

#3 use happy eyeballs, http://tools.ietf.org/html/rfc6555, Chrome has
a good implementation, but this does not solve MTU issues.

#4 MSS hacks work at the TCP layer and still work regardless of IPv4 or IPv6.

#5 According to the IETF, MSS hacks do not exist and neither do MTU
issues http://www.ietf.org/mail-archive/web/v6ops/current/msg12933.html
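
On #4, the arithmetic behind an MSS clamp is simple enough to sketch. This assumes a 40-byte IPv6 header and a 20-byte TCP header with no extension headers or TCP options (either would shrink the payload further). The function name is made up for illustration.

```python
IPV6_HEADER = 40  # fixed IPv6 header, bytes
IPV4_HEADER = 20  # IPv4 header without options, bytes
TCP_HEADER = 20   # TCP header without options, bytes

def clamp_mss(path_mtu, ipv6=True):
    """Largest TCP MSS that fits in one packet of size path_mtu."""
    ip_header = IPV6_HEADER if ipv6 else IPV4_HEADER
    return path_mtu - ip_header - TCP_HEADER

print(clamp_mss(1280))              # 1220
print(clamp_mss(1500, ipv6=False))  # 1460
```

A device rewriting the MSS option in SYN packets to such a value keeps TCP segments under the tunnel MTU without relying on ICMP, but it does nothing for non-TCP traffic.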

PSA time: Please use http://test-ipv6.com/ and pass this good advice
around to the people you know.

Thanks,

Cameron

Joe Maimon wrote:

Looks like a tunnel mtu issue. I have not as of yet traced the
definitive culprit, who is (not) sending ICMP too big, who is (not)
receiving them, etc.

The culprit is the v6 tunnel, which wanders into v4 ipsec/gre tunnels, which means the best fix is ipv6 mtu 1280 on the tunnels, and possibly on the hosts. PMTUD works fine, just comes up with the wrong answer.

1280, the new 1500.
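
A rough sketch of why stacked tunnels shrink the usable MTU. The overhead figures are assumptions for the plain cases only: 20 bytes for an outer IPv4 header without options, and 4 more for a basic GRE header without key or sequence number. IPsec is left out because its overhead depends on mode and cipher.

```python
IPV6_MIN_MTU = 1280  # minimum link MTU that IPv6 requires

# Per-layer encapsulation overhead in bytes (assumed, no options).
OVERHEAD = {
    "6in4": 20,  # outer IPv4 header
    "gre": 24,   # outer IPv4 header (20) + basic GRE header (4)
}

def inner_mtu(link_mtu, layers):
    """MTU left for the innermost packet after each encapsulation layer."""
    return link_mtu - sum(OVERHEAD[name] for name in layers)

print(inner_mtu(1500, ["6in4"]))                         # 1480
print(inner_mtu(1500, ["gre", "6in4"]))                  # 1456
print(inner_mtu(1500, ["gre", "6in4"]) >= IPV6_MIN_MTU)  # True
```

If the v6 tunnel endpoint does not know about the extra layers underneath it, it will advertise a larger MTU than the path can carry, which is one way to end up with the "wrong answer" described above.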

Cameron Byrne wrote:

#1 don't tunnel unless you really need to.

Tunnels are ipv4 only now?

#2 see #1

#3 use happy eyeballs, RFC 6555 - Happy Eyeballs: Success with Dual-Stack Hosts, Chrome has
a good implementation, but this does not solve MTU issues.

Because the initial connections are made just fine.
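
For reference, a minimal sketch of asking a client library for Happy Eyeballs behaviour, here with Python's asyncio (3.8 or later). The host and the helper name are placeholders, and asyncio follows the later RFC 8305 rather than RFC 6555; the 0.25 s delay is the value its documentation suggests.

```python
import asyncio

async def connect_dual_stack(host, port=443, delay=0.25):
    """Open a TCP connection, racing address families Happy Eyeballs style."""
    reader, writer = await asyncio.open_connection(
        host, port, happy_eyeballs_delay=delay
    )
    peer = writer.get_extra_info("peername")  # address that won the race
    writer.close()
    await writer.wait_closed()
    return peer

# Usage (needs network access):
# print(asyncio.run(connect_dual_stack("www.example.com")))
```

This only decides which address family completes the handshake. The SYN and SYN-ACK are small, so a connection over a tunnel with a broken path MTU still connects and then stalls on the first full-size segment.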

PMTUD with probing should work, but does not seem to. Probably a (lack of) deployment issue.

#4 MSS hacks work at the TCP layer and still work regardless of IPv4 or IPv6.

But the equipment needs to support it. Again IPv6 lags.

#5 According to the IETF, MSS hacks do not exist and neither do MTU
issues [v6ops] 6204 bis and mtu

Thanks for that. I expect soon tunnels won't either.

PSA time: Please use http://test-ipv6.com/ and pass this good advice
around to the people you know.

Excellent site/tool.

Thanks,

Cameron

Thank you.

Joe

It doesn't fix the fragmentation issues. It assumes working PMTU.

For what it's worth, I also use a tunnel without issue to reach www.facebook.com via IPv6, with an MTU of 1476 (since it's running over a 1492-byte IPv4 PPPoE tunnel...).

If facebook isn't working for you over a tunnel, and other sites are,
complain to the site.

If they don't let through ICMPv6 PTB then the site needs to add
"route change -inet6 change -mtu 1280" or equivalent to every box.

This isn't rocket science. If you choose to break PMTU discovery
then you can take the necessary steps to avoid requiring that PMTU
Discovery works. This is practical for IPv6. For IPv4 it is
impractical to do the same.

The IPv6 Advanced Socket API even has controls so that you can make
the PMTUD choice on a per socket basis.
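
A minimal sketch of that per-socket control, assuming the platform exposes the RFC 3542 option IPV6_USE_MIN_MTU. Not every system defines it (Linux builds often do not), so the helper below, whose name is made up, reports whether the request was accepted instead of assuming it.

```python
import socket

def prefer_min_mtu(sock):
    """Ask the stack to send at the IPv6 minimum MTU (1280) on this socket.

    Returns True if the option was set, False if the platform lacks it.
    """
    opt = getattr(socket, "IPV6_USE_MIN_MTU", None)
    if opt is None:
        return False  # constant not defined on this platform
    try:
        # 1 = always use the minimum MTU, so PMTUD is not needed
        sock.setsockopt(socket.IPPROTO_IPV6, opt, 1)
    except OSError:
        return False
    return True

# Usage:
# s = socket.socket(socket.AF_INET6, socket.SOCK_STREAM)
# prefer_min_mtu(s)
```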

Mark

actually, to be safe, 1220.

/bill

[..]

  actually, to be safe, 1220.

That will work really well with the minimum IPv6 MTU being 1280 :wink:

Greets,
Jeroen

[snip]

#5 According to the IETF, MSS hacks do not exist and neither do MTU
issues [v6ops] 6204 bis and mtu

They couldn't be more wrong. MTU issues still exist, and not just
with tunnelling,
but tunneling should be an expected scenario for IP.

The protocol IPv6 still handles it very poorly, by still requiring
external ICMP messages,
through the unreliable PMTUD scheme, matters are as bad if not worse
than with IPv4.

It's just so unfortunate that IPv6 couldn't provide a good solution
to one of IP's more troublesome deficiencies.

[snip]

#5 According to the IETF, MSS hacks do not exist and neither do MTU
issues [v6ops] 6204 bis and mtu

They couldn't be more wrong. MTU issues still exist, and not just
with tunnelling,
but tunneling should be an expected scenario for IP.

The protocol IPv6 still handles it very poorly, by still requiring
external ICMP messages,
through the unreliable PMTUD scheme, matters are as bad if not worse
than with IPv4.

As ICMPv6 is an integral part of IPv6 how exactly is ICMP "external"?
You do realize what the function of ICMP is I hope?

If one is so stupid to just block ICMP then one should also accept that one loses functionality.

If the people in the IETF would have decided to inline the headers that are ICMPv6 into the IPv6 header then there for sure would have been people who would have blocked the equivalent of PacketTooBig in there too. As long as people can block stuff they will block stuff that they should not have blocked; there is nothing the IETF can do about it. Stupidity exists behind the keyboard.

That said, pMTU discovery has worked awesomely in the 10+ years that I have been actively using IPv6. If it does not work for you, find the issue and resolve it. (tracepath is a great tool for this, btw.)

It's just so unfortunate that IPv6 couldn't provide a good solution
to one of IP's more troublesome deficiencies.

Did you ever bother to comment about your supposed issue in the IETF?

Greets,
Jeroen

Joe Maimon wrote:

So IPv6 fixes the fragmentation and MTU issues of IPv4 by how exactly?

Completely wrongly.

Or was the fix incorporating the breakage into the basic design?

Yes.

Because IPv6 requires ICMP packet too big generated against
multicast, it is designed to cause ICMP implosions, which
means ISPs must filter ICMP packet too big at least against
multicast packets and, as distinguishing them from unicast
ones is not very easy, often against unicast ones.

For further details, see my presentation at APNIC32:

  http://meetings.apnic.net/32/program/apops
  How Path MTU Discovery Doesn't work
  Masataka Ohta

In IPv4 I can make tunneling just work nearly all of the time. So I have
to munge a tcp mss header, or clear a df-bit, or fragment the
encapsulated packet when all else fails, but at least the tools are
there. And on the host, /proc/sys/net

FYI, IETF is trying to inhibit clearing DF bit explicitly with

  draft-ietf-intarea-ipv4-id-update-05.txt
  >> IPv4 datagram transit devices MUST NOT clear the DF bit.

which is now under the last call.

            Masataka Ohta

If one is so stupid to just block ICMP then one should also accept that one
loses functionality.

ICMP tends to get blocked by firewalls by default; There are
legitimate reasons to block ICMP, esp w V6. Security device
manufacturers tend to indicate all the "lost functionality" is
optional functionality not required for a working device.

If the people in the IETF would have decided to inline the headers that are
ICMPv6 into the IPv6 header then there for sure would have been people who
would have blocked the equivalent of PacketTooBig in there too. As long as

Over reliance on "PacketTooBig" is a source of the problem; the idea
that too large packets should be blindly generated under ordinary
circumstances, carried many hops, and dropped with an error returned a
potentially long distance that the sender in each direction is
expected to see and act upon, at the expense of high latency for both
peers, during initial connection establishment.

Routers don't always know when a packet is too big to reach their next
hop, especially in case of Broadcast traffic, so they don't know to
return a PacketTooBig error, especially in the case of L2 tunneling
PPPoE for example, there may be a L2 bridge on the network in between
routers with a lower MRU than either of the router's immediate links,
eg because PPP, 802.1p,q + MPLS labels, or other overhead are affixed
to Ethernet frames, somewhere on the switched path between routers.

The problem is not that "Tunneling is bad"; the problem is the IP
protocol has issues. The protocol should be designed so that there
will not be issues with tunnelling or different MRU Ethernet links.

The real solution is for reverse path MTU (MRU) information to be
discovered between L3 neighbors by L2 probing, and discovered MRU
exchanged using NDP, so routers know the lowest MRU on each directly
connected interface, then for the worst case reduction in reverse path
MTU to be included in the routing information passed via L3 routing
protocols both IGPs and EGPs to the next hop.
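
To make the proposal concrete: it amounts to carrying a running minimum along with each route advertisement. The sketch below is purely illustrative and does not describe any existing routing protocol feature; the function names are invented.

```python
def advertise(received_mtu, link_mtu):
    """MTU a router would pass on: the worst case seen so far."""
    return min(received_mtu, link_mtu)

def route_mtu(link_mtus):
    """Worst-case MTU for a route crossing the given links in order."""
    mtu = link_mtus[0]
    for link in link_mtus[1:]:
        mtu = advertise(mtu, link)
    return mtu

print(route_mtu([1500, 9000, 1476, 1500]))  # 1476
```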

That is, no router should be allowed to enter a route into its
forwarding table, until the worst case reverse MTU is discovered, to
reach that network, with the exception, that a device may be
configured with a default route, and some directly connected networks.

The need for "Too Big" messages is then restricted to nodes connected
to terminal networks. And there should be no such thing as packet
fragmentation.

Joe Maimon wrote:

So IPv6 fixes the fragmentation and MTU issues of IPv4 by how exactly?

Completely wrongly.

Got a better solution? :wink:

Or was the fix incorporating the breakage into the basic design?

Yes.

Because IPv6 requires ICMP packet too big generated against
multicast, it is designed to cause ICMP implosions, which
means ISPs must filter ICMP packet too big at least against
multicast packets and, as distinguishing them from unicast
ones is not very easy, often against unicast ones.

I do not see the problem that you are seeing. To address the two issues in your slides:
- for multicast just set your max packetsize to 1280, no need for pmtu and thus this "implosion"
    You think might happen. The sender controls the packetsize anyway and one does not want
    to frag packets for multicast thus 1280 solves all of it.

- when doing IPv6 inside IPv6 the outer path has to be 1280+tunneloverhead, if it is not then
    you need to use a tunneling protocol that knows how to frag and reassemble as is acting as a
    medium with an mtu less than the minimum of 1280
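
The second point is just arithmetic. A small check, assuming plain IPv6-in-IPv6 with a 40-byte outer header and no extension headers (the helper name is invented):

```python
IPV6_MIN_MTU = 1280  # every IPv6 link must carry at least this
IPV6_HEADER = 40     # outer IPv6 header added by the tunnel, bytes

def outer_path_ok(outer_path_mtu, overhead=IPV6_HEADER):
    """True if a minimum-size inner packet crosses the outer path whole."""
    return outer_path_mtu >= IPV6_MIN_MTU + overhead

print(outer_path_ok(1500))  # True
print(outer_path_ok(1300))  # False, since 1280 + 40 = 1320 > 1300
```

When this returns False, the tunnel itself has to fragment and reassemble, because the inner hosts are entitled to send 1280-byte packets.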

Greets,
Jeroen

If one is so stupid to just block ICMP then one should also accept that one
loses functionality.

ICMP tends to get blocked by firewalls by default

Which firewall product does that?

; There are
legitimate reasons to block ICMP, esp w V6.

The moment one decides to block ICMPv6 one is likely breaking features of IPv6, so choose wisely. There are several RFCs pointing out what one could block and what one must never block. Packet Too Big is a very well known one that one should not block.

If you decide to block anyway then, well, it is your problem that your network breaks.

  Security device
manufacturers tend to indicate all the "lost functionality" is
optional functionality not required for a working device.

I suggest that you vote with your money and choose a different vendor if they shove that down your throat. Upgrading braincells is another option though :wink:

If the people in the IETF would have decided to inline the headers that are
ICMPv6 into the IPv6 header then there for sure would have been people who
would have blocked the equivalent of PacketTooBig in there too. As long as

Over reliance on "PacketTooBig" is a source of the problem; the idea
that too large packets should be blindly generated under ordinary
circumstances, carried many hops, and dropped with an error returned a
potentially long distance that the sender in each direction is
expected to see and act upon, at the expense of high latency for both
peers, during initial connection establishment.

High latency? You do realize that it is only one roundtrip max that might happen and that there is no shorter way to inform your side of this situation?
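
A toy model of the exchange being argued about, under simplifying assumptions: each link has a fixed MTU, the router in front of the first too-small link answers with a PTB carrying that link's MTU, and the sender retries at that size. Names are invented for the sketch.

```python
def discover_path_mtu(link_mtus, first_size, ptb_delivered=True):
    """Return (path_mtu, ptb_count), or (None, 0) if PTBs never arrive.

    link_mtus: MTU of each link from sender to receiver, in order.
    """
    size = first_size
    ptbs = 0
    while True:
        # First link along the path that cannot carry the packet.
        bottleneck = next((m for m in link_mtus if m < size), None)
        if bottleneck is None:
            return size, ptbs  # packet fits every link
        if not ptb_delivered:
            return None, 0     # black hole: the sender never learns
        size = bottleneck      # PTB reports the next-hop MTU
        ptbs += 1

print(discover_path_mtu([1500, 1480, 1500], 1500))         # (1480, 1)
print(discover_path_mtu([1500, 1480, 1280], 1500))         # (1280, 2)
print(discover_path_mtu([1500, 1480], 1500, False))        # (None, 0)
```

With a single bottleneck the cost is one PTB round trip. A path with successively smaller links costs one per step, and a filtered PTB leaves the sender retransmitting the same oversized packet.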

Routers don't always know when a packet is too big to reach their next
hop, especially in case of Broadcast traffic,

You do realize that IPv6 does not have the concept of broadcast, don't you?! :wink:

There is only: unicast, multicast and anycast
(and anycast is just unicast as it is a routing trick)

so they don't know to
return a PacketTooBig error, especially in the case of L2 tunneling
PPPoE for example, there may be a L2 bridge on the network in between
routers with a lower MRU than either of the router's immediate links,
eg because PPP, 802.1p,q + MPLS labels, or other overhead are affixed
to Ethernet frames, somewhere on the switched path between routers.

If you have a broken L2 network there is nothing that an L3 protocol can do about it.
Please configure it properly; stuff tends to work better that way.

The problem is not that "Tunneling is bad"; the problem is the IP
protocol has issues. The protocol should be designed so that there
will not be issues with tunnelling or different MRU Ethernet links.

There is no issue as long as you properly respond with PtB and process them when received.
If your medium is <1280 then your medium has to solve the fragging of packets.

The real solution is for reverse path MTU (MRU) information to be
discovered between L3 neighbors by L2 probing, and discovered MRU
exchanged using NDP, so routers know the lowest MRU on each directly
connected interface, then for the worst case reduction in reverse path
MTU to be included in the routing information passed via L3 routing
protocols both IGPs and EGPs to the next hop.

You do realize that NDP only works on the local link and not further?! :wink:

Also, carrying MTU and full routing info to end hosts is definitely not something a lot of operators would like to do let alone see in their networks. Similar to you not wanting ICMP in your network even though that is the agreed upon standard.

That is, no router should be allowed to enter a route into its
forwarding table, until the worst case reverse MTU is discovered, to
reach that network, with the exception, that a device may be
configured with a default route, and some directly connected networks.

If you want this in your network just configure it everywhere to 1280 and then process and answer PtBs on the edge. Your network, your problem that you will never use jumbo frames.

The need for "Too Big" messages is then restricted to nodes connected
to terminal networks. And there should be no such thing as packet
fragmentation.

The fun thing is though that this Internet thing is quite a bit larger than your imaginary network...

Greets,
Jeroen

If one is so stupid to just block ICMP then one should also accept that one
loses functionality.

ICMP tends to get blocked by firewalls by default; There are
legitimate reasons to block ICMP, esp w V6. Security device
manufacturers tend to indicate all the "lost functionality" is
optional functionality not required for a working device.

If you feel the need to block ICMP (I'm not convinced this is an actual need),
then you should do so very selectively in IPv6.

Blocking Packet Too Big messages in particular is definitely harmful in IPv6, and
PMTU-D is _NOT_ optional functionality.

Any firewall/security device manufacturer that says it is will not get any
business from me (or anyone else who considers their requirements
properly before purchasing).

If the people in the IETF would have decided to inline the headers that are
ICMPv6 into the IPv6 header then there for sure would have been people who
would have blocked the equivalent of PacketTooBig in there too. As long as

Over reliance on "PacketTooBig" is a source of the problem; the idea
that too large packets should be blindly generated under ordinary
circumstances, carried many hops, and dropped with an error returned a
potentially long distance that the sender in each direction is
expected to see and act upon, at the expense of high latency for both
peers, during initial connection establishment.

Actually, this generally will NOT affect initial connection establishment and,
due to slow start, usually adds a very small amount of latency about 3-5 KB
into the conversation.

Routers don't always know when a packet is too big to reach their next
hop, especially in case of Broadcast traffic, so they don't know to
return a PacketTooBig error, especially in the case of L2 tunneling
PPPoE for example, there may be a L2 bridge on the network in between
routers with a lower MRU than either of the router's immediate links,
eg because PPP, 802.1p,q + MPLS labels, or other overhead are affixed
to Ethernet frames, somewhere on the switched path between routers.

That is a misconfiguration of the routers. Any routers in such a circumstance
need their interface configured for the lower MTU or things are going to break
with or without ICMP Packet Too Big messages because even if you didn't
have the DF bit, the router has no way to know to fragment the packet.

An L2 device should not be fragmenting L3 packets.

The problem is not that "Tunneling is bad"; the problem is the IP
protocol has issues. The protocol should be designed so that there
will not be issues with tunnelling or different MRU Ethernet links.

And there are not issues so long as things are configured correctly.
Misconfiguration will cause issues no matter how well the protocol
is designed. The problem you are describing so far is not a problem
with the protocol, it is a problem with misconfigured devices.

The real solution is for reverse path MTU (MRU) information to be
discovered between L3 neighbors by L2 probing, and discovered MRU
exchanged using NDP, so routers know the lowest MRU on each directly
connected interface, then for the worst case reduction in reverse path
MTU to be included in the routing information passed via L3 routing
protocols both IGPs and EGPs to the next hop.

This could compensate for some amount of misconfiguration, but you're
adding a lot of overhead and a whole bunch of layering violations in
order to do it. I think it would be much easier to just fix the configuration
errors.

That is, no router should be allowed to enter a route into its
forwarding table, until the worst case reverse MTU is discovered, to
reach that network, with the exception, that a device may be
configured with a default route, and some directly connected networks.

I don't see how this would not cause more problems than you claim it
will solve.

The need for "Too Big" messages is then restricted to nodes connected
to terminal networks. And there should be no such thing as packet
fragmentation.

There should be no such thing as packet fragmentation in the current
protocol. What is needed is for people to simply configure things
correctly and allow PTB messages to pass as designed.

Owen

An L2 device should not be fragmenting L3 packets.

Layer 2 fragmentation used to be a common thing (20+ years ago) with bridged topologies like token-ring to Ethernet source-routing. Obviously, not so much anymore (at least I hope not), but it can and does happen.

I think part of the problem is that ISPs, CDNs, hosting companies, etc. have assumed IPv6 is just IPv4 with longer addresses and haven't spent the time learning the differences, such as the point made above that ICMPv6 is a required protocol for IPv6 to work correctly. MTU issues are an annoyance with IPv4 but are a brokenness with IPv6. Knowledge will come, but it may take a bit of beating over the head for a while.

Jeroen Massar wrote:

So IPv6 fixes the fragmentation and MTU issues of IPv4 by how exactly?

Completely wrongly.

Got a better solution? :wink:

IPv4 without PMTUD, of course.

Because IPv6 requires ICMP packet too big generated against
multicast, it is designed to cause ICMP implosions, which
means ISPs must filter ICMP packet too big at least against
multicast packets and, as distinguishing them from unicast
ones is not very easy, often against unicast ones.

I do not see the problem that you are seeing, to address the two
issues in your slides:
  - for multicast just set your max packetsize to 1280, no
    need for pmtu and thus this "implosion"

It is a sender of a multicast packet, not you as some ISP,
who set max packet size to 1280B or 1500B.

You can do nothing against a sender who consciously (not
necessarily maliciously) set it to 1500B.

The only protection is not to generate packet too big and
to block packet too big at least against multicast packets.

If you don't want to inspect packets so deeply (beyond first
64B, for example), packet too big against unicast packets
are also blocked.

That you don't enable multicast in your network does not
mean you have nothing to do with packet too big against
multicast, because you may be on a path of returning ICMPs.
That is, you should still block them.

     You think might happen. The sender controls the packetsize
     anyway and one does not want
     to frag packets for multicast thus 1280 solves all of it.

That's what I said in IETF IPv6 WG more than 10 years ago, but
all the other WG members insisted on having multicast PMTUD,
ignoring the so obvious problem of packet implosions.

Thus, RFC2463 requires:

   Sending a Packet Too Big Message makes an exception to one
   of the rules of when to send an ICMPv6 error message, in that
   unlike other messages, it is sent in response to a packet

Jeroen Massar wrote:

So IPv6 fixes the fragmentation and MTU issues of IPv4 by how exactly?

Completely wrongly.

Got a better solution? :wink:

IPv4 without PMTUD, of course.

We are (afaik) discussing IPv6 in this thread, I assume you typo'd here :wink:

Because IPv6 requires ICMP packet too big generated against
multicast, it is designed to cause ICMP implosions, which
means ISPs must filter ICMP packet too big at least against
multicast packets and, as distinguishing them from unicast
ones is not very easy, often against unicast ones.

I do not see the problem that you are seeing, to address the two
issues in your slides:
- for multicast just set your max packetsize to 1280, no
   need for pmtu and thus this "implosion"

It is a sender of a multicast packet, not you as some ISP,
who set max packet size to 1280B or 1500B.

If a customer already miraculously has the rare capability of sending multicast packets in the rare case that a network is multicast enabled then they will also have been told to use a max packet size of 1280 to avoid any issues when it is expected that some endpoint might have that max MTU.

I really cannot see the problem with this as multicast networks tend to be rare and very much closed. Heck, for that matter the m6bone is currently pretty much in a dead state for quite a while already.... :frowning:

You can do nothing against a sender who consciously (not
necessarily maliciously) set it to 1500B.

Of course you can, the first hop into your network can generate a single PtB and presto the issue becomes a problem of the sender. As the sender's intention is likely to reach folks they will adhere to that advice too instead of just sending packets which get rejected at the first hop.

The only protection is not to generate packet too big and
to block packet too big at least against multicast packets.

No need, as above, reject and send PtB and all is fine.

If you don't want to inspect packets so deeply (beyond first
64B, for example), packet too big against unicast packets
are also blocked.

Routing (forwarding packets) is in no way "inspection".

That you don't enable multicast in your network does not
mean you have nothing to do with packet too big against
multicast, because you may be on a path of returning ICMPs.
That is, you should still block them.

Blocking returning ICMPv6 PtB, where you are looking at the original packet which is echoed inside the data of the ICMPv6 packet, would indeed require one to look quite deep, but if one is so determined to firewall them, well, then you would indeed have to.

I do not see a reason to do so though. Please note that the src/dst of the packet itself is unicast even if the PtB will be for a multicast packet.

I guess one should not be so scared of ICMP, there are easier ways to overload a network. Proper BCP38 goes a long way.

    You think might happen. The sender controls the packetsize
    anyway and one does not want
    to frag packets for multicast thus 1280 solves all of it.

That's what I said in IETF IPv6 WG more than 10 years ago, but
all the other WG members insisted on having multicast PMTUD,
ignoring the so obvious problem of packet implosions.

They did not ignore you, they realized that not everybody has the same requirements. With the current spec you can go your way and break pMTU, requiring manual 1280 settings, while other networks can use pMTU in their networks. Everybody wins.

So, you should assume some, if not all, of them still insist
on using multicast PMTUD to make multicast packet size larger
than 1280B.

As networks become more and more jumbo frame enabled, what exactly is the problem with this?

In addition, there should be malicious guys.

- when doing IPv6 inside IPv6 the outer path has to be
   1280+tunneloverhead, if it is not then

Because PMTUD is not expected to work,

You assume it does not work, but as long as per the spec people do not filter it, it works.

you must assume MTU
of outer path is 1280B, as is specified "simply restrict
itself to sending packets no larger than 1280 octets" in
RFC2460.

While for multicast enabled networks that might hit the minimum MTU this might be true-ish, it does not make it universally true.

    you need to use a tunneling protocol that knows how to
    frag and reassemble as is acting as a
    medium with an mtu less than the minimum of 1280

That's my point in my second last slide.

Then you word it wrongly. It is not the problem of IPv6 that you chose to layer it inside so many stacks that the underlying medium cannot transport packets bigger than 1280; that medium has to take care of it.

Considering that many inner packet will be just 1280B long,
many packets will be fragmented, as a result of stupid attempt
to make multicast PMTUD work, unless you violate RFC2460
to blindly send packets a little larger than 1280B.

Your statement only works when:
- you choose a medium unable to carry packets of at least 1280 bytes,
   which makes the medium IPv6 incapable; fragmenting is the medium's issue
- someone filters ICMP PtB even though one should not
- in the rare case, on top of the above, someone actually uses interdomain multicast

I hope you see how much of a non-issue this thus is.

Please fix your network instead, kthx.

Greets,
Jeroen

Unfortunately many technology people seem to have the idea, "If I don't understand it, it's a hacker" when it comes to network traffic. And often they don't understand ICMP (or at least PMTU). So anything not understood gets blocked. Then there is the Law of HTTP...

The Law of HTTP is pretty simple: Anything that isn't required for *ALL* HTTP connections on day one of protocol implementation will never be able to be used universally.

This includes, sadly, PMTU. If reaching all possible endpoints is important to your application, you better do it via HTTP and better not require PMTU. It's also why protocols typically can't be extended today at any layer other than the "HTTP" layer.

As for the IETF trying to not have people reset DF...good luck with that one...besides, I think there is more broken ICMP handling than there are paths that would allow a segment to bounce around for 120 seconds...

He is comparing & contrasting with the behavior of IPv4 v IPv6.

If your PMTU is broken for v4 because people do wholesale blocks of ICMP, there is a chance they will have the same problem with wholesale blocks of ICMPv6 packets.

The interesting thing about IPv6 is it's "just close enough" to IPv4 in many ways that people don't realize all the technical details. People are still getting it wrong with IPv4 today, they will repeat their same mistakes in IPv6 as well.