IPv6 "bloat"

So out of the current discussions a lot of people have claimed that ipv6 is bloated or suffers from second system syndrome, etc. So I decided to look at a linux kernel (HEAD I assume) and look at the differences between the v6 and v4 directories. I just crudely did a line count as a quick measure:

ipv6: 68k lines

ipv4: 97k lines

ipv4 looks to have the tcp and udp implementations (35k) so backing that out it is about 62k lines. That's pretty comparable. Linux has full routing capability so the kernel implements it for both.

So I'm just not getting where this "bloat" is. 10% growth for a second system syndrome seems almost miraculously good, imo.
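For anyone who wants to repeat the measurement, here is a rough sketch of the same crude count. It assumes the comparison is between the `net/ipv4` and `net/ipv6` directories of a kernel checkout and that only `.c` and `.h` files are counted; both are assumptions, not something the figures above state.

```python
import os


def count_lines(root, suffixes=(".c", ".h")):
    """Crude measure: count every line in every matching file under root."""
    total = 0
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            if name.endswith(suffixes):
                with open(os.path.join(dirpath, name), "rb") as fh:
                    total += sum(1 for _ in fh)
    return total


def growth(v6_lines, v4_lines, transport_lines):
    """Size of v6 relative to v4 once the TCP/UDP code is backed out of v4."""
    return v6_lines / (v4_lines - transport_lines) - 1.0


# Figures quoted above, in thousands of lines: 68k vs (97k - 35k).
print(f"{growth(68, 97, 35):.1%}")

# Against a real tree (paths are assumptions):
#   count_lines("linux/net/ipv6"), count_lines("linux/net/ipv4")
```

With the quoted numbers this works out to a bit under 10%, which matches the estimate above.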

What am I missing? This is in complete agreement with my intuition 30 years ago that it was no big deal, at least from a software standpoint.

Mike

It has "features" which are at a minimum problematic and at a maximum show stoppers for network operators.

IPv6 seems like it was designed to be a private network communication stack, and how an ISP would use and distribute it was a second thought.

> It has "features" which are at a minimum problematic and at a maximum show stoppers for network operators.

> IPv6 seems like it was designed to be a private network communication stack, and how an ISP would use and distribute it was a second thought.

What might those be? And it doesn't seem to be a show stopper for a lot of very large carriers.

Mike

Primarily the ability to end-to-end authenticate end devices. The primary and largest glaring issue is that DHCPv6 from the client does not include the MAC address, it includes the (I believe) UUID.

We have to sniff the packets to figure out the MAC so that we can authenticate the client and/or assign an IP address to the client properly.

It depends how you're managing the network. If you're running PPPoE you can encapsulate in that. But PPPoE is very 1990 and has its own set of problems. For those running encapsulated traffic, authentication to the modem MAC via DHCP becomes broken. And thus far, I have not seen a solution offered to it.

Secondly - and less importantly to deployment, IPv6 also provides a layer of problematic tracking for advertisers. Whereas before many devices were behind a PAT, now every device has a unique ID -- probably for the life of the device. Marketers can now pinpoint down not just to an IP address that identifies a single NAT interface, but each individual device. This is problematic from a data collection standpoint.

> It has "features" which are at a minimum problematic and at a maximum show stoppers for network operators.

> IPv6 seems like it was designed to be a private network communication stack, and how an ISP would use and distribute it was a second thought.

> What might those be? And it doesn't seem to be a show stopper for a lot of very large carriers.

> Primarily the ability to end-to-end authenticate end devices. The primary and largest glaring issue is that DHCPv6 from the client does not include the MAC address, it includes the (I believe) UUID.

> We have to sniff the packets to figure out the MAC so that we can authenticate the client and/or assign an IP address to the client properly.

> It depends how you're managing the network. If you're running PPPoE you can encapsulate in that. But PPPoE is very 1990 and has its own set of problems. For those running encapsulated traffic, authentication to the modem MAC via DHCP becomes broken. And thus far, I have not seen a solution offered to it.

I was honestly more interested in the bloat angle, but this sounds like a backend problem of your own making most likely. But I'm not motivated to see if it's actually the case or just a misunderstanding.

> Secondly - and less importantly to deployment, IPv6 also provides a layer of problematic tracking for advertisers. Whereas before many devices were behind a PAT, now every device has a unique ID -- probably for the life of the device. Marketers can now pinpoint down not just to an IP address that identifies a single NAT interface, but each individual device. This is problematic from a data collection standpoint.

I guess you've not heard of privacy addresses. Or DHCPv6.
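To make the tracking point concrete, here is a small sketch contrasting the two cases. A SLAAC address built from a modified EUI-64 interface identifier carries the MAC in every address the host forms, while a privacy-style address uses an identifier unrelated to the hardware. The prefix is the documentation prefix and the MAC is just an example value; `random_interface_id` is a simplified stand-in using plain random bits, not the temporary-address algorithm from the RFCs.

```python
import ipaddress
import secrets


def eui64_interface_id(mac: str) -> int:
    """Modified EUI-64: flip the universal/local bit, splice ff:fe into the MAC."""
    octets = bytes.fromhex(mac.replace(":", ""))
    if len(octets) != 6:
        raise ValueError("expected a 48-bit MAC address")
    iid = bytes([octets[0] ^ 0x02]) + octets[1:3] + b"\xff\xfe" + octets[3:]
    return int.from_bytes(iid, "big")


def random_interface_id() -> int:
    """Simplified stand-in for a temporary address: 64 random bits."""
    return secrets.randbits(64)


def address_in(prefix: str, iid: int) -> ipaddress.IPv6Address:
    """Combine a /64 prefix with a 64-bit interface identifier."""
    net = ipaddress.IPv6Network(prefix)
    if net.prefixlen != 64:
        raise ValueError("SLAAC expects a /64 prefix")
    return ipaddress.IPv6Address(int(net.network_address) | iid)


stable = address_in("2001:db8::/64", eui64_interface_id("68:fe:f7:07:11:6f"))
print(stable)  # 2001:db8::6afe:f7ff:fe07:116f, MAC recoverable from the low 64 bits
print(address_in("2001:db8::/64", random_interface_id()))  # differs on every run
```

The first form is the "unique ID for the life of the device" concern; the second is what privacy addresses are meant to avoid.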

Mike

I misspoke... it's not UUID... It's DUID.

This isn't a backend management issue. This is a protocol issue. The MAC of the interface needs to be sent with a DHCP request so that it can be properly authenticated to the physical device.

As long as the client and DHCPv6 server are on the same network interface -- it all works fine. However, when you relay that information, you now lose the MAC address information.

Further, because the MAC is disconnected in IPv6 it becomes more difficult to make the connection between IPs on a dual-stack client.

Everyone prints the MAC (a unique ID on devices and device packaging). Almost nobody prints the DUID on a device, so how do you pre-populate your DHCP server? I can see that it encourages "one interface per network" and so encourages bonding, bridging or whatever, but is being able to differentiate the interfaces of a host really so bad? I can't help but feel that it would have been nice for DHCPv6 to send DUID and MAC.

> Primarily the ability to end-to-end authenticate end devices. The
> primary and largest glaring issue is that DHCPv6 from the client does
> not include the MAC address, it includes the (I believe) UUID.

DHCPv6 Option 79

https://datatracker.ietf.org/doc/html/rfc6939
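As a rough illustration of what that option gives a server, the sketch below pulls the client link-layer address out of a Relay-Forward message. It assumes the standard layout (1-octet message type 12, 1-octet hop count, two 16-octet addresses, then options as 2-octet code / 2-octet length / value) and that the relay has inserted option 79 at the top level of its own message. The function names are made up for the example.

```python
import struct

RELAY_FORW = 12                      # DHCPv6 Relay-Forward message type
OPTION_CLIENT_LINKLAYER_ADDR = 79    # RFC 6939
RELAY_HEADER_LEN = 1 + 1 + 16 + 16   # msg-type, hop-count, link-address, peer-address


def iter_options(buf: bytes):
    """Yield (code, value) for each option in a DHCPv6 options area."""
    offset = 0
    while offset + 4 <= len(buf):
        code, length = struct.unpack_from("!HH", buf, offset)
        offset += 4
        value = buf[offset:offset + length]
        if len(value) != length:
            raise ValueError("truncated option")
        yield code, value
        offset += length


def client_link_layer_address(msg: bytes):
    """Return (hardware_type, address) from option 79, or None if absent."""
    if len(msg) < RELAY_HEADER_LEN or msg[0] != RELAY_FORW:
        return None
    for code, value in iter_options(msg[RELAY_HEADER_LEN:]):
        if code == OPTION_CLIENT_LINKLAYER_ADDR and len(value) > 2:
            (hw_type,) = struct.unpack_from("!H", value)
            return hw_type, value[2:].hex(":")
    return None


# Hand-built sample: empty addresses plus option 79 carrying an Ethernet MAC.
mac = bytes.fromhex("68fef707116f")
sample = bytes([RELAY_FORW, 0]) + bytes(32) + struct.pack("!HHH", 79, 2 + len(mac), 1) + mac
print(client_link_layer_address(sample))  # (1, '68:fe:f7:07:11:6f')
```

Whether a given relay actually adds the option is a deployment question; the parsing side is this simple.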

Thanks, I didn’t think that they’d do something that interfered with AAA. Using a MAC address as authentication seems sort of sketch to me in the first place.

Mike

On a public network (such as WiFi - sure). On a private network where the only authentication taking place is to the modem which is provided by the service provider, not so much. It's a closed environment. The modem demarcs to the end-user and the end-user never touches the switching fabric.

Interesting about DHCPv6 Option 79. I had not run across that before. I will look into that more. Thank you.

DHCPv6 includes the DHCP Unique Identifier (DUID). DUID can be any one of several things.

By far, the most common ones actually do include the MAC address.

Some systems allow you to choose which type of DUID they supply.

Macs use a long string that includes the MAC address at the end:
(an excerpt from a static host entry in dhcpv6d.conf for a Mac host:
  host-identifier option dhcp6.client-id 00:01:00:01:23:d6:92:16:68:fe:f7:07:11:6f;
  hardware ethernet 68:fe:f7:07:11:6f;
)

Some hosts don’t provide the MAC address, but they provide a device unique identifier which is equally useful for authentication, frankly.
For example, a Raritan KVM:
  host-identifier option dhcp6.client-id 00:02:00:00:35:ae:31:49:54:39:41:30:30:31:34:38 ;

HP Printers provide yet another format of DUID:
        host-identifier option dhcp6.client-id 00:01:00:01:01:e2:85:23:b8:db:ad:ba:db:ad ;

It’s a little more awkward than DHCPv4, but once you get used to it, it’s really not so bad. It’s a slight challenge for providing hosts reserved addresses, but otherwise, it’s just larger fields in the log entries.
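For reading those client-id strings, a minimal decoder sketch follows. It assumes the standard DUID layouts: type 1 (DUID-LLT) is hardware type, timestamp, link-layer address; type 2 (DUID-EN) is an enterprise number plus a vendor-chosen identifier; type 3 (DUID-LL) is hardware type plus link-layer address. `parse_duid` is a name invented for this example.

```python
def parse_duid(text: str) -> dict:
    """Decode a colon-separated DUID string into its documented fields."""
    raw = bytes.fromhex(text.replace(":", "").strip())
    if len(raw) < 2:
        raise ValueError("DUID too short")
    duid_type = int.from_bytes(raw[:2], "big")
    if duid_type == 1:  # DUID-LLT: link-layer address plus time
        return {
            "kind": "DUID-LLT",
            "hw_type": int.from_bytes(raw[2:4], "big"),
            "time": int.from_bytes(raw[4:8], "big"),
            "lladdr": raw[8:].hex(":"),
        }
    if duid_type == 2:  # DUID-EN: enterprise number plus opaque identifier
        return {
            "kind": "DUID-EN",
            "enterprise": int.from_bytes(raw[2:6], "big"),
            "identifier": raw[6:],
        }
    if duid_type == 3:  # DUID-LL: link-layer address only
        return {
            "kind": "DUID-LL",
            "hw_type": int.from_bytes(raw[2:4], "big"),
            "lladdr": raw[4:].hex(":"),
        }
    return {"kind": f"type {duid_type}", "data": raw[2:]}


# The three client-ids quoted above.
print(parse_duid("00:01:00:01:23:d6:92:16:68:fe:f7:07:11:6f"))
print(parse_duid("00:02:00:00:35:ae:31:49:54:39:41:30:30:31:34:38"))
print(parse_duid("00:01:00:01:01:e2:85:23:b8:db:ad:ba:db:ad"))
```

Run over the three entries above, the first and third decode as DUID-LLT with an Ethernet address at the end, and the second as DUID-EN with an ASCII identifier, so a reservation tool can recover the MAC whenever the client uses the LLT or LL form.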

Owen

Hi,

> I misspoke... it's not UUID... It's DUID.

> This isn't a backend management issue. This is a protocol issue. The
> MAC of the interface needs to be sent with a DHCP request so that it can
> be properly authenticated to the physical device.

> As long as the client and DHCPv6 server are on the same network
> interface -- it all works fine. However, when you relay that
> information, you now lose the MAC address information.

RFC 6939 solved that a long time ago.
See also: "Is RFC 6939 Support Finally Here? Checking the Implementation of the 'Client Link Layer Address Option' in DHCPv6" on Insinuator.net

> Further, because the MAC is disconnected in IPv6 it becomes more
> difficult to make the connection between IPs on a dual-stack client.

Not sure if I agree with that either. That connection can be made by various other means, see

cheers

Enno

It seems sketchy to me to even retain client MAC information, no? Genuine question.

Didn’t we go to a distinct unique identifier system for this very reason?

Am I in the 1990s here or?

We’re just handing out addresses to UEs and things seem to work fine. For me personally, I find the notation of v6 to be very unaesthetic, so I tend to just conceal it from myself now.

-LB

Ms. Lady Benjamin PD Cannon of Glencoe, ASCE

> It seems sketchy to me to even retain client MAC information, no? Genuine question.

> Didn’t we go to a distinct unique identifier system for this very reason?

> Am I in the 1990s here or?

> We’re just handing out addresses to UEs and things seem to work fine. For me personally, I find the notation of v6 to be very unaesthetic, so I tend to just conceal it from myself now.

Correct me if I’m wrong, but it sounds like you’re viewing this from a service provider perspective, in which case everything you’ve said is basically correct.

However, the enterprise world is very different. Right, wrong, or otherwise, many enterprises feel a strong compulsion to have very strict control over addressing and relatively direct accountability of “x address = y employee” (regardless of whether that’s actually true or not).

In those environments, yes, IPv6 does present a learning curve and some additional challenges. They are not insurmountable and if you were starting from scratch needing to build your enterprise on IPv6, it would actually be less difficult than IPv4. IPv4, however, has the advantage of well trodden paths for enterprise solutions in this space. IPv6 will get there as more enterprises start to deploy IPv6, but right now both sides of that process suffer from the classic chicken/egg problem. The problem is slow to get solved because there are no chickens asking for a solution. Chickens aren’t asking for it because there are no eggs to create chickens that need IPv6.

Owen

This is going to be one of the big things agencies will need in order to meet the US federal government's IPv6-only benchmarks. Solutions and products are going to have to mature quickly for agencies to hit 80% IPv6-only by end of FY25.

Michael Thomas wrote:

> So out of the current discussions a lot of people have claimed that ipv6 is bloated or suffers from second system syndrome, etc.

IPv6 optional header chain, even after it was widely recognized
that IPv4 options are useless/harmful and were deprecated is an
example of IPv6 bloat.

Extensive use of link multicast for nothing is another example
of IPv6 bloat. Note that IPv4 works without any multicast.

> So I decided to look at a linux kernel (HEAD I assume) and look at the differences between the v6 and v4 directories.

See above. That is an improper way to evaluate IPv6 bloat.

An example of second system syndrome of over-engineering
without bloat is various timing parameters specified
for ND, even though timing requirements are different
depending on link types, which means there can be no
standard timing parameters applicable to all the link
types.

Another example of over-engineering is SLAAC to
*statefully* maintain address configuration state
in fully distributed way only to promote
inconsistencies requiring DAD.

An example of under-engineering is lack of the
following consideration of rfc791:

     The number 576 is selected to allow a reasonable sized data block to
     be transmitted in addition to the required header information. For
     example, this size allows a data block of 512 octets plus 64 header
     octets to fit in a datagram. The maximal internet header is 60
     octets, and a typical internet header is 20 octets, allowing a
     margin for headers of higher level protocols.

As IPv6 optional headers can be arbitrarily long, it is not
guaranteed that a 512B DNS message can be sent over UDP over
IPv6.

And there are a lot more.

            Masataka Ohta

Michael Thomas wrote:

> So out of the current discussions a lot of people have claimed that ipv6 is bloated or suffers from second system syndrome, etc.

> IPv6 optional header chain, even after it was widely recognized
> that IPv4 options are useless/harmful and were deprecated is an
> example of IPv6 bloat.

> Extensive use of link multicast for nothing is another example
> of IPv6 bloat. Note that IPv4 works without any multicast.

Yes, but IPv6 works without any broadcast. At the time IPv6 was being developed, broadcasts were rather inconvenient and it was believed that ethernet switches (which were just beginning to be a thing then) would facilitate more efficient capabilities by making extensive use of link multicast instead of broadcast.

Guess what, we are again bad at predicting the future. You have no choice when developing something but to make the best guess about what will happen from the information available at the time. Turns out multicast was arguably a wrong guess, but all indications available at the time were that it was a good bet.

There is still a valid argument to be made that in a switched ethernet world, multicast could offer efficiencies if networks were better tuned to accommodate it vs. broadcast. That’d be IPv4 unfriendly, but in a world where IPv4 is eventually deprecated and broadcasts are no longer necessary, the potential is there.
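For readers who have not looked at how ND replaces the ARP broadcast, here is a small sketch of the mapping. A neighbor solicitation goes to the target's solicited-node group (ff02::1:ff00:0/104 plus the low 24 bits of the target address), and on Ethernet that group maps to a 33:33 MAC built from the group's low 32 bits. The example address is arbitrary; the function names are invented for the example.

```python
import ipaddress

SOLICITED_NODE_BASE = int(ipaddress.IPv6Address("ff02::1:ff00:0"))


def solicited_node_group(addr: str) -> ipaddress.IPv6Address:
    """Solicited-node multicast group for a unicast IPv6 address."""
    low24 = int(ipaddress.IPv6Address(addr)) & 0xFFFFFF
    return ipaddress.IPv6Address(SOLICITED_NODE_BASE | low24)


def ethernet_multicast_mac(group: ipaddress.IPv6Address) -> str:
    """Ethernet destination for an IPv6 multicast group: 33:33 + low 32 bits."""
    low32 = int(group) & 0xFFFFFFFF
    return "33:33:" + low32.to_bytes(4, "big").hex(":")


group = solicited_node_group("2001:db8::6afe:f7ff:fe07:116f")
print(group)                          # ff02::1:ff07:116f
print(ethernet_multicast_mac(group))  # 33:33:ff:07:11:6f
```

Only hosts whose addresses share those low 24 bits need to listen on that MAC, which is the efficiency argument. A switch that floods multicast like broadcast gives that benefit back, which is the "better tuned" caveat above.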

> So I decided to look at a linux kernel (HEAD I assume) and look at the differences between the v6 and v4 directories.

> See above. That is an improper way to evaluate IPv6 bloat.

> An example of second system syndrome of over-engineering
> without bloat is various timing parameters specified
> for ND, even though timing requirements are different
> depending on link types, which means there can be no
> standard timing parameters applicable to all the link
> types.

> Another example of over-engineering is SLAAC to
> *statefully* maintain address configuration state
> in fully distributed way only to promote
> inconsistencies requiring DAD.

SLAAC doesn’t “statefully” maintain address state in the network or in remote systems. Obviously some level of statefulness is required on each local host or it would need to repeat the address acquisition process for each unconnected frame (whether initiating a connected session or a connectionless frame). DAD is there to avoid inconsistencies and more gracefully handle situations where addresses get duplicated. IPv4 is particularly bad at this and the “over engineering” you speak of here was seen as a solution to that problem in IPv4.

> An example of under-engineering is lack of the
> following consideration of rfc791:

>    The number 576 is selected to allow a reasonable sized data block to
>    be transmitted in addition to the required header information. For
>    example, this size allows a data block of 512 octets plus 64 header
>    octets to fit in a datagram. The maximal internet header is 60
>    octets, and a typical internet header is 20 octets, allowing a
>    margin for headers of higher level protocols.

> As IPv6 optional headers can be arbitrarily long, it is not
> guaranteed that a 512B DNS message can be sent over UDP over
> IPv6.

It is guaranteed that a 512 octet DNS message can be sent over UDP if it does not contain extension headers. You can argue that extension headers should have been more carefully considered in RFC791, but two factors come into play there:

  1. I’m not sure the idea of extension headers had been fleshed out by the time RFC791 was written. It is one
    of the earliest IPv6 RFCs.

  2. Even with full consideration of IPv6 extension headers, I think 576 is still a reasonable MINIMUM MTU,
    since a minimum MTU that can account for all extension headers would exceed the common 1500 octet
    MTU prevalent at the time (and still relatively prevalent today). It’s clear from the text of RFC791 that
    this number is by definition a compromise between competing factors, wherein there is on one side
    the desire to keep the minimum MTU as small as possible and on the other side, the need for the
    minimum MTU to accommodate a reasonable size payload under the majority of circumstances.
    For better or worse, I think that 576 is probably as good a compromise as can be reached.

So I disagree with your characterization of this as under-engineered.
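The size budget being argued over can be worked through with the usual header sizes. The sketch below uses the 576-octet figure and the 20/60-octet IPv4 header sizes from the RFC 791 text quoted above, plus an 8-octet UDP header, a 40-octet fixed IPv6 header, and the 1280-octet IPv6 minimum link MTU; those last three values are standard figures brought in for the illustration, not numbers from this thread. It ignores fragmentation and only looks at a single unfragmented packet.

```python
UDP_HEADER = 8
DNS_CLASSIC_LIMIT = 512    # classic DNS-over-UDP message size

IPV4_MIN_DATAGRAM = 576    # from the RFC 791 text quoted above
IPV4_TYPICAL_HEADER = 20
IPV4_MAX_HEADER = 60

IPV6_MIN_MTU = 1280        # assumption: IPv6 minimum link MTU
IPV6_FIXED_HEADER = 40     # assumption: fixed IPv6 header, no extension headers


def ipv4_udp_payload(ip_header: int) -> int:
    """UDP payload that fits in a 576-octet IPv4 datagram."""
    return IPV4_MIN_DATAGRAM - ip_header - UDP_HEADER


def ipv6_extension_budget(udp_payload: int = DNS_CLASSIC_LIMIT) -> int:
    """Octets left for extension headers in a minimum-MTU IPv6 packet."""
    return IPV6_MIN_MTU - IPV6_FIXED_HEADER - UDP_HEADER - udp_payload


print(ipv4_udp_payload(IPV4_TYPICAL_HEADER))  # 548
print(ipv4_udp_payload(IPV4_MAX_HEADER))      # 508
print(ipv6_extension_budget())                # 720
```

So a 512-octet DNS message fits in a minimum-size IPv6 packet as long as the extension headers stay within 720 octets. The guarantee is conditional on the chain length, which is the point both sides are circling.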

Owen

Owen DeLong wrote:

> IPv6 optional header chain, even after it was widely recognized that IPv4 options are useless/harmful and were deprecated is an example of IPv6 bloat.

> Extensive use of link multicast for nothing is another example of
> IPv6 bloat. Note that IPv4 works without any multicast.

> Yes, but IPv6 works without any broadcast. At the time IPv6 was being
> developed, broadcasts were rather inconvenient and it was believed
> that ethernet switches (which were just beginning to be a thing then)
> would facilitate more efficient capabilities by making extensive use
> of link multicast instead of broadcast.

No, the history around it is that there was some presentation
in IPng WG by ATM people stating that ATM, or NBMA (Non-Broadcast
Multiple Access) in general, is multicast capable though not
broadcast capable, which was blindly believed by most, if not
all excluding *me*, people there.

It should be noted that IPv6 was less bloat because
ND abandoned its initial goal to support IP over NBMA.

> Turns
> out multicast was arguably a wrong guess, but all indications
> available at the time were that it was a good bet.

See above.

> There is still a valid argument to be made that in a switched
> ethernet world, multicast could offer efficiencies if networks were
> better tuned to accommodate it vs. broadcast.

That is against the CATENET model that each datalink only
contain small number of hosts where broadcast is not a
problem at all. Though, in CERN, single Ethernet with
thousands of hosts was operated, of course poorly, it
was abandoned to be inoperational a lot before IPv6,
which is partly why IPv6 is inoperational.

            Masataka Ohta

Admitting to not having read every message in these threads,
but would like to highlight a bit of the history.

IMnsHO, the otherwise useful history is missing a few steps.

  1) The IAB selected ISO CLNP as the next version of IP.

  2) The IETF got angry, disbanded, replaced, and renamed IAB.

  3) On the Big-Internet list, my Practical Internet Protocol Extensions
     (PIPE) was an early proposal, and I'd registered V6 with IANA.

     I was self-funding. PIPE was cognizant of the needs of ISPs and
     deployment.

  4) Lixia Zhang wrote me that Steve Deering was proposing something
     similar, and urged us to pool our efforts. That became Simple
     Internet Protocol (SIP). We used 64 bit addresses. We had a clear
     path for migration, using the upper 32-bits for the ASN and the old
     IPv4 address in the lower 32-bits. We had running code.

  5) The IP Address Extension (IPAE) proposal had some overlapping features,
     and we asked them to merge with us. That added some complexity.

  6) The Paul Francis (the originator of NAT) Polymorphic Internet Protocol
     (PIP) had some overlapping features, so we also asked them to merge
     with us (July 1993). More complexity in the protocol header chaining.

  7) The result was SIPP. We had 2 interoperable implementations: Naval
     Research Labs, and KA9Q NOS (Phil Karn and me). There were others
     well underway.

  8) As noted by John Curran, there was a committee of "powers that be".
     After IETF had strong consensus for SIPP, and we had running code,
     the "powers that be" decided to throw all that away.

  9) The old junk was added back into IPv6 by committee.

There was also a mention that the Linux IP stack is fairly compact and
that IPv6 is somewhat smaller than the IPv4. That's because the Linux
stack was ported by Alan Cox from KA9Q NOS. We gave Alan permission to
change from our personal copyright to GPL.

It has a lot of the features we'd developed, such as packet buffers and
pushdown functions for adding headers, complementary to BSD pullup.
They made SIPP/IPv6 fairly easy to implement.

Owen DeLong wrote:

> IPv6 optional header chain, even after it was widely recognized that IPv4 options are useless/harmful and were deprecated is an example of IPv6 bloat.

> Extensive use of link multicast for nothing is another example of
> IPv6 bloat. Note that IPv4 works without any multicast.

> Yes, but IPv6 works without any broadcast. At the time IPv6 was being
> developed, broadcasts were rather inconvenient and it was believed
> that ethernet switches (which were just beginning to be a thing then)
> would facilitate more efficient capabilities by making extensive use
> of link multicast instead of broadcast.

> No, the history around it is that there was some presentation
> in IPng WG by ATM people stating that ATM, or NBMA (Non-Broadcast
> Multiple Access) in general, is multicast capable though not
> broadcast capable, which was blindly believed by most, if not
> all excluding *me*, people there.

Both Owen and Masataka are correct, in their own way.

IPv4 options were recognized as harmful. SIPP used header chains instead.
But the whole idea was to speed processing, eliminating hop-by-hop.

Then the committees added back the hop by hop processing (type 0).
Terrible!
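To show what "header chains" means for a box that has to find the transport header, here is a minimal walker sketch. It assumes the standard Next Header values (0 Hop-by-Hop, 43 Routing, 44 Fragment, 60 Destination Options) and the usual length encoding, where the second octet counts 8-octet units beyond the first and the Fragment header is a fixed 8 octets. AH and ESP are left out to keep it short, and it does not treat non-first fragments specially.

```python
HOP_BY_HOP, ROUTING, FRAGMENT, DEST_OPTS = 0, 43, 44, 60
EXTENSION_HEADERS = {HOP_BY_HOP, ROUTING, FRAGMENT, DEST_OPTS}
IPV6_FIXED_HEADER = 40


def walk_chain(packet: bytes):
    """Return (extension header types, upper-layer protocol, payload offset)."""
    if len(packet) < IPV6_FIXED_HEADER or packet[0] >> 4 != 6:
        raise ValueError("not an IPv6 packet")
    next_header = packet[6]        # Next Header field of the fixed header
    offset = IPV6_FIXED_HEADER
    chain = []
    while next_header in EXTENSION_HEADERS:
        if offset + 8 > len(packet):
            raise ValueError("truncated extension header")
        chain.append(next_header)
        if next_header == FRAGMENT:
            length = 8             # Fragment header has a fixed size
        else:
            length = (packet[offset + 1] + 1) * 8
        next_header = packet[offset]   # first octet names the following header
        offset += length
    return chain, next_header, offset
```

A node only needs to look past the fixed header when Next Header is 0, since Hop-by-Hop is the one header every router on the path is asked to examine. Everything else in the chain matters only to whoever wants the upper-layer header, which is where the variable-length walk becomes a cost.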

Admittedly, I was also skeptical of packet shredding (what we called
ATM). Sadly, the Chicago NAP required ATM support, and that's where
my connections were located.

> It should be noted that IPv6 was less bloat because
> ND abandoned its initial goal to support IP over NBMA.

Neighbor Discovery is/was agnostic to NBMA. Putting all the old
ARP and DHCP and other cruft into the IP-layer was my goal, so
that it would be forever link agnostic.

> There is still a valid argument to be made that in a switched
> ethernet world, multicast could offer efficiencies if networks were
> better tuned to accommodate it vs. broadcast.

> That is against the CATENET model that each datalink only
> contain small number of hosts where broadcast is not a
> problem at all. Though, in CERN, single Ethernet with
> thousands of hosts was operated, of course poorly, it
> was abandoned to be inoperational a lot before IPv6,
> which is partly why IPv6 is inoperational.

Yes, we were also getting a push from Fermi Labs and CERN for very
large numbers of nodes per link, rather than old ethernet maximum.

That's the underlying design for Neighbor Discovery. Less chatty.

Also, my alma mater was Michigan State University, operating the
largest bridged ethernet in the world in the '80s. Agreed, it was
"inoperational". My epiphany was splitting it with KA9Q routers.

Suddenly the engineering building and the computing center each had
great throughput. Turns out it was the administration's IBM that
had been clogging the campus. Simple KA9Q routers didn't pass the
bad packets. That's how I'd become a routing over bridging convert.

Still, there are data centers with thousand port switches.

Also, TRILL.

William Allen Simpson wrote:

> 6) The Paul Francis (the originator of NAT) Polymorphic Internet Protocol
> (PIP) had some overlapping features, so we also asked them to merge
> with us (July 1993). More complexity in the protocol header chaining.

With the merger, Paul Francis was saying he was unhappy
because PIP is dead. So the merger is not voluntary for
him and the added complexity is technically meaningless.

> IPv4 options were recognized as harmful. SIPP used header chains instead.
> But the whole idea was to speed processing, eliminating hop-by-hop.

> Then the committees added back the hop by hop processing (type 0).
> Terrible!

Really? But, rfc1710 states:

    The SIPP option headers which are currently defined are:

      Hop-by-Hop Option Special options which require hop by hop
                                 processing

> Admittedly, I was also skeptical of packet shredding (what we called
> ATM).

Packet shredding harmed router architecture, not protocols.
Many routers are shredding packets internally for no good
reason.

Instead, ATM-centric view that "all the world will be
covered by ATM and global IP could be used only over ATM"
harmed protocols including IPv6 a lot.

> Neighbor Discovery is/was agnostic to NBMA. Putting all the old
> ARP and DHCP and other cruft into the IP-layer was my goal, so
> that it would be forever link agnostic.

To make "IP uber alles", link-dependent adaptation mechanisms
between IP and links are necessary. So, "ND uber alles" is a
wrong goal.

> Yes, we were also getting a push from Fermi Labs and CERN for very
> large numbers of nodes per link, rather than old ethernet maximum.

And the reason was that they want to use IPv4 and
DECNET without being annoyed by *dual* *stack* routers!

> That's the underlying design for Neighbor Discovery. Less chatty.

Just wrong requirement.

Large Ethernet segments work poorly by themselves. Worse, even
if some L3 might be causing additional damages, the
damages won't go away if IPv6 is added.

> Turns out it was the administration's IBM that
> had been clogging the campus.

SNA?

            Masataka Ohta

William Allen Simpson wrote:

> 6) The Paul Francis (the originator of NAT) Polymorphic Internet Protocol
> (PIP) had some overlapping features, so we also asked them to merge
> with us (July 1993). More complexity in the protocol header chaining.

> With the merger, Paul Francis was saying he was unhappy
> because PIP is dead. So the merger is not voluntary for
> him and the added complexity is technically meaningless.

He seemed happy at the Amsterdam 1993 meeting, but as time went on he was
sidelined. Likewise, I eventually regretted having joined with the others.
We lost control of the main ideas.

For example, originally V6 was designed to use shortest path first
interior routing. All the announcements were Link State, everything was in
place. I still wince at the memory of the PARC meeting where Eric stated
that RIP was good enough for V4, so it is good enough for V6.

Then he was assigned to be my "co-author". So I quit.

What you know as Neighbor Discovery was not the original design. Nor was
RIPv6 needed.

When I was giving a talk at Google 25 years later I was asked why that
happened (by a then member of the IAB). A sore spot, long remembered.
Committee-itis at its worst.

> IPv4 options were recognized as harmful. SIPP used header chains instead.
> But the whole idea was to speed processing, eliminating hop-by-hop.

> Then the committees added back the hop by hop processing (type 0).
> Terrible!

> Really? But, rfc1710 states:

>     The SIPP option headers which are currently defined are:

>       Hop-by-Hop Option          Special options which require hop by hop
>                                  processing

Yep, that was one of the reasons I quit.

Digging out my files, I'd forked my documents by July 17, 1994. (That's
the last date I'd touched them, so it was before then.) RFC 1710 was later.

Also, I registered IPvB with Jon Postel.

These are all old nroff files, but I could hand format a bit and post
things here. Not that it makes much difference today, yet some of my
ideas made it into Fibre Channel and InfiniBand.