Introducing draft-denog-v6ops-addresspartnaming

Joel_Jaeggli · November 19, 2010, 8:19pm

It is always two bytes.

It is always two OCTETS. A byte is not always an octet. Some machines do
have byte sizes other than 8 bits, although few of them are likely to have
IPv6 stacks, so, this may be an academic distinction at this point.

Assuming you have a v6 stack on your CDC 6600, a v6 address fits in 22
bytes, not 16.

One can define the byte size as 8 bits for the purposes of humans reading
IPv6 addresses, without getting into machine-specific details.
What's important to the machine isn't the division of the address into
parts (they aren't divided in the machine representation; it's just one
long row of bits) but rather where the mask falls.

William_Herrin · November 19, 2010, 8:45pm

Hi Richard,

I have an anti-naming proposal: Allow users to place the colons
-anywhere- or even leave them out altogether without changing the
semantics of the IPv6 address.

The colons are there for readability purposes only. They have no
special significance and should not be elevated to significance by
naming the parts of the address they delineate. Treat them specially
and some fools will attach importance to arranging tasks on two-byte
boundaries.

The meaningful boundaries in the protocol itself are nibble and /64.
If you want socially significant boundaries, add /12, /32 and /48.

Regards,
Bill Herrin

Joel_Jaeggli · November 19, 2010, 9:09pm

as most of you are aware, there is no definite, canonical name for the
two bytes of IPv6 addresses between colons. This forces people to use
a description like I just did instead of a single, specific term.

Hi Richard,

I have an anti-naming proposal: Allow users to place the colons
-anywhere- or even leave them out altogether without changing the
semantics of the IPv6 address.

The colons are there for readability purposes only. They have no
special significance and should not be elevated to significance by
naming the parts of the address they delineate. Treat them specially
and some fools will attach importance to arranging tasks on two-byte
boundaries.

The meaningful boundaries in the protocol itself are nibble and /64.
If you want socially significant boundaries, add /12, /32 and /48.

It is possible and desirable to be able to describe any mask length
between /0 and /128. The /64 is an important demarcation point for
subnets, but everything shorter than that will appear in your routing table.

William_Herrin · November 19, 2010, 9:17pm

Hi Joel,

Bit, nibble and /64 then. /64 is treated specially by functions in the
protocol (like SLAAC) thus it's a protocol boundary rather than a
social one (/12 IANA allocations, /32 ISP allocations, /48 end-user
assignments).

Unless you particularly feel the need to assign /64's to router
loopbacks, you'll see plenty of routes longer than /64 in your table
too.

Regards,
Bill Herrin

Richard_Hartmann · November 19, 2010, 9:20pm

I have an anti-naming proposal: Allow users to place the colons
-anywhere- or even leave them out altogether without changing the
semantics of the IPv6 address.

A decade or two of established syntax disagrees. IPv6 addresses, UUIDs
and similar have a unique syntax for a reason. Otherwise, neither we nor
computers would be able to quickly distinguish an IP from a hash.

The colons are there for readability purposes only. They have no
special significance and should not be elevated to significance by
naming the parts of the address they delineate. Treat them specially
and some fools will attach importance to arranging tasks on two-byte
boundaries.

Even if they were for readability only, they would still be for
humans. Same as the specific, canonical name we are trying to agree
on.

If people want to interpret more into the colons than there is to see,
they will do so regardless of a name.

The rest of us will work faster, more efficiently and not explain the
same old thing a gazillion times.

Richard

Richard_Hartmann · November 19, 2010, 9:31pm

Bit, nibble and /64 then. /64 is treated specially by functions in the
protocol (like SLAAC) thus it's a protocol boundary rather than a
social one (/12 IANA allocations, /32 ISP allocations, /48 end-user
assignments).

I would argue that /0 and /128 are somewhat special, too.

Unless you particularly feel the need to assign /64's to router
loopbacks, you'll see plenty of routes longer than /64 in your table
too.

That's a personal preference, really. Unless you mess up, or are an
end user permanently stuck with a /64 (in which case your ISP messed
up), there isn't really much need to assign anything longer, though.
That being said, for whatever reason, several of my upstreams use /126
for their sessions.

In any case, other than "some people might see the colons as magic
markers" I don't really see an argument in favour of avoiding a common
name. And that does not seem to hold much water. This is not meant to
be an attack, I simply wonder if I am missing something.

Richard

William_Herrin · November 19, 2010, 10:52pm

I have an anti-naming proposal: Allow users to place the colons
-anywhere- or even leave them out altogether without changing the
semantics of the IPv6 address.

A decade or two of established syntax disagrees. IPv6 addresses, UUIDs
and similar have a unique syntax for a reason. Otherwise, neither we nor
computers would be able to quickly distinguish an IP from a hash.

Hi Richard,

I thought about that. Have a "one colon rule" that IPv6 addresses in
hexadecimal format have to include at least one colon somewhere. The
regex which picks that token out versus the other possibilities is
easy enough to write and so is the human rule: "Oh, it's got
hexadecimal digits and a colon in it. IPv6 address."
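
As a rough illustration of the kind of pattern meant here, a minimal
Python sketch follows. The names ONE_COLON_RULE and
looks_like_floating_v6 are hypothetical, made up for this example. The
pattern only checks the shape described above (hex digits plus at least
one colon); it does not validate the length of the address.

```python
import re

# Hypothetical "one colon rule": the token consists only of hex digits and
# colons, and the lookahead insists that at least one colon is present.
ONE_COLON_RULE = re.compile(r"(?=.*:)[0-9A-Fa-f:]+")


def looks_like_floating_v6(token: str) -> bool:
    """Return True if the token has the shape the one colon rule describes."""
    return ONE_COLON_RULE.fullmatch(token) is not None


if __name__ == "__main__":
    for token in ("fd:1234567890:abcd::1", "deadbeef", "example.com:80"):
        print(token, looks_like_floating_v6(token))
```

A bare hex string such as a hash fails the check because it has no
colon, which is the distinction the rule is meant to preserve.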

There is one serious problem with switching notations: we've already
started dropping the leading 0's inside each coloned-off section, and
that would have a different meaning if the colons could be placed
anywhere.

fd00:68::1 and fd:0068::1 mean different things now. The former means
fd00:0068::1 while the latter means 00fd:0068::1. I would instead have
them mean the same thing: fd00:6800::1. The single-colon separator
gets syntax but no semantics and the :: separator means "all middle
nibbles are zero" instead of "all middle two-byte components are
zero."
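
The current behaviour described above can be checked with Python's
standard ipaddress module. The helper name expand is only for
illustration.

```python
import ipaddress


def expand(text: str) -> str:
    """Return the fully written-out form of an IPv6 address in today's notation."""
    return ipaddress.IPv6Address(text).exploded


if __name__ == "__main__":
    # Leading zeros are restored inside each colon-delimited group, so the
    # position of the colon changes the value.
    for text in ("fd00:68::1", "fd:0068::1"):
        print(text, "->", expand(text))
```

The first address expands to fd00:0068:... and the second to
00fd:0068:..., which is the difference being discussed.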

I mean, when you think about it, the consequence that :: means "all
middle two-byte components are zero" is kinda weird.

Even if they were for readability only, they would still be for
humans. Same as the specific, canonical name we are trying to agree
on.

If people want to interpret more into the colons than there is to see,
they will do so regardless of a name.

Anything you call out will be interpreted as special. The more you
call it out, the greater the expectation that the distinction is
important. That's human nature.

You've explained netmasks before to folks whose brains couldn't get
past the dots in the address. We all have. And referring to IP address
notation as "dotted quads" just reinforces classful addressing
concepts so that folks assigning themselves 10/8 subnets damn near
always split on /16 and /24 boundaries.

The rest of us will work faster, more efficiently and not explain the
same old thing a gazillion times.

And even more efficiently when we don't have to repeatedly explain
that the mental model implied by the notation style is, in fact, not
how the technology actually works.

In any case, other than "some people might see the colons as magic
markers" I don't really see an argument in favour of avoiding a common
name. And that does not seem to hold much water. This is not meant to
be an attack, I simply wonder if I am missing something.

No sweat. When I shoot my mouth off, I expect to be challenged on the
remarks. Part of the fun lies in discovering whether the thesis is
defensible.

By the by, as long as I'm criticizing IPv6 notation, let me express
just how poor a choice of separator character the colon is. The colon
separates the IPv4 address from a directory or port description only
slightly less often than the slash. Writing the parsers to handle an
IPv6 address as a drop in is a pain in the tail. Should have used a
dash, underscore or plus. Those are far more rarely used in
tokenization.
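
A short sketch of the parsing problem, using Python's standard
urllib.parse. The helper names are made up for this illustration. A
naive split on the last colon works for an IPv4 address with a port but
misreads a bare IPv6 address, which is why URLs put the IPv6 literal in
square brackets.

```python
from urllib.parse import urlsplit


def naive_host_port(text: str) -> tuple[str, str]:
    """Split on the last colon, the way an IPv4-era parser might."""
    host, _, port = text.rpartition(":")
    return host, port


def url_host_port(url: str):
    """Let the standard library handle the bracketed IPv6 literal."""
    parts = urlsplit(url)
    return parts.hostname, parts.port


if __name__ == "__main__":
    print(naive_host_port("192.0.2.1:8080"))   # host and port as intended
    print(naive_host_port("2001:db8::1"))      # last group mistaken for a port
    print(url_host_port("http://[2001:db8::1]:8080/"))
```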

Regards,
Bill Herrin

Richard_Hartmann · November 20, 2010, 10:05am

I thought about that. Have a "one colon rule" that IPv6 addresses in
hexadecimal format have to include at least one colon somewhere. The
regex which picks that token out versus the other possibilities is
easy enough to write and so is the human rule: "Oh, it's got
hexadecimal digits and a colon in it. IPv6 address."

Even if this were feasible at this point, and it's not, this would
still make it hard for humans to detect an IPv6 address at a glance,
makes it impossible to quickly pick out any sections that are more
relevant at the moment and would hog the colon for all eternity,
blocking it for other uses. Also, this would make adding a port even
more cumbersome.

fd00:68::1 and fd:0068::1 mean different things now. The former means
fd00:0068::1 while the latter means 00fd:0068::1. I would instead have
them mean the same thing: fd00:6800::1. The single-colon separator
gets syntax but no semantics

I am not sure if this would actually be an advantage.

and the :: separator means "all middle
nibbles are zero" instead of "all middle two-byte components are
zero."

Putting the burden of parsing that on humans (and computers). Just as
modern compression algorithms are optimized to do more work while
encoding and less work during decoding, it does not really make sense
to make it harder to understand an address while reading it for the
dubious gain of saving up to six colons.

I mean, when you think about it, the consequence that :: means "all
middle two-byte components are zero" is kinda weird.

It's a commonly accepted, well-defined convention to save humans
effort while not sacrificing readability. There are weirder things in
technology.

Anything you call out will be interpreted as special. The more you
call it out, the greater the expectation that the distinction is
important. That's human nature.

Pattern recognition is a central part of our intelligence, so yes,
it's human nature. This is not necessarily a bad thing.

While I agree that some of the delimitations are social, rather than
technical, it's still useful to have them. If this results in some
people not assigning their customers a /56 cause it looks funny, so be
it.

You've explained netmasks before to folks whose brains couldn't get
past the dots in the address. We all have.

I honestly think I never explained (as in, after I understood the
matter, myself) netmasks other than as a bit vector. Unless you mean
"write 255.255.255.0 in there cause that's what's right for you".

And referring to IP address
notation as "dotted quads" just reinforces classful addressing
concepts so that folks assigning themselves 10/8 subnets damn near
always split on /16 and /24 boundaries.

And why shouldn't they? Unless they are a large ISP or similar, they
will have enough space for pretty much everything they ever need to
do. It's as good as anything and it allows people to be somewhat
familiar with this stuff.

Not everyone is an expert and that is fine. Personally, I have no
motivation whatsoever to _truly_ understand _everything_ that's
involved in today's wireless systems. Still, it's nice that I can use
them reliably without needing this level of involvement.

And even more efficiently when we don't have to repeatedly explain
that the mental model implied by the notation style is, in fact, not
how the technology actually works.

If the person can grasp what a bit vector is, they will understand. If
they don't, they will not understand it anyway and I won't waste time
trying to explain it in depth. At least as of right now, you are
giving those people some middle ground which allows them to have a
good working knowledge to use IPv6 reliably without needing this level
of involvement.

No sweat. When I shoot my mouth off, I expect to be challenged on the
remarks. Part of the fun lies in discovering whether the thesis is
defensible.

For at least a few rounds, I am usually good for that, too.
Personally, I think I answered the implicit question above, but it
made me reassess and rethink my personal & professional opinion on
quite a few things and that's a Good Thing, from time to time.

By the by, as long as I'm criticizing IPv6 notation, let me express
just how poor a choice of separator character the colon is. The colon
separates the IPv4 address from a directory or port description only
slightly less often than the slash. Writing the parsers to handle an
IPv6 address as a drop in is a pain in the tail. Should have used a
dash, underscore or plus. Those are far more rarely used in
tokenization.

A dash is the character people use when there is not a standard. Look
at what they had to do for UUIDs to make them recognizable (which
worked out really well, especially the version encoding. I really like
their solution). Though they had the advantage that substring length
_really_ doesn't matter other than as a way to correctly distinguish
UUIDs from anything else.

I don't really like the colon, but I can't think of an
alternative either.

Richard

PS: Yes, I am fully aware that my complete email is moot anyway as the
IPv6 syntax will not change, ever. I wrote it for fun

Dan_Holme · November 20, 2010, 1:42pm

I like that, but maybe a chomp (although that might annoy some perl &
ruby people)... then maybe when all 2bytes are zeros and we expel them
from the address with a double colon, we should call that a fart.

Dan_Holme · November 20, 2010, 2:15pm

On a more serious note, apologies for throwing in more suggestions this late in the game, especially now that some documentation has been drafted, but has anybody thought of using 'munch' for two bytes? It just seems to ring right for me.

William_Herrin · November 20, 2010, 5:12pm

I thought about that. Have a "one colon rule" that IPv6 addresses in
hexadecimal format have to include at least one colon somewhere. The
regex which picks that token out versus the other possibilities is
easy enough to write and so is the human rule: "Oh, it's got
hexadecimal digits and a colon in it. IPv6 address."

this would
still make it hard for humans to detect an IPv6 address at a glance,
makes it impossible to quickly pick out any sections that are more
relevant at the moment

Which is why you wouldn't conventionally remove the colons even though
the format would allow it. You might, however, move the colons to
highlight the delineations relevant to a particular address rather
than the meaningless two-byte separation.

For example:

260:abcde:123456:98::1

260 - IANA to ARIN, a /12
abcde - ARIN to ISP, a /32
123456 - ISP to customer, a /56
98 - customer subnet
::1 - LAN address

fd:1234567890:abcd::1

fd - ULA space
1234567890 - ULA global ID
abcd - user subnet
::1 - LAN address

Instead of this meaning-filled separation, we have:

260a:bcde:1234:5698::1

which doesn't tell us a single helpful thing about how that address is
organized. The only thing the colons do there is make it easier to
blindly transcribe, like the dashes in a CD license key.
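
To make the floating-colon idea concrete, here is a minimal Python
sketch of how such a notation could be read. This is an assumption-laden
illustration, not an existing format or library feature: the function
name parse_floating is invented, single colons are treated as pure
decoration, and "::" is taken to mean "fill the middle with zero
nibbles" as proposed above.

```python
import ipaddress

HEX_DIGITS = set("0123456789abcdefABCDEF")


def parse_floating(text: str) -> ipaddress.IPv6Address:
    """Parse the hypothetical floating-colon notation into an IPv6 address."""
    if text.count("::") > 1:
        raise ValueError("at most one '::' is allowed")
    if "::" in text:
        left, right = text.split("::")
    else:
        left, right = text, None
    # Single colons carry no meaning: drop them and keep the raw nibbles.
    left_nibbles = left.replace(":", "")
    right_nibbles = "" if right is None else right.replace(":", "")
    digits = left_nibbles + right_nibbles
    if any(ch not in HEX_DIGITS for ch in digits):
        raise ValueError("only hex digits and colons are allowed")
    if right is None:
        if len(digits) != 32:
            raise ValueError("without '::' exactly 32 nibbles are required")
        full = digits
    else:
        fill = 32 - len(digits)
        if fill < 0:
            raise ValueError("too many nibbles")
        # '::' expands to however many zero nibbles are missing.
        full = left_nibbles + "0" * fill + right_nibbles
    return ipaddress.IPv6Address(int(full, 16))


if __name__ == "__main__":
    for text in ("260:abcde:123456:98::1", "fd:1234567890:abcd::1"):
        print(text, "->", parse_floating(text))
```

Under these assumptions the first example comes out as
260a:bcde:1234:5698::1 in today's notation, and fd00:68::1 and
fd:0068::1 both read as fd00:6800::1.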

and would hog the colon for all eternity,
blocking it for other uses.
Also, this would make adding a port even
more cumbersome.

I've written more than a few parsers. I think your concern here is overstated.

Anything you call out will be interpreted as special. The more you
call it out, the greater the expectation that the distinction is
important. That's human nature.

Pattern recognition is a central part of our intelligence, so yes,
it's human nature. This is not necessarily a bad thing.

The way you talk about something trains people how that thing works.
Train them poorly and it's your fault when their mistaken mental model
results in errors.

I mean, when you think about it, the consequence that :: means "all
middle two-byte components are zero" is kinda weird.

It's a commonly accepted, well-defined convention to save humans
effort while not sacrificing readability. There are weirder things in
technology.

I have no beef with the notion of abbreviations. I'm just saying
this particular formulation is weird, a consequence of a poorly
thought-out notation format.

And even more efficiently when we don't have to repeatedly explain
that the mental model implied by the notation style is, in fact, not
how the technology actually works.

If the person can grasp what a bit vector is, they will understand. If
they don't, they will not understand it anyway and I won't waste time
trying to explain it in depth. At least as of right now, you are
giving those people some middle ground which allows them to have a
good working knowledge to use IPv6 reliably without needing this level
of involvement.

It helps if the notation style reminds them that they're dealing with
a bit vector. IPv6 is better about this than IPv4; at least the colons
aren't separating portions of the bit pattern expressed in base-10.
But it could be better. Fixed separations get folks thinking there's a
higher significance. Movable separations offer a constant reminder
that it is just a bit vector.

No sweat. When I shoot my mouth off, I expect to be challenged on the
remarks. Part of the fun lies in discovering whether the thesis is
defensible.

For at least a few rounds, I am usually good for that, too.
Personally, I think I answered the implicit question above, but it
made me reassess and rethink my personal & professional opinion on
quite a few things and that's a Good Thing, from time to time.

A value I also find when I'm on the receiving end.

PS: Yes, I am fully aware that my complete email is moot anyway as the
IPv6 syntax will not change, ever. I wrote it for fun

Yep. However, there is one thing that could be done at this juncture:
intentionally don't name the two-byte groupings. And then make that a
part of the lesson plan: by the way folks, these groupings of four
characters in the IPv6 address intentionally have no name. That's
because the IPv6 address is a bit vector. The colons are only there to
make it easier to read and type; the groupings have no significance.

Regards,
Bill Herrin

Owen_DeLong · November 20, 2010, 10:15pm

fd00:68::1 and fd:0068::1 mean different things now. The former means
fd00:0068::1 while the latter means 00fd:0068::1. I would instead have
them mean the same thing: fd00:6800::1. The single-colon separator
gets syntax but no semantics

I am not sure if this would actually be an advantage.

It would actually be a huge disadvantage. Following the principle
of least surprise, whether you like it or not, the multiple colons
rule is useful for making IPv6 address human factors better.
Additionally, humans will tend to default to seeing the areas
between colons as number fields. As such, they expect them
to be right justified with leading zeros optional. Dropping trailing
zeroes will inevitably lead to more misinterpretations and errors
than keeping them.

Where this becomes unfortunate is when people make the
mistake of writing things like fd/8 or worse, fd::/8. Technically
both of these are not correct. fd/8 is simply invalid syntax.
The human eye will turn it into fd00::/8 because that's the only
possible logical meaning that makes any sense.

fd::/8 is worse because the human eye will turn it into fd00::/8
as that is again, the only sensible thing it could represent, while,
in fact, as written it means 0::/8.

For intuitive reading, things should always be written right
justified. No one will misinterpret fd00::/8 or 2001:0d00::/24.
Many will misinterpret fd::/8 and 2001:0d::/24.
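
The misreadings above can be reproduced with Python's standard
ipaddress module. The helper name covering_network is only for
illustration; strict=False tells the module to mask off the host bits
rather than reject the input.

```python
import ipaddress


def covering_network(text: str) -> str:
    """Return the prefix that the text actually denotes in today's notation."""
    return str(ipaddress.IPv6Network(text, strict=False))


if __name__ == "__main__":
    for text in ("fd00::/8", "fd::/8", "2001:0d00::/24", "2001:0d::/24"):
        print(text, "->", covering_network(text))
    # With the default strict=True, the sloppy form is rejected outright
    # because bits outside the /8 are set.
    try:
        ipaddress.IPv6Network("fd::/8")
    except ValueError as exc:
        print("rejected:", exc)
```

As written, fd::/8 lands in ::/8 and 2001:0d::/24 lands in 2001::/24,
while the right-justified forms mean what they appear to mean.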

and the :: separator means "all middle
nibbles are zero" instead of "all middle two-byte components are
zero."

Putting the burden of parsing that on humans (and computers). Just as
modern compression algorithms are optimized to do more work while
encoding and less work during decoding, it does not really make sense
to make it harder to understand an address while reading it for the
dubious gain of saving up to six colons.

In reality the most you would gain from such a practice would be to save
two colons. The other 4 could already be eliminated by the :: in current
practice.

I mean, when you think about it, the consequence that :: means "all
middle two-byte components are zero" is kinda weird.

It's a commonly accepted, well-defined convention to save humans
effort while not sacrificing readability. There are weirder things in
technology.

I don't think it's all that weird and it's a major savings in writing
out IPv6 addresses and being able to read them, except in lists of
varying-sized addresses (please, when dumping routing tables
and such, just keep the optional zeroes or give us a flag to choose).

In practice, the :: usually ends up being placed between the
network number and the host number for things with static
addresses and rarely appears in EUI-64 based addresses,
so, I don't see this as a problem.

Anything you call out will be interpreted as special. The more you
call it out, the greater the expectation that the distinction is
important. That's human nature.

Pattern recognition is a central part of our intelligence, so yes,
it's human nature. This is not necessarily a bad thing.

While I agree that some of the delimitations are social, rather than
technical, it's still useful to have them. If this results in some
people not assigning their customers a /56 cause it looks funny, so be
it.

I don't see a problem with people not assigning customers /56s so long
as they go in the correct direction and give /48s and not /60s or /64s.

You've explained netmasks before to folks whose brains couldn't get
past the dots in the address. We all have.

I honestly think I never explained (as in, after I understood the
matter, myself) netmasks other than as a bit vector. Unless you mean
"write 255.255.255.0 in there cause that's what's right for you".

Then you are young and never had to deal with systems that didn't
know about bit-vector syntax. I have had to explain the translation
between bit-vector syntax (/n) and bit-field syntax (255.255.255.240)
to many people. It's easy when n is a multiple of 8. After that,
it can be quite hard for some mathematically challenged individuals
unfamiliar with binary and BCD to wrap their heads around.
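
For reference, the translation between the /n form and the dotted form
is a small piece of bit arithmetic. A minimal Python sketch, with
function names invented for this illustration:

```python
import ipaddress


def prefix_to_netmask(n: int) -> str:
    """Turn a prefix length such as 28 into a dotted mask such as 255.255.255.240."""
    if not 0 <= n <= 32:
        raise ValueError("prefix length must be between 0 and 32")
    # Set the top n bits of a 32-bit word, then print it one octet at a time.
    mask = (0xFFFFFFFF << (32 - n)) & 0xFFFFFFFF
    return ".".join(str((mask >> shift) & 0xFF) for shift in (24, 16, 8, 0))


def netmask_to_prefix(mask: str) -> int:
    """Turn a dotted mask back into a prefix length, rejecting non-contiguous masks."""
    value = int(ipaddress.IPv4Address(mask))
    n = bin(value).count("1")
    if prefix_to_netmask(n) != str(ipaddress.IPv4Address(mask)):
        raise ValueError("mask bits are not contiguous")
    return n


if __name__ == "__main__":
    print(prefix_to_netmask(24))
    print(prefix_to_netmask(28))
    print(netmask_to_prefix("255.255.255.240"))
```

The multiples of 8 give masks made only of 255 and 0; a length like /28
splits an octet, which is where the mental arithmetic starts.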

And referring to IP address
notation as "dotted quads" just reinforces classful addressing
concepts so that folks assigning themselves 10/8 subnets damn near
always split on /16 and /24 boundaries.

And why shouldn't they? Unless they are a large ISP or similar, they
will have enough space for pretty much everything they ever need to
do. It's as good as anything and it allows people to be somewhat
familiar with this stuff.

Not everyone is an expert and that is fine. Personally, I have no
motivation whatsoever to _truly_ understand _everything_ that's
involved in today's wireless systems. Still, it's nice that I can use
them reliably without needing this level of involvement.

Removing bitmath from operations where possible is a good thing
that reduces outages caused by human factors. It's just good human
factors engineering.

We can't do so in IPv4, there aren't enough bits to do it.

We seek to do so in IPv6 with ARIN draft policy 2010-8 and
proposal 121.

And even more efficiently when we don't have to repeatedly explain
that the mental model implied by the notation style is, in fact, not
how the technology actually works.

If the person can grasp what a bit vector is, they will understand. If
they don't, they will not understand it anyway and I won't waste time
trying to explain it in depth. At least as of right now, you are
giving those people some middle ground which allows them to have a
good working knowledge to use IPv6 reliably without needing this level
of involvement.

Agreed.

No sweat. When I shoot my mouth off, I expect to be challenged on the
remarks. Part of the fun lies in discovering whether the thesis is
defensible.

For at least a few rounds, I am usually good for that, too.
Personally, I think I answered the implicit question above, but it
made me reassess and rethink my personal & professional opinion on
quite a few things and that's a Good Thing, from time to time.

Should we all sing kumbayah now?

By the by, as long as I'm criticizing IPv6 notation, let me express
just how poor a choice of separator character the colon is. The colon
separates the IPv4 address from a directory or port description only
slightly less often than the slash. Writing the parsers to handle an
IPv6 address as a drop in is a pain in the tail. Should have used a
dash, underscore or plus. Those are far more rarely used in
tokenization.

A dash is the character people use when there is not a standard. Look
at what they had to do for UUIDs to make them recognizable (which
worked out really well, especially the version encoding. I really like
their solution). Though they had the advantage that substring length
_really_ doesn't matter other than as a way to correctly distinguish
UUIDs from anything else.

Underscore is a particularly poor choice for a variety of reasons,
not the least of which is the resulting legibility of things like:
2001_db8_f3ed__202_3/48

Dash is a poor choice because it becomes potentially problematic
to know whether your cisco is telling you that:

2001-0db8-5f03 is a MAC address or a /48 prefix.

+ would be interesting, but, I believe it also has overloaded
semantics and would make addresses look like a math
problem in hex anyway:

2001+db8+5f03++32/48

Is that 2001:db8:5f03::32/48 or is that 8cbd.43 (hex fraction approximated)?

I'd say that loses on both human readability and parser ambiguity too.

Other entertaining possible delimiters worthy of consideration might be:

Letter v   2001vdb8v5f03vv32/48   Pretty hard to read.
Letter I   2001Idb8I5f03II32/48   Also hard to read.
Hash #     2001#db8#5f03##32/48   This makes a hash of it, but, not too hard to read and not ambiguous.
Splat *    2001*db8*5f03**32/48   Not bad, but, who wants to type this into unix systems?
B-tick `   2001`db8`5f03``32/48   Even MORE fun on a unix system.
F-tick '   2001'db8'5f03''32/48   Yet more UNIX fun.
Quote "    2001"db8"5f03""32/48   See above.

Basically, as I recall the earlier discussions of this and the IETF
arriving at the decision to use colon (:), it boiled down to the
simple fact that colon (:) is the worst choice except for all the others.

Owen

Owen_DeLong · November 20, 2010, 10:20pm

I thought about that. Have a "one colon rule" that IPv6 addresses in
hexadecimal format have to include at least one colon somewhere. The
regex which picks that token out versus the other possibilities is
easy enough to write and so is the human rule: "Oh, it's got
hexadecimal digits and a colon in it. IPv6 address."

this would
still make it hard for humans to detect an IPv6 address at a glance,
makes it impossible to quickly pick out any sections that are more
relevant at the moment

Which is why you wouldn't conventionally remove the colons even though
the format would allow it. You might, however, move the colons to
highlight the delineations relevant to a particular address rather
than the meaningless two-byte separation.

How do you propose to get the router to regurgitate this?

For example:

260:abcde:123456:98::1

260 - IANA to ARIN, a /12
abcde - ARIN to ISP, a /32
123456 - ISP to customer, a /56
98 - customer subnet
::1 - LAN address

fd:1234567890:abcd::1

fd - ULA space
1234567890 - ULA global ID
abcd - user subnet
::1 - LAN address

From the data available in BGP today, this is a relatively arbitrary
positioning of the delimiters.

I would propose that for a proposed syntax to be worthy of
consideration it must be possible to reproduce it reliably
in an automated fashion.

Owen

Joel_Jaeggli · November 20, 2010, 11:05pm

Since I've been reading old drafts recently I think we can thank Mike
O'Dell for the term "routing goop". The problem of course is you can't
distinguish which part of the routing goop is significant (to the humans)
unless you have an a priori mapping. Otherwise all you have is some goop
and a mask which together are a route.

William_Herrin · November 21, 2010, 3:54pm

fd00:68::1 and fd:0068::1 mean different things now. The former means
fd00:0068::1 while the latter means 00fd:0068::1. I would instead have
them mean the same thing: fd00:6800::1. The single-colon separator
gets syntax but no semantics

I am not sure if this would actually be an advantage.

It would actually be a huge disadvantage. [...]

Where this becomes unfortunate is when people make the
mistake of writing things like fd/8 or worse, fd::/8. Technically
both of these are not correct. fd/8 is simply invalid syntax.
The human eye will turn it into fd00::/8 because that's the only
possible logical meaning that makes any sense.

fd::/8 is worse because the human eye will turn it into fd00::/8
as that is again, the only sensible thing it could represent, while,
in fact, as written it means 0::/8.

So... You just dissed my screed about IPv6 notation and then offered
two fantastic arguments why I'm right... Because in my version fd::/8
actually is the same as fd00::/8, which, as you rightly point out, is
exactly what a normal human being would naturally expect.

Imea nrea lly, what ifwe wrot eEng lish thew aywe writ eIPv 6add ress
es? Looks pretty stupid without a floating separator, doesn't it?

We've gone too far down the wrong path to change it now; colons are
going to separate every second byte in the v6 address. But from a
human factors perspective, floating colons would have been better.

From a computer parser perspective, a character other than a colon
would have been better because colons are already claimed for many
other syntax elements that include an IP address, like the
address/port separator in a URL.

Making the jump in logic, it would help mitigate the errant design if
the two-byte groupings separated by the colons were intentionally and
formally not named. That fits a training scenario which reinforces the
idea that the colons are there for convenience but that there is
nothing special about those two byte groupings.

Dash is a poor choice because it becomes potentially problematic
to know whether your cisco is telling you that:

2001-0db8-5f03 is a MAC address or a /48 prefix.

Cisco's expression of a MAC address is wrong anyway. Correct notation
for a MAC address is separating each byte with a colon. This has
always been a PIA for me any time I need to copy a MAC address in to
or out of a Cisco config.

Basically, as I recall the earlier discussions of this and the IETF
arriving at the decision to use colon (:), it boiled down to the
simple fact that colon (:) is the worst choice except for all the others.

Could have stuck with dot and forgone "::192.168.1.1". Or replaced the
v4 dot with a dash in that scenario. Could have gone with comma and
quoted the address in CSV files like you do for any text value that
isn't trivial, instead of bracketing it in much more commonly used
URLs.

For example:

260:abcde:123456:98::1

260 - IANA to ARIN, a /12
abcde - ARIN to ISP, a /32
123456 - ISP to customer, a /56
98 - customer subnet
::1 - LAN address

How do you propose to get the router to regurgitate this?

I don't. The colons float. If the address was learned dynamically, the
router can regurgitate it any way it wants.

The question leads me to recall a fancy version of traceroute I once
used. In addition to looking up the PTR record for each hop, it also
looked up the org and AS number currently associated. If users found
it valuable to have the router present variable colon placement, it's
a doable albeit complex computing task.

Regards,
Bill Herrin

Joel_Jaeggli · November 21, 2010, 4:40pm

The benefits of hindsight are myriad...

The URI schema is contemporaneous with RFC 1883, as is a lot of formative
work in a lot of areas in the 1992-1994 range.

There is a lot of assumption on the part of ipv6 that the use of ipv6
literals in URIs would be a rather infrequent occurrence. Given how
infrequent it is in ipv4, that would seem to be a reasonable assumption.

Valdis_Kletnieks · November 21, 2010, 6:42pm

What do you do when ARIN gives Tier1 a /24, and Tier1 gives Billy Bob's
Bait, Tackle, and Internet a /40, and Billy Bob gives one of their customers a /56?

Owen_DeLong · November 21, 2010, 10:15pm

fd00:68::1 and fd:0068::1 mean different things now. The former means
fd00:0068::1 while the latter means 00fd:0068::1. I would instead have
them mean the same thing: fd00:6800::1. The single-colon separator
gets syntax but no semantics.

I am not sure if this would actually be an advantage.

It would actually be a huge disadvantage. [...]

Where this becomes unfortunate is when people make the
mistake of writing things like fd/8 or worse, fd::/8. Technically
both of these are not correct. fd/8 is simply invalid syntax.
The human eye will turn it into fd00::/8 because that's the only
possible logical meaning that makes any sense.

fd::/8 is worse because the human eye will turn it into fd00::/8
as that is again, the only sensible thing it could represent, while,
in fact, as written it means 0::/8.
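
To make the readings above concrete, here is a minimal sketch using Python's standard ipaddress module, which follows the current notation. The helper name expand is made up for this illustration; the addresses are the ones discussed above.

```python
import ipaddress

def expand(text):
    """Return the fully written-out form of an IPv6 address string."""
    return ipaddress.ip_address(text).exploded

# Leading zeros are implied inside each colon-delimited group.
print(expand("fd00:68::1"))   # fd00:0068:0000:0000:0000:0000:0000:0001
print(expand("fd:0068::1"))   # 00fd:0068:0000:0000:0000:0000:0000:0001

# "fd::" is 00fd:0000:..., so its first 8 bits are all zero.
# strict=False masks off the host bits instead of raising ValueError.
print(ipaddress.ip_network("fd::/8", strict=False))   # ::/8
print(ipaddress.ip_network("fd00::/8"))                # fd00::/8
```

With the default strict=True, the module rejects fd::/8 outright because bits are set beyond the mask, which is one way software can catch the mistake that the human eye silently corrects.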

So... You just dissed my screed about IPv6 notation and then offered
two fantastic arguments why I'm right... Because in my version fd::/8
actually is the same as fd00::/8, which, as you rightly point out, is
exactly what a normal human being would naturally expect.

That's not a reason you're correct, it's a recognition of one of the
warts in the current system and a statement that on the rare
occasion when people are writing IPv6 addresses, they need
to do so with care. So long as one writes the IPv6 address
correctly, it is not hard to read.

There is no problem with the understanding of fd00::/8 for
both humans and machines.

In fact, fd::/8 would be interpreted, I estimate, by approximately
80% of people as fd00::/8 and by the other 20% who are used
to working with numbers and computers as 00fd::/8 until they
realized that the person responsible for that scrawl must have
meant fd00::/8. Thus, it is an ambiguous representation open
to convenient misinterpretation.

The problem with your idea comes when we move on to
fd::/16. In the current system, this is a valid syntax for
00fd::/16. In your system, would this mean 00fd::/16 or
would it mean fd00::/16? Both are equally valid as I
see it. Certainly there is the likelihood for much confusion
even if you have rules that cover it.

Imea nrea lly, what ifwe wrot eEng lish thew aywe writ eIPv 6add ress
es? Looks pretty stupid without a floating separator, doesn't it?

If this were prose, sure. It isn't. It's an addressing scheme. I mean,
really, we don't question 99999-1520 or 408-555-1212 which
are much more like what we're talking about.

In fact, it would look pretty weird to most people if we started writing
951-21-42-33 (or I bet they wouldn't expect that was a zip code in
any case). Similarly, if we start placing the separators in arbitrary
places in phone numbers, people get confused.

We've gone too far down the wrong path to change it now; colons are
going to separate every second byte in the v6 address. But from a
human factors perspective, floating colons would have been better.

I still disagree. While I noted the one pathology with the current
system, that same pathology is present with floating colons
and there are others which I also pointed out (difficulty in
reproducing the "correct" placement of the floating colons in
automated output, for example).

From a computer parser perspective, a character other than a colon
would have been better because colons are already claimed for many
other syntax elements that include an IP address, like the
address/port separator in a URL.

The syntax for handling this was already present in IPv4 and is easily
adapted to the problem in IPv6. Simply wrap the IPv6 address in
square brackets (e.g. [2001:db8:feed::cafe]:80 is the ipv6
address 2001:db8:feed::cafe on port 80).
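
As a small illustration of the bracket syntax, here is a sketch using Python's standard urllib.parse. The http:// scheme and trailing slash are assumptions added only to make a complete URL; the address and port are the ones from the example above.

```python
from urllib.parse import urlsplit

# Square brackets keep the address's colons apart from the port separator.
parts = urlsplit("http://[2001:db8:feed::cafe]:80/")

print(parts.hostname)  # 2001:db8:feed::cafe
print(parts.port)      # 80
```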

Making the jump in logic, it would help mitigate the errant design if
the two-byte groupings separated by the colons were intentionally and
formally not named. That fits a training scenario which reinforces the
idea that the colons are there for convenience but that there is
nothing special about those two byte groupings.

Really, there's no advantage whatsoever to this. All it does is make
talking about address structure more complicated. Having a universal
name will reduce the occurrence of arbitrary local naming, which I
believe is a useful outcome.

Dash is a poor choice because it becomes potentially problematic
to know whether your cisco is telling you that:

2001-0db8-5f03 is a MAC address or a /48 prefix.

Cisco's expression of a MAC address is wrong anyway. Correct notation
for a MAC address is separating each byte with a colon. This has
always been a PIA for me any time I need to copy a MAC address in to
or out of a Cisco config.

Doesn't matter... It's widespread and Cisco isn't the only one to use it.
As such, doing this to IPv6 addresses is a recipe for pain.

Basically, as I recall the earlier discussions of this and the IETF
arriving at the decision to use colon (:), it boiled down to the
simple fact that colon (:) is the worst choice except for all the others.

Could have stuck with dot and forgone "::192.168.1.1". Or replaced the
v4 dot with a dash in that scenario. Could have gone with comma and
quoted the address in CSV files like you do for any text value that
isn't trivial, instead of bracketing it in much more commonly used
URLs.

We did forego ::192.168.1.1. However, we still have ::ffff:192.168.1.1
and for good reason. This is a useful construct for allowing humans
to see in log files that an IPv6-aware application on a dual-stack
machine accepted an IPv4 connection on an IPv6 socket.
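
For what that looks like to a log reader or a script, here is a minimal sketch with Python's standard ipaddress module. The function name describe_peer and the wording of its output are hypothetical, not taken from any particular application.

```python
import ipaddress

def describe_peer(text):
    """Label a logged peer address, unwrapping IPv4-mapped IPv6 addresses."""
    addr = ipaddress.ip_address(text)
    # Only IPv6Address objects carry ipv4_mapped; it is None unless the
    # address falls in ::ffff:0:0/96.
    mapped = getattr(addr, "ipv4_mapped", None)
    if mapped is not None:
        return f"IPv4 client {mapped} (seen on an IPv6 socket)"
    return f"IPv{addr.version} client {addr}"

print(describe_peer("::ffff:192.168.1.1"))  # IPv4 client 192.168.1.1 (seen on an IPv6 socket)
print(describe_peer("2001:db8::1"))         # IPv6 client 2001:db8::1
```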

For example:

260:abcde:123456:98::1

260 - IANA to ARIN, a /12
abcde - ARIN to ISP, a /32
123456 - ISP to customer, a /56
98 - customer subnet
::1 - LAN address

How do you propose to get the router to regurgitate this?

I don't. The colons float. If the address was learned dynamically, the
router can regurgitate it any way it wants.

Yeah, because it's always good for human factors when the output
showing an address bears little or no resemblance to the way the
user expects it to be written. NOT!

The question leads me to recall a fancy version of traceroute I once
used. In addition to looking up the PTR record for each hop, it also
looked up the org and AS number currently associated. If users found
it valuable to have the router present variable colon placement, it's
a doable albeit complex computing task.

Which, IMO, is a good reason the current system is superior. Fewer
surprises==better human factors engineering.

Owen

William_Herrin · November 21, 2010, 10:50pm

There is a lot of assumption on the part of ipv6 that the use of ipv6
literals in URIs would be a rather infrequent occurrence. Given how
infrequent it is in ipv4, that would seem to be a reasonable assumption.

Joel,

Looks like an ass-u-me. If you think the use of IPv4 addresses in URLs
is infrequent, it's mostly "u." Get out in the field some time.

I've yet to work for a non-ISP that (before I arrived) maintained
their internal DNS consistently vice using address literals. If the
company was small, they didn't really know how to operate a DNS
server. If large, the DNS ops were too inaccessible to be consulted on
things that weren't also being reviewed by PR for release to the
general public.

In fact, in one project I occasionally work on, the team is
-frequently- told by the DNS op for the NIPR-based DNS server how
bothered he is by the lookup count, so won't we please place commonly
used Internet names in our /etc/hosts. My jaw dropped the first time I
heard that one.

That server op is the kind of guy we're asking to understand that
there's nothing special about the two bytes between the colons in the
IPv6 address. He's gonna be trouble.

260:abcde:123456:98::1

260 - IANA to ARIN, a /12
abcde - ARIN to ISP, a /32
123456 - ISP to customer, a /56
98 - customer subnet
::1 - LAN address

What do you do when ARIN gives Tier1 a /24, and Tier1 gives Billy Bob's
Bait, Tackle, and Internet a /40, and Billy Bob gives one of their customers a /56?

Whatever you want to do. That's the point of optional/movable separators.

An option w/ movable separators:

260:abc:1234:9876:fe::1

Actual IPv6 standard (and also allowed w/ movable separators):

260a:bc12:3498:76fe::1
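
To show that the two forms above carry the same digits, here is a rough Python sketch that re-renders a standard address with separators at chosen prefix boundaries. The function float_colons and its output convention (the remaining host part printed after "::" with its leading zeros dropped) are assumptions made for this illustration, not part of any standard or of the proposal as written.

```python
import ipaddress

def float_colons(text, boundaries):
    """Render an IPv6 address with separators at the given prefix lengths.

    boundaries are prefix lengths in bits and must fall on nibble
    boundaries (multiples of 4), in increasing order.
    """
    if any(b % 4 for b in boundaries):
        raise ValueError("boundaries must be multiples of 4 bits")
    # 32 hex digits with the standard colons removed.
    digits = ipaddress.ip_address(text).exploded.replace(":", "")
    cuts = [0] + [b // 4 for b in boundaries] + [32]
    groups = [digits[a:b] for a, b in zip(cuts, cuts[1:])]
    # Assumption: the last group is right-justified after "::".
    return ":".join(groups[:-1]) + "::" + groups[-1].lstrip("0")

# /12, /24, /40, /56 and /64 boundaries.
print(float_colons("260a:bc12:3498:76fe::1", [12, 24, 40, 56, 64]))
# 260:abc:1234:9876:fe::1

# /12, /32, /56 and /64 boundaries.
print(float_colons("260a:bcde:1234:5698::1", [12, 32, 56, 64]))
# 260:abcde:123456:98::1
```

Note that the renderer has to be told where the boundaries are; nothing in the address itself records them.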

Imea nrea lly, what ifwe wrot eEng lish thew aywe writ eIPv 6add ress
es? Looks pretty stupid without a floating separator, doesn't it?

If this were prose, sure. It isn't. It's an addressing scheme. I mean,
really, we don't question 99999-1520 or 408-555-1212 which
are much more like what we're talking about.

In fact, it would look pretty weird to most people if we started writing
951-21-42-33 (or I bet they wouldn't expect that was a zip code in
any case). Similarly, if we start placing the separators in arbitrary
places in phone numbers, people get confused.

That would be a more compelling argument if it accurately described
phone number notation. It doesn't. "+44 121 410 5228," for example, is
the phone number for parking services at Heathrow airport, exactly as
described on Heathrow's "contact us" page. No dashes at all, and not 10
digits.

And BTW, 408-555-1212 isn't arbitrarily separated. Each component has
a specific meaning. Long distance region 408, telco reserved prefix
555, long distance information 1212.

The Zip code's components also have meaning. The left 5 digits
indicate the specific post office and the right 4 digits usually
specify the internal box number used for sorting the mail.

Even IPv4's dot separators were placed in meaningful locations in the
original Classful design. The network address was always the whole
content to the left of one of the dots while the host address was
always the whole content to the right. Unless the network was complex
enough to have a subnet address in the middle, still confined by the
dots. It's an anachronism now, but the separators were originally
important.
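
As a reminder of how the classful split worked, here is a minimal Python sketch. The function name classful_network is made up for this illustration, and subnetting within a network is left out.

```python
import ipaddress

def classful_network(text):
    """Return the classful network (class A, B or C) of an IPv4 address."""
    addr = ipaddress.IPv4Address(text)
    first_octet = int(addr) >> 24
    if first_octet < 128:      # class A: network ends at the first dot
        prefix = 8
    elif first_octet < 192:    # class B: network ends at the second dot
        prefix = 16
    elif first_octet < 224:    # class C: network ends at the third dot
        prefix = 24
    else:
        raise ValueError("class D/E addresses have no network/host split")
    # strict=False masks off the host part.
    return ipaddress.ip_network(f"{addr}/{prefix}", strict=False)

print(classful_network("10.1.2.3"))     # 10.0.0.0/8
print(classful_network("172.16.5.4"))   # 172.16.0.0/16
print(classful_network("192.168.1.1"))  # 192.168.1.0/24
```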

IPv6 is one of very few addressing schemes in which the separators
intentionally have no greater meaning within the protocol or its use.
They're just there. If we want folks to understand that difference
from their normal experience with addressing notations, we'll have to
call attention to it by, for example, leaving the byte groupings
formally unnamed.

Dash is a poor choice because it becomes potentially problematic
to know whether your cisco is telling you that:
2001-0db8-5f03 is a MAC address or a /48 prefix.

Cisco's expression of a MAC address is wrong anyway. Correct notation
for a MAC address is separating each byte with a colon.

Doesn't matter... It's widespread and Cisco isn't the only one to use it.

Just for my own edification, who else besides Cisco do you know who
uses that notation for MAC addresses? I want some convincing before
I'll accept the claim that it's widespread.

-Bill

George_Bonser · November 21, 2010, 11:14pm

An option w/ movable separators:

260:abc:1234:9876:fe::1

Actual IPv6 standard (and also allowed w/ movable separators):

260a:bc12:3498:76fe::1

The problem with movable separators is in handling zeros. If the
separators are a known distance apart, zeros can be deduced. The
example above has only one zero. Imagine it were a different address:

260a:0:8:65::1

Now with movable separators the zeros would have to be explicit because
you don't know how far apart the colons are in the address. In your
example, it would become:

260:a00:0000:0800:65::1 because there is no way to tell from a movable
colon address how many places the colon was moved and how many zeros
there are between them. You can move colons as long as zeros are
explicit but you must have fixed colons if zeros are implicit.
Otherwise there is no way to deduce where the zeros go from simply
parsing the address.
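
A rough Python sketch of that point, under one assumed reading of the movable-separator idea: single colons are ignored, and "::" means "fill with zeros up to 32 hex digits". The function parse_floating is hypothetical and exists only to show why the zeros must be written out.

```python
import ipaddress

def parse_floating(text):
    """Parse a hypothetical floating-colon address.

    Single colons carry no meaning; "::" (at most once) is zero fill.
    """
    left, sep, right = text.partition("::")
    head = left.replace(":", "")
    tail = right.replace(":", "")
    fill = 32 - len(head) - len(tail)
    if fill < 0 or (not sep and fill != 0):
        raise ValueError("wrong number of hex digits")
    digits = head + "0" * fill + tail
    return ipaddress.IPv6Address(int(digits, 16))

standard = ipaddress.ip_address("260a:0:8:65::1")

# With every zero written out, the floating form means the same address.
print(parse_floating("260:a00:0000:0800:65::1") == standard)  # True

# With zeros left implicit, the digits simply run together.
print(parse_floating("260a:0:8:65::1"))  # 260a:865::1
```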