NAT64 and matching identities

It's looking more and more like NAT64 will be in our future. One of the valid concerns for NAT64 - much like NAT44 - is being able to determine the identity of a given user through the NAT at a given point in time.
How feasible this is depends on how robust and scalable $XYZ's translation logging capabilities are, and possibly on how easily that data can be matched against a source of identity information, such as RADIUS accounting logs, DHCP lease logs, etc.

Other IPv6 transition mechanisms appear to be no less thorny than NAT64 for a variety of reasons.

I'm curious to see how others are planning to tackle (or already have tackled) this issue. Discussing vendor-specific solutions is fine, but I think keeping things as platform/vendor agnostic as possible for the time being would make this thread more beneficial to a wider audience.

The floor is open...

jms

For logging, the following IETF Behave WG drafts are nearly complete. The IPFIX version will be updated soon (I hope) to more closely match the syslog-based one. Both will match the new NAT MIB document, also listed below:

http://datatracker.ietf.org/doc/draft-ietf-behave-ipfix-nat-logging/

http://datatracker.ietf.org/doc/draft-ietf-behave-syslog-nat-logging/

http://datatracker.ietf.org/doc/draft-ietf-behave-nat-mib/

There is also work being done on reducing log volumes by bulk allocation of ports. The following drafts will be combined to meet a Sunset WG milestone:

http://datatracker.ietf.org/doc/draft-chen-sunset4-cgn-port-allocation/

http://datatracker.ietf.org/doc/draft-tsou-behave-natx4-log-reduction/

http://datatracker.ietf.org/doc/draft-donley-behave-deterministic-cgn/

Tom Taylor
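
To illustrate why the deterministic approach in the last draft above reduces logging, here is a minimal sketch, not taken from any of the drafts. It assumes each subscriber is identified by a simple integer index and gets one fixed block of ports on one external address. The names `forward_map`, `reverse_map`, and `FIRST_PORT`, and the block layout itself, are hypothetical choices for illustration.

```python
import ipaddress

FIRST_PORT = 1024  # assumption: ports below this are never handed out


def blocks_per_address(ports_per_subscriber):
    """Number of subscriber port blocks that fit on one external address."""
    return (65536 - FIRST_PORT) // ports_per_subscriber


def forward_map(subscriber_index, pool, ports_per_subscriber):
    """Return (external_ip, first_port, last_port) for a subscriber index."""
    per_addr = blocks_per_address(ports_per_subscriber)
    addr_idx, block_idx = divmod(subscriber_index, per_addr)
    if addr_idx >= len(pool):
        raise ValueError("external pool too small for this subscriber index")
    lo = FIRST_PORT + block_idx * ports_per_subscriber
    return pool[addr_idx], lo, lo + ports_per_subscriber - 1


def reverse_map(external_ip, port, pool, ports_per_subscriber):
    """Recover the subscriber index from an observed external ip:port."""
    if port < FIRST_PORT:
        raise ValueError("port is below the allocatable range")
    per_addr = blocks_per_address(ports_per_subscriber)
    block_idx = (port - FIRST_PORT) // ports_per_subscriber
    if block_idx >= per_addr:
        raise ValueError("port falls outside the allocated blocks")
    addr_idx = pool.index(ipaddress.ip_address(external_ip))
    return addr_idx * per_addr + block_idx


pool = [ipaddress.ip_address("192.0.2.1"), ipaddress.ip_address("192.0.2.2")]
mapping = forward_map(33, pool, 2016)
subscriber = reverse_map("192.0.2.2", 4000, pool, 2016)
```

Because the mapping is a pure function of the configuration, answering "who had this address and port" only needs the configuration in force at that time, not per-flow records.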

MSOs logging subscriber flows, what could possibly go wrong?

Drive slow, like a Sandvine under load,
Paul Wall

> It's looking more and more like NAT64 will be in our future. One of the
> valid concerns for NAT64 - much like NAT44 - is being able to determine
> the identity of a given user through the NAT at a given point in time.

Bulk port allocation. Your NAT logs then approximate your DHCP (or
whatever) logs in size and scope.

Unless you mean to use it to front a web service. Then just use
x-forwarded-for, and make sure your logs and log parsers can handle it.
Might want to write a correlation script.
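
A correlation script for the web-service case might start from something like the sketch below. It assumes the front end appends to X-Forwarded-For and that you know which proxy ranges you trust; `client_from_xff` and its parameters are hypothetical names, not part of any particular log parser.

```python
import ipaddress


def client_from_xff(xff_header, peer_addr, trusted_proxies):
    """Walk the X-Forwarded-For chain from the right and return the first
    address that is not one of our own (trusted) proxies."""
    trusted = [ipaddress.ip_network(net) for net in trusted_proxies]

    def is_trusted(addr):
        # Membership is False when address and network families differ.
        return any(addr in net for net in trusted)

    chain = [ipaddress.ip_address(part.strip())
             for part in xff_header.split(",") if part.strip()]
    chain.append(ipaddress.ip_address(peer_addr))
    for addr in reversed(chain):
        if not is_trusted(addr):
            return addr
    # Everything was a trusted proxy; fall back to the leftmost entry.
    return chain[0]


client = client_from_xff("2001:db8::10, 192.0.2.7", "10.0.0.5",
                         ["10.0.0.0/8", "192.0.2.0/24"])
```

Walking from the right matters because anything to the left of the first untrusted hop can be forged by the client.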

> How feasible this is depends on how robust/scalable $XYZ's translation
> logging capabilities are, and possibly how easily that data can be matched
> against a source of identity information, such as RADIUS accounting logs,
> DHCP lease logs, etc.

Ask the vendors; it took them a while, but they all have techniques for
reducing logs.

> Other IPv6 transition mechanisms appear to be no less thorny than NAT64
> for a variety of reasons.

Yes; see RFC 7021.

Once you've deployed it, an experience report at a NANOG meeting would be
welcome.

Lee

Some of us who worked on the NAT64/DNS64 combination were content that
it was a long way from the perfect solution. The idea I at least had
was to get something that mostly worked most of the time, and was
simple enough that anyone could basically understand it.
Nevertheless, I have to admit that it's a pig.

That piggishness was not something I wanted to get rid of. I thought
(and still think) that if the transition mechanisms are awful enough,
it will encourage moving things to v6 for real so that we can get rid
of the kludges. Perhaps this is wishful thinking, however.

In any case, I'm sorry to have contributed in some little way to this
headache of yours.

Best,

A

> It's looking more and more like NAT64 will be in our future.
> One of the valid concerns for NAT64 - much like NAT44 - is being
> able to determine the identity of a given user through the NAT
> at a given point in time.
> How feasible this is depends on how robust/scalable $XYZ's
> translation logging capabilities are, and possibly how easily that
> data can be matched against a source of identity information,
> such as RADIUS accounting logs, DHCP lease logs, etc.

... snip ...

We implemented a product around this. What we found in doing
so was that (a) you need to use port-block allocation to make it feasible
(you cannot do unbounded NAPT where every flow gets its own port),
(b) AAA works well when the NAT is a gateway device, and
(c) otherwise DHCP works OK, with syslog as the fallback. All devices
supported one of those three.

We also found there was a need for IPv6 identification: some
customers used DNS reverse lookup in IPv4 to find the ID of a user
for single-sign-on type solutions, and this no longer worked
in a NAT44/NAT64/IPv6 environment.

We found there was a need for both real-time queries (who is
this right now, e.g. for sign-on) and after-the-fact queries (who
had this address at this time).
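
The after-the-fact query can be sketched as a lookup over port-block lease records. This is only an illustration under assumed field names (`PortBlockLease`, `who_had`, and the record layout are hypothetical, not the product's schema); a real system would index the records rather than scan them.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass(frozen=True)
class PortBlockLease:
    subscriber: str
    external_ip: str
    port_lo: int
    port_hi: int
    start: datetime
    end: Optional[datetime]  # None while the block is still allocated


def who_had(leases, external_ip, port, when):
    """Return the subscriber holding external_ip:port at time `when`, or None."""
    for lease in leases:
        if lease.external_ip != external_ip:
            continue
        if not (lease.port_lo <= port <= lease.port_hi):
            continue
        if when < lease.start:
            continue
        if lease.end is not None and when >= lease.end:
            continue
        return lease.subscriber
    return None


leases = [
    PortBlockLease("sub-a", "192.0.2.1", 1024, 2047,
                   datetime(2013, 11, 21, 8, 0), datetime(2013, 11, 21, 12, 0)),
    PortBlockLease("sub-b", "192.0.2.1", 1024, 2047,
                   datetime(2013, 11, 21, 12, 0), None),
]
earlier = who_had(leases, "192.0.2.1", 1500, datetime(2013, 11, 21, 9, 30))
current = who_had(leases, "192.0.2.1", 1500, datetime(2013, 11, 21, 19, 0))
```

The real-time query is the same lookup restricted to leases whose `end` is still `None`.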

The general-purpose coordinates we called 'session qualifiers', and
we found that sometimes they included VLAN or MPLS or other
tunnels.

Let me know if you want more info and I can follow up offline.

Speaking as one of the co-authors of RFC 6052, 6144, and 6145...

I'm actually not sorry. The predecessor to RFCs 6052/6144/6145/6146/6147 was NAT-PT, which didn't work very well in part due to a nasty coupling (see RFC 4966). It's pretty straightforward to insert an IPv4 address into a specified IPv6 prefix (RFC 6052), and use that to statelessly translate between an IPv4 address and an RFC 6052 address (RFC 6145), or to statefully translate a random IPv6 address into an IPv4 space much in the way IPv4/IPv4 translation works (RFC 6146). What is hard is statefully translating from IPv4 to a generic IPv6 address - it's hard to compress 128 bits of information into 32 bits. NAT-PT does it by having the DNS lookup temporarily assign an IPv4 address to the IPv6 device and inform the translator of the translation. http://tools.ietf.org/html/draft-anderson-siit-dc (which Tore didn't, to my knowledge, try to get turned into an RFC, although I'd be willing to discuss that with v6ops) does it by pre-assigning address pairs, enabling an IPv6-only domain to be accessed from an IPv4-only domain by a defined translation between the two for a small set of servers.
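
As a concrete illustration of inserting an IPv4 address into an IPv6 prefix, the sketch below handles only the simplest RFC 6052 case, a /96 prefix, where the IPv4 address occupies the low 32 bits. It uses the well-known prefix 64:ff9b::/96 as the default. The function names `embed` and `extract` are hypothetical, and shorter prefix lengths (which place the bits differently) are deliberately not covered.

```python
import ipaddress

WELL_KNOWN_PREFIX = ipaddress.ip_network("64:ff9b::/96")


def embed(ipv4, prefix=WELL_KNOWN_PREFIX):
    """Embed an IPv4 address in the low 32 bits of a /96 IPv6 prefix."""
    if prefix.prefixlen != 96:
        raise ValueError("this sketch only handles /96 prefixes")
    v4 = ipaddress.IPv4Address(ipv4)
    return ipaddress.IPv6Address(int(prefix.network_address) | int(v4))


def extract(ipv6, prefix=WELL_KNOWN_PREFIX):
    """Recover the embedded IPv4 address from a /96-based IPv6 address."""
    if prefix.prefixlen != 96:
        raise ValueError("this sketch only handles /96 prefixes")
    addr = ipaddress.IPv6Address(ipv6)
    if addr not in prefix:
        raise ValueError("address is not inside the translation prefix")
    return ipaddress.IPv4Address(int(addr) & 0xFFFFFFFF)


mapped = embed("192.0.2.33")
original = extract(mapped)
```

The mapping is stateless in both directions, which is why it is the easy part; the hard part described above is going from an arbitrary 128-bit address to 32 bits.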

I'm all for helping people to transition. Where I get a little crazy is when the so-called transition tool makes them comfortable enough that they think they don't need to. What I expect to see in the IETF over the coming few years - and which I see in detail coming from several <nationality> competitors and their <nationality> network customers now - is a series of ideas of the form "but people with ancient IPv4-only hosts are having trouble with the IPv6 network; let's do this *temporary* patch to ease their pain". I submit that the best way to ease their pain is to upgrade their hosts. They will have to deal with it at some point.

It depends on which direction you are translating in:

IPv6-only host to IPv4 Internet: This isn't a problem if you are dual-stack at the host, but if you really do have ip6-only hosts, you aren't looking at any requirement that is different from LSN44 or providing an IPv6 tunnel broker service (like he.net). Since NAT64 is necessarily predicated on a DNS64 operation, and you know who you gave an IP address to because they logged in (in some fashion) so you could bill them, you can log {subID,src_ip6,xlat_ip4:port,dst_ip4:port,fqdn} using syslog or ipfix (in as little as one message, depending on the AAA and IPAM architecture) and invest in log servers. Port block allocation and deterministic schemes are possible here as well, but really, the only way to know you aren't going to be surprised by a lost or inaccurate data set under subpoena is to just log everything and write it off as a statutory expense.
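
A single-message log record along those lines could look like the sketch below. The field set follows the tuple named above; the key=value line format, the class name `XlatRecord`, and the split of address and port into separate fields are assumptions made for illustration, not a syslog or IPFIX template from any of the drafts.

```python
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class XlatRecord:
    sub_id: str
    src_ip6: str
    xlat_ip4: str
    xlat_port: int
    dst_ip4: str
    dst_port: int
    fqdn: str

    def to_line(self):
        """Serialize as space-separated key=value pairs (one log message)."""
        return " ".join(f"{key}={value}" for key, value in asdict(self).items())

    @classmethod
    def from_line(cls, line):
        """Parse a line produced by to_line() back into a record."""
        fields = dict(item.split("=", 1) for item in line.split())
        fields["xlat_port"] = int(fields["xlat_port"])
        fields["dst_port"] = int(fields["dst_port"])
        return cls(**fields)


record = XlatRecord("sub-42", "2001:db8::10", "192.0.2.1", 3040,
                    "198.51.100.7", 80, "www.example.com")
line = record.to_line()
```

Keeping ports in their own fields avoids ambiguity with the colons inside IPv6 addresses when the line is parsed later.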

There is obviously a long tail of ip4 destinations, but nearly all of 500 of the Alexa global 500 have ip6 listeners, so the majority of your connections from ip6 only hosts should be leaving your network without NAT and if they aren't, you should figure out why as part of reassessing the problem.

IPv6 Internet to IPv4-only host: Just do LB64 with an IP proxy. Most commercial SLB/ADC vendors do this today and offer varying degrees of ALG to fix up protocols that have multiple channels. Your server doesn't need to know that there is an IPv6 portion of the connection unless it is doing something absurd like trying to initiate connections to IPv6-only hosts, and the ADC will help you deal with that as well. Conveying the xlat information is protocol specific: HTTP and SIP are super easy, since that same ADC will do header inserts with the original client IP; others might not be. By not having dual-stack applications, you are committing yourself to the tedium of protocol-by-protocol fix-ups. You can help out that particular headache by using name lookups instead of address lookups (getaddrinfo instead of gethostbyaddr on POSIX systems).
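
For reference, a family-agnostic lookup with getaddrinfo looks roughly like this in Python, whose `socket.getaddrinfo` wraps the POSIX call. The helper name `resolve_all` is hypothetical, and the example addresses are documentation literals so that no DNS query is needed.

```python
import socket


def resolve_all(host, port):
    """Return (address_family, address) pairs for host, IPv4 and IPv6 alike."""
    infos = socket.getaddrinfo(host, port,
                               family=socket.AF_UNSPEC,
                               type=socket.SOCK_STREAM)
    return [(family, sockaddr[0])
            for family, _socktype, _proto, _canonname, sockaddr in infos]


v4_results = resolve_all("192.0.2.1", 80)
v6_results = resolve_all("2001:db8::1", 80)
```

An application that iterates over these results and connects to whichever one works does not need to care which address family it ends up using.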

Much of our initial deployment will be dual-stack; however, I also want to plan for situations where we won't have enough v4 addresses to dual-stack (or we reach a point where we need to hold some of our routable v4 space in reserve for transition mechanisms). Dual-stack on its own also provides no incentive for users to migrate completely to v6.

That said, I need to plan for the eventuality of v6-only hosts being able to reach the v4 Internet. While many of the Alexa global 500 sites have some sort of v6 capability today and the percentage of global Internet traffic that is v6 is increasing every day, the need for reachability to what remains of the v4 Internet will not go away any time soon.

jms

Leaving out stuff . . .

Yo Lee!

Right, weighted by DNS queries.
Compare to http://www.vyncke.org/ipv6status/detailed.php?country=us
and http://www.employees.org/~dwing/aaaa-stats/

Not equivalent to "nearly all of Alexa 500."

Lee

Yo Lee!

> >There is obviously a long tail of ip4 destinations, but nearly all
> >of 500 of the Alexa global 500 have ip6 listeners,
>
> Do you have a data source for that? I see no indication of IPv6
> listeners on 85% of the top sites.

A slightly different metric, 44% of USA content available on IPv6:

Cisco IPv6 Lab: IPv6 Deployment

I'm puzzled; I have native v6 connectivity
to 6lab.cisco.com according to traceroute6
output, and yet the page says I'm connecting
to it via IPv4. :(
So, I did some poking; it seems 6lab.cisco.com
doesn't have working IPv6 for their stats system,
which makes me wonder how accurate the data
from it is likely to be:

mpetach@mintyHP:~> telnet -6 6lab.cisco.com 80
Trying 2001:420:4420:101:0:c:15c0:4664...
Connected to 6lab.cisco.com.
Escape character is '^]'.
GET / HTTP/1.0

HTTP/1.1 302 Found
Server: Apache/2.2.16 (Debian)
Location: Cisco 6lab IPv6 stats widget
Cache-Control: max-age=1
Expires: Thu, 21 Nov 2013 19:03:30 GMT
Vary: Accept-Encoding
Content-Length: 295
Connection: close
Content-Type: text/html; charset=iso-8859-1

<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">
<html><head>
<title>302 Found</title>
</head><body>
<h1>Found</h1>
<p>The document has moved <a href="Cisco 6lab IPv6 stats widget
">here</a>.</p>
<hr>
<address>Apache/2.2.16 (Debian) Server at 6lab-stats.com Port 80</address>
</body></html>
Connection closed by foreign host.
mpetach@mintyHP:~> telnet -6 6lab-stats.com 80
Trying 2001:420:81:101:0:c:15c0:4664...
telnet: Unable to connect to remote host: Connection timed out
mpetach@mintyHP:~>
mpetach@mintyHP:~> ping6 6lab-stats.com
PING 6lab-stats.com(2001:420:81:101:0:c:15c0:4664) 56 data bytes
^C
--- 6lab-stats.com ping statistics ---
10 packets transmitted, 0 received, 100% packet loss, time 9071ms

mpetach@mintyHP:~>

It was a stale DNS entry. Now fixed (modulo TTLs and such), thanks.

That said, your troubleshooting was troubleshooting a different
problem, not your browser's inability to retrieve the page. The way
the browser sends the request is something like this (note the HTTP
version and the host header):

ayourtch@mcmini:~$ telnet -6 6lab.cisco.com 80
Trying 2001:420:4420:101:0:c:15c0:4664...
Connected to 6lab.cisco.com.
Escape character is '^]'.
GET / HTTP/1.1
Host: 6lab.cisco.com

HTTP/1.1 302 Found
Server: Apache/2.2.16 (Debian)
X-Frame-Options: SAMEORIGIN
Location: http://6lab.cisco.com/index.php
Cache-Control: max-age=1
Expires: Thu, 21 Nov 2013 19:38:32 GMT
Vary: Accept-Encoding
Content-Length: 295
Content-Type: text/html; charset=iso-8859-1

<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">
<html><head>
<title>302 Found</title>
</head><body>
<h1>Found</h1>
<p>The document has moved <a
href="http://6lab.cisco.com/index.php">here</a>.</p>
<hr>
<address>Apache/2.2.16 (Debian) Server at 6lab.cisco.com Port 80</address>
</body></html>

Anyway, the fact that you were still able to retrieve the original
reply with a redirect makes me think that there could be a PMTUD
problem somewhere in between 6lab and yourself for retrieving the
larger content ...

If you tell me your client address I will be able to test this theory. Or
you can quickly tweak your local interface MTU to 1280 and if that
works, then tell me your client address so I can debug from the
other side.

--a

Lee Howard wrote:
...

>> >There is obviously a long tail of ip4 destinations, but nearly all
>> >of 500 of the Alexa global 500 have ip6 listeners,
>>
>> Do you have a data source for that? I see no indication of IPv6
>> listeners on 85% of the top sites.
>
>A slightly different metric, 44% of USA content available on IPv6:
>
>Cisco IPv6 Lab: IPv6 Deployment

Right, weighted by DNS queries.
Compare to http://www.vyncke.org/ipv6status/detailed.php?country=us
and http://www.employees.org/~dwing/aaaa-stats/

Not equivalent to "nearly all of Alexa 500."

Using a derivative of Dan Wing's code from a couple of years back I get:

The top 5 websites: AAAA records and IPv6 connectivity
           count with A: 5 (100.000%)
        count with AAAA: 4 ( 80.000%)
Of the 4 hosts with AAAA records, testing connectivity to TCP/80:
     count with IPv6 ok: 4 (100.000%)

The top 10 websites: AAAA records and IPv6 connectivity
           count with A: 10 (100.000%)
        count with AAAA: 6 ( 60.000%)
Of the 6 hosts with AAAA records, testing connectivity to TCP/80:
     count with IPv6 ok: 6 (100.000%)

The top 25 websites: AAAA records and IPv6 connectivity
           count with A: 25 (100.000%)
        count with AAAA: 10 ( 40.000%)
Of the 10 hosts with AAAA records, testing connectivity to TCP/80:
     count with IPv6 ok: 10 (100.000%)

The top 50 websites: AAAA records and IPv6 connectivity
           count with A: 50 (100.000%)
        count with AAAA: 21 ( 42.000%)
Of the 21 hosts with AAAA records, testing connectivity to TCP/80:
     count with IPv6 ok: 21 (100.000%)

The top 100 websites: AAAA records and IPv6 connectivity
           count with A: 98 ( 98.000%)
        count with AAAA: 30 ( 30.000%)
Of the 30 hosts with AAAA records, testing connectivity to TCP/80:
     count with IPv6 ok: 30 (100.000%)

The top 250 websites: AAAA records and IPv6 connectivity
           count with A: 248 ( 99.200%)
        count with AAAA: 56 ( 22.400%)
Of the 56 hosts with AAAA records, testing connectivity to TCP/80:
     count with IPv6 ok: 56 (100.000%)

The top 500 websites: AAAA records and IPv6 connectivity
           count with A: 494 ( 98.800%)
        count with AAAA: 91 ( 18.200%)
Of the 91 hosts with AAAA records, testing connectivity to TCP/80:
     count with IPv6 ok: 91 (100.000%)

The top 1000 websites: AAAA records and IPv6 connectivity
           count with A: 990 ( 99.000%)
        count with AAAA: 132 ( 13.200%)
Of the 132 hosts with AAAA records, testing connectivity to TCP/80:
     count with IPv6 ok: 132 (100.000%)

The top 2500 websites: AAAA records and IPv6 connectivity
           count with A: 2479 ( 99.160%)
        count with AAAA: 216 ( 8.640%)
Of the 216 hosts with AAAA records, testing connectivity to TCP/80:
     count with IPv6 ok: 214 ( 99.074%)

The top 5000 websites: AAAA records and IPv6 connectivity
           count with A: 4959 ( 99.220%)
        count with AAAA: 354 ( 7.083%)
Of the 354 hosts with AAAA records, testing connectivity to TCP/80:
     count with IPv6 ok: 347 ( 98.023%)

The top 10000 websites: AAAA records and IPv6 connectivity
           count with A: 9918 ( 99.230%)
        count with AAAA: 600 ( 6.003%)
Of the 600 hosts with AAAA records, testing connectivity to TCP/80:
     count with IPv6 ok: 575 ( 95.833%)

Original code developed by dwing@employees.org.
manual run by tony on arabian.tndh.net using ./IPv6-check .
on Fri Nov 22 09:48:17 PST 2013 (elapsed: 00:08:33, t: 15).
Top 10000 websites based on Alexa top-1m.csv.

I question how one can have a top 100 website without an A record.

I am inclined to believe there is a bug in there somewhere.

Owen

Statistics whoopsie, or are there actually 2 sites in the top 100
that are IPv6-only?

IN CNAME? Or is that being accounted for?

It would be way more than 2 if it were CNAME, methinks.

Owen

The only thing it explicitly strips out are dotted-quads, which don't occur
until # 4255. The code makes five passes at getaddrinfo() for IPv4 before
giving up, and then it checks for a leading www and if that exists it strips
it off and does the 5 tries loop again, then later the same process for
IPv6. For the top 100 run:
akamaihd.net no IPv4 no IPv6
bp.blogspot.com no IPv4 no IPv6
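
The lookup procedure described above can be sketched as follows. This is a reconstruction from the description only, not the actual script: the function name `has_address`, the injectable `resolver` parameter, and the use of port 80 in the lookup are assumptions.

```python
import socket


def has_address(name, family, tries=5, resolver=socket.getaddrinfo):
    """Try getaddrinfo up to `tries` times for `name`; if that fails and the
    name has a leading "www.", strip it and run the same loop again."""
    candidates = [name]
    if name.startswith("www."):
        candidates.append(name[len("www."):])
    for candidate in candidates:
        for _ in range(tries):
            try:
                if resolver(candidate, 80, family, socket.SOCK_STREAM):
                    return True
            except socket.gaierror:
                continue
    return False


def check(name):
    """Run the IPv4 pass first, then the same process for IPv6."""
    return has_address(name, socket.AF_INET), has_address(name, socket.AF_INET6)
```

Under this logic a name such as akamaihd.net, which is a parent domain rather than a host with its own records, reports neither IPv4 nor IPv6, consistent with the two entries listed above.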

FWIW :::
Dotted-quads in the top 10,000
4255,92.242.195.24
4665,1.1.1.1
5079,92.242.195.231
6130,1.254.254.254
9518,208.98.30.70
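
Picking the dotted-quads out of a rank,name list like the Alexa CSV can be done with a few lines. A minimal sketch, assuming one `rank,name` pair per line; the function name `dotted_quads` is hypothetical and this is not the code used for the run above.

```python
import ipaddress


def dotted_quads(lines):
    """Return (rank, name) for entries whose name is a literal IPv4 address."""
    found = []
    for line in lines:
        rank, _, name = line.strip().partition(",")
        try:
            ipaddress.IPv4Address(name)
        except ValueError:
            continue  # an ordinary hostname (or a blank line)
        found.append((int(rank), name))
    return found


sample = ["1,google.com", "4255,92.242.195.24", "4665,1.1.1.1",
          "", "99,akamaihd.net"]
quads = dotted_quads(sample)
```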

whois 92.242.195.24

...
netname: Respina
descr: BroadBand IP Pool
country: IR
...
route: 92.242.195.0/24

Respina BroadBand IP Pool in the top 100,000
4255,92.242.195.24
5079,92.242.195.231
10059,92.242.195.233
23912,92.242.195.30
31520,92.242.195.111
35867,92.242.195.235
95233,92.242.195.129