Carrier Grade NAT

Tony_Wicks · July 29, 2014, 9:28pm

OK, as someone with experience running CGNAT to fixed broadband customers in
general, here are a few answers to common questions. This is based on the
setup I use, in which CGNAT is done on the BNG (Cisco ASR1K6).

1. APNIC ran out of IPv4 a couple of years ago, so unless you want to pay
USD $10+ per IP then CGNAT is the only option.
2. IPv6 is nice (dual stack) but the internet without IPv4 is not a viable
thing, perhaps one day, but certainly not today (I really hate clueless
people who shout to the hills that IPv6 is the "solution" for today's
internet access)
3. 99.99% of customers don't notice they are transiting CGNAT, it just
works.
4. You need to log NAT translations for LI (lawful intercept) purposes
(source/destination IP, source/destination port, time). Surprisingly this
does not produce that big a database burden. However, as Cisco's Netflow NAT
logging is utterly useless, you need to use syslog, and this ramps up the
ASR CPU a bit.
5. NAT translation timeouts are important, XBOX and PlayStation suck.
6. 10,000 customers= approximately 200,000 active translations and 1-2
/24's to be comfortable
7. CGNAT protects your customers from all sorts of nasties like small DDOS
attacks and attacks on their crappy CPE
8. DDOS attacks on CGNAT pool IPs are a pain in the rear and happen often.
9. In New Zealand we are not a state of the USA so spammed DMCA emails can
be redirected to /dev/null. If a rights holder wishes to have a potential
violation investigated (translation logs) they need to pay a $25 fee, so in
general they don't bother. Police need a search warrant so they generally
only ask for user info when they actually can justify it, so it's not a big
overhead.
10. It is not uncommon for people who run some game servers and websites
(like banks) to be completely clueless/confused about CGNAT and randomly
block IPs as large numbers of users connect from a single IP. This is not a
big issue in practice.
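
As a back-of-envelope check on point 6, here is a small Python sketch of the port budget. The 10,000 customers, 200,000 active translations and one or two /24s come from the list above; the usable port range and the idea that each translation occupies exactly one public IP:port pair are simplifying assumptions made here for illustration, not a description of how the ASR allocates ports.

```python
# Back-of-envelope CGNAT port budget.
# Assumptions (not from the post): every active translation occupies one
# public IP:port pair, and ports below 1024 are never handed out.
USABLE_PORTS_PER_IP = 65536 - 1024


def pool_capacity(pool_prefixes_24):
    """Total IP:port pairs in a pool made of N /24s (all 256 addresses used)."""
    return pool_prefixes_24 * 256 * USABLE_PORTS_PER_IP


def pool_utilisation(active_translations, pool_prefixes_24):
    """Fraction of the pool's IP:port pairs currently in use."""
    return active_translations / pool_capacity(pool_prefixes_24)


def ports_per_customer(customers, pool_prefixes_24):
    """Ports each customer could hold if the pool were split evenly."""
    return pool_capacity(pool_prefixes_24) // customers


if __name__ == "__main__":
    for prefixes in (1, 2):
        used = pool_utilisation(200_000, prefixes)
        share = ports_per_customer(10_000, prefixes)
        print(f"{prefixes} x /24: {used:.1%} of ports in use, "
              f"{share} ports available per customer")
```

Under these assumptions the average load is a small fraction of the pool, so the sizing is driven by headroom for heavy users and bursts rather than by the mean.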

cheers

Chris_Boyd1 · July 29, 2014, 10:08pm

True, but there is a difference in this case, since I could probably find a way to do discovery of the warrant/subpoena that was delivered to the ISP--assuming it's not an NSL. I would assume that going into court with evidence of the warrant/subpoena would be sufficient to grant standing. Or the notice of intercepted communications that I've seen a few times would work too.

In $DAYJOB, we're all colo/cloud, so the stuff we get specifies a specific date. Have not come across any that specify a few seconds of time as another poster noted.

In any case IANAL, so who knows until the cases start showing up on the dockets.....

--Chris

Lee_Howard2 · July 29, 2014, 10:19pm

Thanks for sharing your experience; it's very unusual to get the
perspective of an operator running CGN (on a broadband ISP; wireless has
always had it).

>OK, as someone with experience running CGNAT to fixed broadband customers in
>general, here are a few answers to common questions. This is based on the
>setup I use which is CGNAT is done on the BNG (Cisco ASR1K6).
>
>1. APNIC ran out of IPv4 a couple of years ago, so unless you want to pay
>USD $10+ per IP then CGNAT is the only option.

Eh, a bit over US$7 now, but whatever. Higher in APNIC.

>2. IPv6 is nice (dual stack) but the internet without IPv4 is not a viable
>thing, perhaps one day, but certainly not today (I really hate clueless
>people who shout to the hills that IPv6 is the "solution" for today's
>internet access)

It's viable, it's just not a substitute for IPv4 yet.
Except for specific scenarios. For instance, you mention gaming below; if
two users are playing on Xbox ONE, they can use IPv6 and they're off the
CGN. Or if a bank has blacklisted an IPv4 address on the CGN, but the
bank is dual-stack, some users can still get there.
Of course, that snowballs.

>3. 99.99% of customers don't notice they are transiting CGNAT, it just
>works.

Surprised it's that high.

>4. You need to log NAT translations for LI purposes. (IP source/destination,
>Port source/destination, time) Surprisingly this does not produce that big a
>database burden. However as Cisco's Netflow NAT logging is utterly useless
>you need to use syslog and this ramps up the ASR CPU a bit.

Can you quantify?
The log entry has to be at least:
32 bits source address
16 bits source port
32 bits destination address
16 bits destination port
64 bits? timestamp

_Matt_Palmer · July 29, 2014, 10:42pm

Thanks for sharing your experience; it's very unusual to get the
perspective of an operator running CGN (on a broadband ISP; wireless has
always had it).

>OK, as someone with experience running CGNAT to fixed broadband customers
>in
>general, here are a few answers to common questions. This is based on the
>setup I use which is CGNAT is done on the BNG (Cisco ASR1K6).
>
>1. APNIC ran out of IPv4 a couple of years ago, so unless you want to pay
>USD $10+ per IP then CGNAT is the only option.

Eh, a bit over US$7 now, but whatever. Higher in APNIC.

>2. IPv6 is nice (dual stack) but the internet without IPv4 is not a viable
>thing, perhaps one day, but certainly not today (I really hate clueless
>people who shout to the hills that IPv6 is the "solution" for today's
>internet access)

It's viable, it's just not a substitute for IPv4 yet.
Except for specific scenarios. For instance, you mention gaming below; if
two users are playing on Xbox ONE, they can use IPv6 and they're off the
CGN. Or if a bank has blacklisted an IPv4 address on the CGN, but the
bank is dual-stack, some users can still get there.
Of course, that snowballs.

>3. 99.99% of customers don't notice they are transiting CGNAT, it just
>works.

Surprised it's that high.

>4. You need to log NAT translations for LI purposes. (IP
>source/destination,
>Port source/destination, time) Surprisingly this does not produce that
>big a
>database burden. However as Cisco's Netflow NAT logging is utterly useless
>you need to use syslog and this ramps up the ASR CPU a bit.

Can you quantify?
The log entry has to be at least:
32 bits source address
16 bits source port
32 bits destination address
16 bits destination port
64 bits? timestamp
---
160 bits = 20 bytes per flow
You have to log the end of the flow, too, right? Another 20 bytes?
40 bytes per flow. Not including syslog severity and message text.

You can get it down a bit smaller, if you're OK with having to find the
records again to update them at the end of the connection (either TCP FIN,
or UDP mapping timeout):

32 bits NAT endpoint ip
16 bits NAT endpoint port
32 bits dest ip
16 bits dest port
32 bits start timestamp
32 bits end timestamp
16 bits customer ID (you could store the customer's internal IP, but that's
bigger)

That's 22 bytes per flow (maybe 24 if you're planning on having more than
64ki customers in your CGNAT's lifetime).
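
The compact layout above maps directly onto a fixed-size binary record. A minimal Python sketch, assuming network byte order and no padding; the function names and the sample addresses (taken from the documentation ranges) are illustrative only and do not come from any real CGNAT logger:

```python
import ipaddress
import struct

# Field layout from the list above, network byte order, no padding:
# NAT IP (32), NAT port (16), dest IP (32), dest port (16),
# start timestamp (32), end timestamp (32), customer ID (16) = 176 bits.
FLOW_RECORD = struct.Struct("!IHIHIIH")


def pack_flow(nat_ip, nat_port, dst_ip, dst_port, start, end, customer_id):
    """Serialise one completed flow into a 22-byte record."""
    return FLOW_RECORD.pack(
        int(ipaddress.IPv4Address(nat_ip)), nat_port,
        int(ipaddress.IPv4Address(dst_ip)), dst_port,
        start, end, customer_id,
    )


def unpack_flow(blob):
    """Inverse of pack_flow; returns addresses as dotted-quad strings."""
    nat_ip, nat_port, dst_ip, dst_port, start, end, cust = FLOW_RECORD.unpack(blob)
    return (str(ipaddress.IPv4Address(nat_ip)), nat_port,
            str(ipaddress.IPv4Address(dst_ip)), dst_port,
            start, end, cust)


if __name__ == "__main__":
    record = pack_flow("203.0.113.7", 40000, "198.51.100.20", 443,
                       1406671200, 1406671260, 1234)
    print(len(record), "bytes:", unpack_flow(record))
```

Widening the customer ID to 32 bits (format `"!IHIHIII"`) gives the 24-byte variant mentioned above.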

You could drop the timestamps by another 16 bits each if you don't mind
reducing granularity (if you guarantee you won't reuse a given IP/port pair
for, say, 30 seconds, you can define the timestamp to be, say, 15 second
increments) and/or changing the epoch -- 15 second granularity + rolling
epoch every week => 16 bit timestamps do just fine.
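
To make the 16-bit timestamp idea concrete, here is a rough Python sketch. The 15-second granularity and weekly epoch are the figures suggested above; keeping the epoch number out of band (for example once per log file) is an assumption added here, and the function names are made up for illustration.

```python
GRANULARITY = 15                # seconds per tick
EPOCH_LENGTH = 7 * 24 * 3600    # the epoch rolls over every week

# Largest tick value that can occur inside one epoch.
MAX_TICK = (EPOCH_LENGTH - 1) // GRANULARITY
assert MAX_TICK < 2 ** 16       # so a tick always fits in 16 bits


def encode_ts(unix_time):
    """Split a Unix time into (epoch_number, tick).

    Only the tick goes into the per-flow record; the epoch number is
    assumed to be stored once elsewhere, e.g. in the log file name.
    """
    epoch_number, offset = divmod(unix_time, EPOCH_LENGTH)
    return epoch_number, offset // GRANULARITY


def decode_ts(epoch_number, tick):
    """Return the start of the 15-second window the tick refers to."""
    return epoch_number * EPOCH_LENGTH + tick * GRANULARITY


if __name__ == "__main__":
    epoch, tick = encode_ts(1406671210)
    print(epoch, tick, decode_ts(epoch, tick))
```

With both timestamps shrunk this way the 22-byte record drops to 18 bytes, at the cost of only knowing each time to within 15 seconds.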

As I recall, a site like cnn.com opens 80 flows, so 3200 bytes of log data.
If, as you say in #6, 10,000 customers = 200,000 active translations,
that's 8,000,000 bytes of syslog. . . per second? Not sure if "active"
indicates how fast those sessions churn.
180 days of log retention would be. . . 124TB of data. Per 10,000 users.
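
The arithmetic above is easy to redo with different churn assumptions. A small Python sketch: the 40 bytes per flow, 200,000 translations and 180 days are the figures from this thread, while the 60-second mean flow lifetime in the second case is purely a made-up illustration of how much the answer depends on churn.

```python
BYTES_PER_FLOW = 40        # 20-byte start record + 20-byte end record
SECONDS_PER_DAY = 86_400


def retention_bytes(flows_per_second, days, bytes_per_flow=BYTES_PER_FLOW):
    """Raw log volume for a given new-flow rate and retention period."""
    return flows_per_second * bytes_per_flow * SECONDS_PER_DAY * days


def flows_per_second(active_translations, mean_lifetime_s):
    """Steady-state new-flow rate if each translation lives mean_lifetime_s."""
    return active_translations / mean_lifetime_s


if __name__ == "__main__":
    # Worst case from the post: all 200,000 translations replaced every second.
    worst = retention_bytes(200_000, 180)
    # Hypothetical: the same 200,000 translations, each lasting 60 s on average.
    milder = retention_bytes(flows_per_second(200_000, 60), 180)
    print(f"every second: {worst / 1e12:.1f} TB")
    print(f"every minute: {milder / 1e12:.1f} TB")
```

The first case reproduces the roughly 124 TB figure; with a 60-second mean lifetime the same formula gives roughly 2 TB, so the churn rate matters far more than the record size.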

Of course, getting anything back *out* of that again in any sort of
reasonable timeframe would be... optimistic. I suppose if you're storing it
all in hadoop you can map/reduce your way out of trouble, but that's going
to mean a lot of equipment sitting around doing nothing for 99.99% of the
time. Perhaps mine litecoin between searches?

>7. CGNAT protects your customers from all sorts of nasty's like small DDOS
>attacks and attacks on their crappy CPE
>8. DDOS on CGNAT pool IP's are a pain in the rear and happen often.

Between #7 and #8, do they balance out?

I'd doubt it. A customer getting DDoS'd counts against their usage limit;
you can't bill traffic pointed at a CGNAT address against any particular
customer. <grin>

- Matt

_Matt_Palmer · July 29, 2014, 10:53pm

> 2. IPv6 is nice (dual stack) but the internet without IPv4 is not a viable
> thing, perhaps one day, but certainly not today (I really hate clueless
> people who shout to the hills that IPv6 is the "solution" for today's
> internet access)

Do you have IPv6 deployed and available to your entire customer base, so
that those who want to use it can do so? To my way of thinking, CGNAT is
probably going to be the number one driver of IPv6 adoption amongst the
broad customer base, *as long as their ISP provides it*.

> 3. 99.99% of customers don't notice they are transiting CGNAT, it just
> works.

More precisely: you don't hear from 99.99% of customers, regardless of
whether or not they notice problems that are caused by CGNAT. People put up
with some *really* bad stuff sometimes without mentioning it to their
service provider.

> 5. NAT translation timeouts are important, XBOX and PlayStation suck.

Do they suck, or do they just not misbehave in a way that plays nicely
with your CGNAT?

> 10. It is not uncommon for people who run some game servers and websites
> (like banks) to be completely clueless/confused about cgnat and randomly
> block IP's as large numbers of users connect from single IP. This is not a
> big issue in practice.

Is this cluelessness, or just reacting to a usage pattern which
overwhelmingly screams "abuse" that your CGNAT happens to emulate? From my
experience, I've blocked a lot more abusive sources than NATs by blocking
IPs that originate a lot of connections with varying UAs, for example. If
you walk like a duck and quack like a duck, it isn't only clueless people
who will call you a duck.

- Matt

Robert_Drake · July 29, 2014, 10:58pm

The timestamp is a natural index. You shouldn't need to run a distributed query for finding information about a specific incident. You would have to write your own custom tools to access and manage the db, so that's just impractical. The timestamp as well as most of the other fields should be fairly easily compressible since most of the bits are the same. You might as well use a regular plaintext logfile and gzip it.
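
A minimal sketch of that approach in Python, assuming one gzipped plaintext file per hour so that the file name itself serves as the timestamp index. The file naming, the space-separated line format and the function names are all invented for illustration; a real deployment would more likely have syslog write the files and rotate them.

```python
import gzip
import os
import tempfile
from datetime import datetime, timezone


def log_path(log_dir, unix_time):
    """One gzip file per hour (UTC); the name acts as the timestamp index."""
    hour = datetime.fromtimestamp(unix_time, tz=timezone.utc).strftime("%Y%m%d%H")
    return os.path.join(log_dir, f"nat-{hour}.log.gz")


def append_record(log_dir, unix_time, inside_ip, nat_ip, nat_port, dst_ip, dst_port):
    """Append one translation as a plaintext line (hypothetical format)."""
    line = f"{unix_time} {inside_ip} {nat_ip} {nat_port} {dst_ip} {dst_port}\n"
    # "at" appends a new gzip member; readers see one continuous stream.
    # Reopening per record is slow and only done here to keep the sketch short.
    with gzip.open(log_path(log_dir, unix_time), "at") as fh:
        fh.write(line)


def find_inside_ip(log_dir, unix_time, nat_ip, nat_port, window=30):
    """Inside IPs that used nat_ip:nat_port within +/- window seconds.

    Assumes window is under an hour, so at most the neighbouring hourly
    files need to be opened; nothing else on disk is touched.
    """
    paths = {log_path(log_dir, t)
             for t in (unix_time - window, unix_time, unix_time + window)}
    matches = []
    for path in sorted(paths):
        if not os.path.exists(path):
            continue
        with gzip.open(path, "rt") as fh:
            for line in fh:
                ts, inside, ip, port, _dst, _dport = line.split()
                if (ip == nat_ip and int(port) == nat_port
                        and abs(int(ts) - unix_time) <= window):
                    matches.append(inside)
    return matches


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        append_record(d, 1406671200, "100.64.1.10", "203.0.113.7", 40000,
                      "198.51.100.20", 443)
        print(find_inside_ip(d, 1406671210, "203.0.113.7", 40000))
```

A lookup for one incident then costs a sequential scan of one or two hourly files, with no distributed query involved.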

Mark_Andrews2 · July 29, 2014, 11:13pm

> 2. IPv6 is nice (dual stack) but the internet without IPv4 is not a viable
> thing, perhaps one day, but certainly not today (I really hate clueless
> people who shout to the hills that IPv6 is the "solution" for today's
> internet access)

Do you have IPv6 deployed and available to your entire customer base, so
that those who want to use it can do so? To my way of thinking, CGNAT is
probably going to be the number one driver of IPv6 adoption amongst the
broad customer base, *as long as their ISP provides it*.

Add to that over half your traffic will switch to IPv6 as long as
the customer has an IPv6-capable CPE. That's a lot less logging you
need to do from day 1.

> 3. 99.99% of customers don't notice they are transiting CGNAT, it just
> works.

More precisely: you don't hear from 99.99% of customers, regardless of
whether or not they notice problems that are caused by CGNAT. People put up
with some *really* bad stuff sometimes without mentioning it to their
service provider.

Like modems that introduce 2 second queuing delays the moment you
have an upstream transfer like an iCloud backup. Buffer @!#$!@#$!
bloat!

Tony_Wicks · July 29, 2014, 11:23pm

3. 99.99% of customers don't notice they are transiting CGNAT, it just
works.

Surprised it's that high.

So was I to be honest, but in general "It Just Works".

4. You need to log NAT translations for LI purposes. (IP
source/destination, Port source/destination, time) Surprisingly this
does not produce that big a database burden. However as Cisco's Netflow
NAT logging is utterly useless you need to use syslog and this ramps up
the ASR CPU a bit.

Can you quantify?
The log entry has to be at least:
32 bits source address
16 bits source port
32 bits destination address
16 bits destination port
64 bits? timestamp

The issue with the Cisco NAT Translation flow is that as soon as you set the
nat mode to CGN it no longer sends the Pre Nat IP (100.64.x.x), which makes
it useless for matching against radius to identify the user. Several weeks
of arguing with TAC engineers got nowhere. TAC said, no that can't be done,
but could not explain why it worked fine with syslog translation logging.
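
To show why the pre-NAT address matters, here is a rough Python sketch of the lookup it enables: given the inside (100.64.x.x) address and time from a translation log entry, find the RADIUS accounting session that held that address. The `RadiusSession` fields and the sample data are hypothetical and chosen for illustration; they are not the schema of any particular RADIUS server.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class RadiusSession:
    """Hypothetical summary of one RADIUS accounting session."""
    username: str
    framed_ip: str           # the 100.64.x.x address given to the subscriber
    start: int               # Acct-Start time, Unix seconds
    stop: Optional[int]      # Acct-Stop time, or None if still online


def subscriber_for(inside_ip: str, when: int,
                   sessions: List[RadiusSession]) -> Optional[str]:
    """Return the username that held inside_ip at time `when`, if any."""
    for s in sessions:
        still_open = s.stop is None or when <= s.stop
        if s.framed_ip == inside_ip and s.start <= when and still_open:
            return s.username
    return None


if __name__ == "__main__":
    sessions = [
        RadiusSession("alice", "100.64.1.10", 1406660000, 1406670000),
        RadiusSession("bob", "100.64.1.10", 1406670500, None),
    ]
    # The same inside address maps to different users at different times,
    # which is why the log entry needs both the pre-NAT IP and a timestamp.
    print(subscriber_for("100.64.1.10", 1406665000, sessions))
    print(subscriber_for("100.64.1.10", 1406671200, sessions))
```

Without the pre-NAT address in the log record there is nothing to join on, which is the gap described above.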

Mark_Andrews2 · July 29, 2014, 11:53pm

Actually they are becoming much more common and the additional cost
is not that much, basically the cost of the better WiFi radios. If
you make IPv6 available and recommend that people buy an IPv6-capable
router next time they upgrade they will switch over. You won't
find IPv6 in 802.11[bg] only routers but it is in the ones with
newer WiFi radios.

e.g. NETGEAR WNDR3800 N600 is AUD$80 [mwave.com.au] + shipping and
supports IPv6.

The price point has come down dramatically from several years ago.

Mark

Owen_DeLong · July 30, 2014, 4:52am

Sure, but I didn’t ask the question of the general public… I asked it of the people on this list.

I suspect most of the membership of this list would opt out of CGN one way or another.

In my case, my provider is IPv6 capable and I’d simply move my tunnels from IPv4 to IPv6 rather than subject myself to CGN if necessary.

Owen

Owen_DeLong · July 30, 2014, 5:22am

2. IPv6 is nice (dual stack) but the internet without IPv4 is not a viable
thing, perhaps one day, but certainly not today (I really hate clueless
people who shout to the hills that IPv6 is the "solution" for today's
internet access)

Do you have IPv6 deployed and available to your entire customer base, so
that those who want to use it can do so? To my way of thinking, CGNAT is
probably going to be the number one driver of IPv6 adoption amongst the
broad customer base, *as long as their ISP provides it*.

Add to that over half your traffic will switch to IPv6 as long as
the customer has a IPv6 capable CPE. That's a lot less logging you
need to do from day 1.

That would be nice, but I’m not 100% convinced that it is true.

Though it will be an increasing percentage over time.

Definitely a good way of reducing the load on your CGN, with the additional benefit
that your network is part of the solution rather than part of the problem.

3. 99.99% of customers don't notice they are transiting CGNAT, it just
works.

More precisely: you don't hear from 99.99% of customers, regardless of
whether or not they notice problems that are caused by CGNAT. People put up
with some *really* bad stuff sometimes without mentioning it to their
service provider.

Like modems that introduce 2 second queuing delays the moment you
have a upstream transfer like a icloud backup. Buffer @!#$!@#$!
bloat!

Among other things.

99.99% of customers don’t know how to isolate the fault of such a thing to their ISP or how to properly complain about it, in my experience. For the 0.01% who do, 99% of them don’t know how to get past the ISP’s first-line “let’s reboot your modem and when you call back afterwards, you won’t be my problem any more”.

Owen

Julien_Goodwin · July 30, 2014, 5:42am

Being on the content provider side I don't know the actual percentages
in practice, but in the NANOG region you've got Google/Youtube, NetFlix,
Akamai & Facebook all having a significant amount of their services v6
native.

I'd be very surprised if these four together weren't a majority of any
consumer-facing network's traffic in peak times.

Gary_Buhrmaster · July 30, 2014, 5:53am

.....

Add to that over half your traffic will switch to IPv6 as long as
the customer has a IPv6 capable CPE. That's a lot less logging you
need to do from day 1.

That would be nice, but I’m not 100% convinced that it is true.

For the 99.99% of the users who believe that facebook and twitter
*are* the internet, at least facebook is IPv6 enabled. 50.00%(*)!

Yes, I think we can all stipulate that those participating
on this list are different, and have different expectations,
and different capabilities, than those other 99.99%.

Gary

(*) If we are going to make up statistics, four significant
digits looks better than one.

Mark_Andrews2 · July 30, 2014, 6:56am

Enable IPv6 at home and measure the traffic. I did, which is why
I say > 50%.

Mark

Owen_DeLong · July 30, 2014, 3:45pm

The only actual residential data I can offer is my own. I am fully dual stack and about 40% of my traffic is IPv6. I am a netflix subscriber, but also an amazon prime member.

I will say that if amazon would get off the dime and support IPv6, it would make a significant difference.

Other than amazon and my financial institutions and Kaiser, living without IPv4 wouldn't actually pose a hardship as near as I can tell from my day without v4 experiment on June 6.

I know Kaiser is working on it. Amazon apparently recently hired Yuri Rich to work on their issues. So that would leave my financial institutions.

I think we are probably less than 5 years from residential IPv4 becoming a service that carries a surcharge, if available.

Owen

Corey_Touchet · July 30, 2014, 4:09pm

There's still a lot of websites that are not with the times.

No ipv6 on CNN, FOX, or NBC news websites.

Slashdot.org shame on you!

Comcast and AT&T work, but not Verizon. No surprise there. Power company
nope.

I think CGN is fine for 99% of customers out there. Until the iPhone came
out Verizon Wireless had natted all their BlackBerry customers and saved
millions of IPs. Then Apple and Google blew a hole into that plan.

Then again I'm for IPv4 just running out and finally pushing people to
adopt. The US Govt has done a better job of moving to IPv6 than private
industry which frankly is amazing all things considered.

Comcast is pushing over 1 Tbps of IPv6 traffic, but I'm sure that's mainly
video from Youtube and Netflix.

Chris_Adams2 · July 30, 2014, 4:16pm

Once upon a time, Corey Touchet <corey.touchet@corp.totalserversolutions.com> said:

Comcast is pushing over 1TBPS of IPv6 traffic, but I'm sure that's mainly
video from Youtube and Netflix.

One thing to remember about the video services that do support IPv6 is
that a lot of end users, even if they have IPv6 in the home, won't see
them over IPv6. Many people watch Netflix and such from TV-connected
devices like DVD/Blu-Ray players, "smart" TVs, Xboxes, TiVos, etc. Many
(most?) of these devices don't support IPv6, and many never will
(because they don't get firmware updates much after release).

TJ1 · July 30, 2014, 4:35pm

And Yurie recently posted an opening for an IPv6 Engineer at same ... for
any so inclined.

/TJ

Doug_Barton2 · July 30, 2014, 4:43pm

In the game console market, from what I could see from some quick searches, Xbox and Wii do v6, but PS4 does not. And as time goes on more things will do v6, not less.

The time for using "$FOO does not support IPv6, so I don't have to enable it" as an excuse is way past over.

Doug

Baker_Fred · July 30, 2014, 4:44pm

Per Microsoft public statements, they are now moving address space allocated to them in Brazil to the US to fill a major service shortfall in Azure. They’re not the only kids on the block with that problem, but are perhaps the one most publicly reported. To my way of thinking, having services like that adopt IPv6 and tell their customers that they need to access the service using IPv6 would go a lot farther than residential service in pushing enterprise adoption.

http://tools.ietf.org/html/draft-anderson-siit-dc gives a fairly clever way to make it possible for the service itself to be IPv6-only and yet provide IPv4 access, and preserve IPv4 addresses in the process.