CGNAT Solutions

Afternoon,

I run a small ISP in Tennessee. COVID has forced a lot of people to work from home. I am starting to run low on IPs and need to consider CGNAT.

I do have IPv6 space, but we all know that until we force everyone to move to IPv6, we need to keep IPv4 up and running.

I could buy more space, but I am really wondering if that is the best option. It is expensive. I know CGNAT devices are expensive as well, but it looks like CGNAT would let me stretch my existing space out a bit.

My thinking is to convert about 50% of my subscribers to CGNAT.

I am interested in vendors or devices you have used in the past. I already know about the pitfalls many of my subscribers will have with CGNAT, such as VPNs, gaming, etc.

What are your thoughts on CGNAT vendors?

A10 Networks
F5 Networks
Others?

Just go with Linux and iptables. It is by far the cheapest option and it just works.

On Tue, 28 Apr 2020 at 21:13, John Alcock <john@alcock.org> wrote:

Hi John, I run a small/medium ISP in Texas. A few years ago, needing to do the same thing you are speaking of, I lab evaluated the Cisco ASR9k VSM-500 and Juniper MX104 MS-MIC-16G… in the end I went with Juniper. No regrets, been good and holding strong. I’ve scaled it way beyond what I originally envisioned. (but bought more as well)

I slow started my CGNat deployment, like with most things, baby-steps when doing something as extreme as taking away the public ip address from my isp residential customers… so yeah, slow-start…

DSL was my first target. One DSLAM at a time, waiting for issues to arise and dealing with them along the way, the best I could. …until we had 6,000 dsl customers behind a pair of Juniper MX104’s with MS-MIC-16G cards, running fine. (all done via mpls l3vpn for virtual L3 routing into and out of the nat boundary… so one vrf for inside, and one vrf for outside)…peak load as I recall was about 3 gbps on each MX104, so 6 gbps total.

Next, about a year or so later, we went after Cable Modem CMTS communities. But, added MS-MPC-128G modules to a pair of our mpls 100 gig ring MX960 nodes. This was another 5,000 subs or so. (this was about 2 or 3 years ago). Learned a lot during that one. A lot about ecmp, inet.3 mp-ibgp route choices, (set protocols ldp track-igp-metric… is your friend), app, eim, eif, ams/mams interfaces and load-balancing on the source-ip…. Let that ride for a year or so…then…

…went after our FTTH communities. Probably about 30 or 40 thousand ip’s were recoup’d here. FTTH was nat’d behind (4) additional MS-MPC-128G modules in (4) other 100 gig mpls ring mx960 nodes.

There have been recent concerns about UPnP not working behind the CGNATs.

All in all, we are getting lots of use out of our Juniper CGNat solution. All told, it’s about 50,000 customers behind the (2) MX104’s and (6) MX960’s getting nat’d.

-Aaron

Hi John,

How small is small? Up to a certain size regular NAT with enough
logging to trace back abusers will tend to work fine. If we're talking
single-digit gbps, it may not be worth the effort to consider the
wonderful world of CGNAT.

Regards,
Bill Herrin

I will say it is much better to consider 464XLAT with NAT64, if the CPEs allow it.

https://datatracker.ietf.org/doc/rfc8683/

I’m right now doing a deployment for 25,000,000 customers of an ISP (a mix of GPON, DSL and cellular); all the testing has been done, and everything is working fine.

I’ve done it already for smaller ISPs, but the size of this project is more interesting to better demonstrate that it just works.

I plan to do a presentation when the information can be made public … a bit delayed because of the Covid-19 confinement.

Regards,

Jordi

@jordipalet

Take a look at DANOS for CG-NAT as a free solution, or at Netgate’s TNSR, which has a CG-NAT feature: https://www.tnsr.com/features

Depending on how many IPs you need to reclaim and what your target IP:subscriber ratio is, you may be able to eliminate the need for a lot of logging by assigning a range of TCP/UDP ports to a single inside IP so that the TCP/UDP port number implies a specific subscriber.

You can't get rid of all the state tracking without also having the CPE know which ports to use (in which case you might as well use LW4o6 or MAP), but at least you can get it down to where you really only need to log (or block and dole out public IPs as needed) port-less protocols.
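As a rough illustration of the "port number implies a specific subscriber" idea, here is a minimal Python sketch of a deterministic mapping. Every name and number in it (the prefixes, 2048 ports per subscriber, skipping ports below 1024, the function names) is an assumption made for the example, not something a particular vendor does. The NAT still has to track per-session state; the point is only that an abuse lookup becomes arithmetic instead of a log search.

```python
import ipaddress

# Illustrative assumptions only; none of these values come from the thread.
INSIDE_NET = ipaddress.ip_network("100.64.0.0/22")    # shared address space (RFC 6598)
OUTSIDE_NET = ipaddress.ip_network("203.0.113.0/26")  # documentation prefix standing in for a public pool
PORTS_PER_SUB = 2048
FIRST_PORT = 1024                                      # leave the well-known ports unused
SUBS_PER_IP = (65536 - FIRST_PORT) // PORTS_PER_SUB    # 31 subscribers per public IP


def outside_for(inside_ip: str) -> tuple[str, int, int]:
    """Return (public IP, first port, last port) for a subscriber's inside IP."""
    index = int(ipaddress.ip_address(inside_ip)) - int(INSIDE_NET.network_address)
    if not 0 <= index < INSIDE_NET.num_addresses:
        raise ValueError("address is not in the inside range")
    ip_index, slot = divmod(index, SUBS_PER_IP)
    if ip_index >= OUTSIDE_NET.num_addresses:
        raise ValueError("public pool is too small for this subscriber")
    first = FIRST_PORT + slot * PORTS_PER_SUB
    return str(OUTSIDE_NET.network_address + ip_index), first, first + PORTS_PER_SUB - 1


def inside_for(public_ip: str, port: int) -> str:
    """Reverse lookup: which inside IP owns this public IP and source port?"""
    ip_index = int(ipaddress.ip_address(public_ip)) - int(OUTSIDE_NET.network_address)
    if not 0 <= ip_index < OUTSIDE_NET.num_addresses:
        raise ValueError("address is not in the public pool")
    if port < FIRST_PORT:
        raise ValueError("port is below the assigned ranges")
    slot = (port - FIRST_PORT) // PORTS_PER_SUB
    if slot >= SUBS_PER_IP:
        raise ValueError("port is above the assigned ranges")
    index = ip_index * SUBS_PER_IP + slot
    if index >= INSIDE_NET.num_addresses:
        raise ValueError("no subscriber is mapped to this range")
    return str(INSIDE_NET.network_address + index)
```

With these assumed values, `outside_for("100.64.0.31")` lands on the second public address with ports 1024-3071, and `inside_for("203.0.113.1", 2000)` maps straight back to it with no log involved.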

Brandon Martin wrote:

You can't get rid of all the state tracking without also having the CPE know which ports to use

If you mean getting rid of logging, not necessarily. It is enough if
CPEs are statically allocated ranges of external port numbers.

            Masataka Ohta

Yes, you can get rid of the logging by statically allocating ranges of port numbers to a particular customer.

What I was referring to, though, was the programmatic state tracking of the {external IP, external port}-{internal IP, internal port} mappings. You can't eliminate that unless the CPE also knows what internal port range it's mapped to so that it restricts what range it uses. If you can do that, you can get rid of the programmatic state tracking entirely and just use static translations for TCP and UDP which, while nice, is impractical. You're about 95% of the way to LW4o6 or MAP at that point.

Brandon Martin wrote:

If you mean getting rid of logging, not necessarily. It is enough if
CPEs are statically allocated ranges of external port numbers.

Yes, you can get rid of the logging by statically allocating ranges of port numbers to a particular customer.

And, that was the original concern.

What I was referring to, though, was the programmatic state tracking of the {external IP, external port}-{internal IP, internal port} mappings.

OK.

You can't eliminate that unless the CPE also knows what internal port range it's mapped to so that it restricts what range it uses. If you can do that, you can get rid of the programmatic state tracking entirely and just use static translations for TCP and UDP which, while nice, is impractical. You're about 95% of the way to LW4o6 or MAP at that point.

Interesting. Then, if you can LW4o6 or MAP, you are about 95% of the
way to E2ENAT with complete end to end transparency using IPv4 only,
which means we don't need IPv6 with 4to6 NAT lacking the transparency.

  draft-ohta-e2e-nat-00

            Masataka Ohta

I'm wondering if there are any real-world examples of this, namely in
the realm of subscribers per IP and the range of ports required, etc.
I.e., is a range of 1000 ports enough for one residential subscriber?
How about SMB, where no global IP is required?

One would think 1000 ports would be enough, but if you have a dozen
devices at home all browsing and doing various things, and with IoT,
etc., maybe not?

Since we are talking numbers and hard facts:

42% of users in the USA access Google over IPv6

https://www.google.com/intl/en/ipv6/statistics.html

hey,

I'm wondering if there are any real-world examples of this, namely in
the realm of subscribers per IP and the range of ports required, etc.
I.e., is a range of 1000 ports enough for one residential subscriber?
How about SMB, where no global IP is required?

One would think 1000 ports would be enough, but if you have a dozen
devices at home all browsing and doing various things, and with IoT,
etc., maybe not?

1000 ports doesn't mean you can have at most 1000 layer-4 sessions at once. It means you can have 1000 sessions to a single destination IP+port. You can reuse the same source port numbers for a different destination IP or even a different destination port.
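To make the port-reuse point concrete, here is a toy Python sketch of a NAT for one subscriber with a fixed set of external ports. The class and method names are made up for the example, and it assumes the NAT keys sessions on the full tuple (protocol, external port, destination IP, destination port). Whether a real CGNAT reuses ports this aggressively depends on its mapping behaviour; for example, endpoint-independent mapping ties each inside IP:port to one external port.

```python
from collections import defaultdict


class PortLimitedNat:
    """Toy NAT for a single subscriber with a fixed set of external ports (hypothetical)."""

    def __init__(self, ports):
        self.ports = list(ports)
        # For each destination (proto, ip, port): the external ports already used towards it.
        self.in_use = defaultdict(set)

    def open_session(self, proto, dst_ip, dst_port):
        """Return an external source port for a new session, or None if none is free."""
        used = self.in_use[(proto, dst_ip, dst_port)]
        for port in self.ports:
            if port not in used:
                used.add(port)
                return port
        return None  # every port is already in use towards this exact destination


nat = PortLimitedNat(range(1024, 1028))  # only 4 external ports

# Four sessions to the same destination use up all four ports for that destination.
first_four = [nat.open_session("tcp", "198.51.100.1", 443) for _ in range(4)]
fifth = nat.open_session("tcp", "198.51.100.1", 443)  # exhausted for this destination
other = nat.open_session("tcp", "198.51.100.2", 443)  # a different destination reuses port 1024
```

Here the subscriber ends up with five concurrent sessions on four ports; the limit only bites per destination IP+port.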

We are seeing very good results with 256 ports per subscriber in the mobile scenario, where the consumer is a mobile handset. So it is not directly translatable to a broadband setup, but it is still a good data point.

If you must go CGNAT today, it's only reasonable to use PBA (port block allocation, so you log only block allocations) or pure deterministic NAT, where you have a strict mapping between inside IP and outside IP+port range so you don't need any logs at all.

How big is your ip pool for CGNAT?

https://www.juniper.net/documentation/en_US/junos/topics/concept/nat-best-practices.html

There are some numbers in there for instance talking about 1024 ports per subscriber as a good number. In presentations I have seen over time, people typically talk about 512-4096 as being a good number for the bulk port allocation size.

I haven’t used them, but 6WIND is pretty proud of their CGNAT performance.

So with a happy medium of about 2048 ports per subscriber, that's roughly
a 32:1 subscriber-to-public-IP over-subscription?
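The 32:1 figure is just 65536 / 2048. A small Python sketch of that arithmetic follows; the function names are made up, and the 6,000-subscriber figure is only an illustrative input. Reserving the low ports (an assumption here, not something stated in the thread) nudges the ratio down to 31:1.

```python
def subscribers_per_ip(ports_per_subscriber, usable_ports=65536):
    """How many fixed-size port ranges fit on one public IPv4 address."""
    return usable_ports // ports_per_subscriber


def public_ips_needed(subscribers, ports_per_subscriber, usable_ports=65536):
    """Public addresses needed to give every subscriber a fixed port range."""
    per_ip = subscribers_per_ip(ports_per_subscriber, usable_ports)
    return -(-subscribers // per_ip)  # ceiling division


ratio_all_ports = subscribers_per_ip(2048)                        # 32
ratio_skip_low = subscribers_per_ip(2048, usable_ports=65536 - 1024)  # 31
ips_for_6000 = public_ips_needed(6000, 2048)                      # 188
```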

Thank you everyone for the suggestions.

To clarify small ISP.

12K subscribers
35 Gbps of traffic at peak.

Traffic is growing by about 500 Mbps per month.

John

Yes, around that.

In testing, I observed that opening a website, for instance cnn.com, can cause >200 ports/sessions to fire off. Many are short-lived sessions, but they are port requests nonetheless.

Overall, I use about 1,500 public IPs for 50,000 private-IP customers.

I allow 3,000 ports per customer ... 30 blocks of 100 each.
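A quick back-of-the-envelope check of those figures in Python. The only number not taken from the message is the 65,536 ports per public IP, which is an upper bound (real pools usually exclude some ports). It suggests the 3,000-port cap is a ceiling rather than a reservation: if every customer hit it at once, the pool would be oversubscribed, so the design leans on dynamic block allocation and on most customers staying well under the cap.

```python
PUBLIC_IPS = 1_500
CUSTOMERS = 50_000
MAX_PORTS_PER_CUSTOMER = 3_000   # 30 blocks of 100
PORTS_PER_IP = 65_536            # assumption: upper bound, nothing reserved

customers_per_ip = CUSTOMERS / PUBLIC_IPS                    # about 33.3
fair_share_ports = PORTS_PER_IP * PUBLIC_IPS / CUSTOMERS     # about 1,966 ports per customer
worst_case_per_ip = CUSTOMERS * MAX_PORTS_PER_CUSTOMER / PUBLIC_IPS  # 100,000 ports
oversubscribed_at_cap = worst_case_per_ip > PORTS_PER_IP     # True

print(f"{customers_per_ip:.1f} customers per public IP")
print(f"{fair_share_ports:.0f} ports per customer if shared evenly")
print(f"oversubscribed if everyone hits the cap: {oversubscribed_at_cap}")
```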

We started our port blocks at a nice round number, so that each PBA dynamic assignment results in a nice 100-199, next 200-299 ... good for parsing and grep'ing logs when doing subpoena info look-ups, etc.

I see most customers hover well below 1,000 active ports/sessions, and what appear to be misbehaving hosts (malware, infected, bots, etc., unsure) hit the 3,000 max and trigger a ports-exceeded error message. I see the 3k port limit as putting a cap on free-running suspicious hosts. We can then investigate and contact the customer about the concern.
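For anyone who wants to see the mechanics, here is a minimal Python sketch of port block allocation with aligned 100-port blocks and a 30-block cap per customer. It is not how the Juniper service cards implement it; the class name, the pool boundaries, the log line format and the exception are all assumptions made for the example.

```python
import logging

log = logging.getLogger("pba")


class PortsExceeded(Exception):
    """Raised when a subscriber already holds the maximum number of blocks."""


class PortBlockAllocator:
    """Toy port block allocator for one public IP (hypothetical, for illustration)."""

    def __init__(self, public_ip, pool_start=1100, pool_end=65499,
                 block_size=100, max_blocks=30):
        if pool_start % block_size:
            raise ValueError("pool_start must be a multiple of block_size")
        self.public_ip = public_ip
        self.block_size = block_size
        self.max_blocks = max_blocks
        # Free blocks, identified by their first port; every block is aligned.
        self.free = list(range(pool_start, pool_end + 1, block_size))
        self.held = {}  # inside IP -> list of block start ports

    def allocate_block(self, inside_ip):
        """Give the subscriber one more block; only this event needs logging."""
        blocks = self.held.setdefault(inside_ip, [])
        if len(blocks) >= self.max_blocks:
            raise PortsExceeded(inside_ip)
        if not self.free:
            raise RuntimeError("no free port blocks on this public IP")
        start = self.free.pop(0)
        blocks.append(start)
        end = start + self.block_size - 1
        log.info("PBA_ALLOC inside=%s outside=%s ports=%d-%d",
                 inside_ip, self.public_ip, start, end)
        return start, end

    def owner_of(self, port):
        """Who currently holds the block containing this external port?"""
        start = port - port % self.block_size
        for inside_ip, blocks in self.held.items():
            if start in blocks:
                return inside_ip
        return None
```

Note that `owner_of` only answers for the current state; a subpoena-style look-up for a past timestamp would search the logged allocation events instead, which is where the aligned block boundaries make grep'ing easier.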

-Aaron