Cheap LSN/CGN/NAT444 Solution

Hi all,

I am sure this is something that a reasonable number of people would have
done on this list.

I am after a LSN/CGN/NAT444 solution to put about 1000 Residential profile
NBN speeds (fastest 100/40) services behind.

I am looking at a Cisco ASR1001/2, pfSense and am willing to consider other
options, including open source.... Obviously the cheaper the better.

This solution is for v4 only, and needs to consider the profile of the
typical residential users. Any pitfalls would be helpful to know - as in
what will and, more importantly, won't work - or any work-arounds which
may work.

This solution is not designed to be long lasting (maybe 6-9 months)... it
is to get the solution going for up to 1000 users, and once it reaches that
point then funds will be freed up to roll out a more robust, carrier-grade
and long term solution (which will include v6). So no criticism on not
doing v6 straight up please.

Happy for feedback off-list of any solutions that people have found work
well...

Note, I am in Australia so any vendors which aren't easily accessible down
here, won't be useful.

...Skeeve

*Skeeve Stevens - *eintellego Networks Pty Ltd
skeeve@eintellegonetworks.com ; www.eintellegonetworks.com

Phone: 1300 239 038; Cell +61 (0)414 753 383 ; skype://skeeve

facebook.com/eintellegonetworks ; <http://twitter.com/networkceoau>
linkedin.com/in/skeeve

experts360: https://expert360.com/profile/d54a9

twitter.com/theispguy ; blog: www.theispguy.com

The Experts Who The Experts Call
Juniper - Cisco - Cloud - Consulting - IPv4 Brokering

> Hi all,
>
> I am sure this is something that a reasonable number of people would have
> done on this list.
>
> I am after a LSN/CGN/NAT444 solution to put about 1000 Residential profile
> NBN speeds (fastest 100/40) services behind.
>
> I am looking at a Cisco ASR1001/2, pfSense and am willing to consider other
> options, including open source.... Obviously the cheaper the better.

Total PPS or bandwidth is the number you need rather than number of customers. Assuming 1Gbps aggregation then almost anything will work for your requirements and support NAT. Obviously if you have a large number of 100Mbps customers then 1Gbps wouldn't cut it for aggregation.
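As a rough way to turn a subscriber count into the bandwidth and PPS figures mentioned above, here is a minimal sketch. The contention ratio (50:1) and the average packet size (800 bytes) are placeholder assumptions of mine, not numbers from this thread; substitute values measured on your own network.

```python
def estimate_load(subscribers, down_mbps, contention_ratio, avg_packet_bytes):
    """Return (aggregate_mbps, packets_per_second) for the downstream direction.

    contention_ratio and avg_packet_bytes are assumptions supplied by the
    caller; they are not derived from anything in this discussion.
    """
    # Aggregate demand if only 1 in `contention_ratio` subscribers runs at full speed.
    aggregate_mbps = subscribers * down_mbps / contention_ratio
    # Convert megabits per second to packets per second.
    pps = aggregate_mbps * 1_000_000 / (avg_packet_bytes * 8)
    return aggregate_mbps, pps


# Hypothetical inputs: 1000 services at 100 Mbps down, 50:1 contention, 800-byte packets.
agg, pps = estimate_load(subscribers=1000, down_mbps=100,
                         contention_ratio=50, avg_packet_bytes=800)
print(f"{agg:.0f} Mbps, {pps:,.0f} pps")
```

Smaller average packets push PPS up sharply for the same bandwidth, which is why PPS is usually the tighter constraint on a NAT box.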

Based on your looking at the ASR I would guess you're somewhere around 1Gbps, maybe 2Gbps. If you're closer to 1Gbps and want to stay with a 1RU solution then I would advise checking out the ASA5512 which is much cheaper than an ASR.

If you want to go ultra cheap but scalable to 4Gbps you could use a Cisco 6500/sup2/FWSM (all used.. probably totals less than $1000USD, but I don't know how much it is in Australia). That would let you replace parts later to move to SUP720/ASASM for around 16Gbps throughput.

FWIW, I doubt you'll find a NAT platform with no IPv6 support, so you can start your IPv6 work now if need be. Older stuff like the FWSM won't support things like DS-Lite though, so if you plan to go v6-only in your backbone then that's something to think about.

> This solution is for v4 only, and needs to consider the profile of the
> typical residential users. Any pitfalls would be helpful to know - as in
> what will and, more importantly, won't work - or any work-arounds which
> may work.
>
> This solution is not designed to be long lasting (maybe 6-9 months)... it
> is to get the solution going for up to 1000 users, and once it reaches that
> point then funds will be freed up to roll out a more robust, carrier-grade
> and long term solution (which will include v6). So no criticism on not
> doing v6 straight up please.

Be wary if someone thinks this is going to last 6-9 months. That's less than a funding cycle for a company and longer than an outage. That means the boss is pulling the number out of his ass and it could last anywhere from 30 days to 10 years depending on any number of factors.

Also, be sure you have S/RTBH or some other mechanism southbound of the NAT for dealing with compromised/abusive hosts which can chew up the state-table with SYN-floods and the like.

From experience (we ran out of IPv4 a long time ago in the APNIC region),
this is not needed; what is needed, however, is session timeouts. Xbox and
PlayStation are the most sensitive to session timeouts.

> From experience (we ran out of IPv4 a long time ago in the APNIC region) this is not needed,

I've seen huge problems from compromised machines completely killing NATs from the southbound side.

> what is needed however is session timeouts.

This can help, but it isn't a solution to the botted/abusive machine problem. They'll just keep right on pumping out packets and establishing new sessions, 'crowding out' legitimate users and filling up the state-table, maxing the CPU. Embryonic connection limits and all that stuff aren't enough, either.

> I've seen huge problems from compromised machines completely killing
> NATs from the southbound side.

It depends on CGN solution used. Some of them will just block new
translations for that user after reaching the limit, and that's it.
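To make the "block new translations after reaching the limit" behaviour concrete, here is a toy sketch of a per-subscriber cap. The class and method names are made up for illustration and do not correspond to any vendor's implementation; a real CGN would also track timeouts and port allocation.

```python
from collections import defaultdict


class TranslationTable:
    """Toy NAT state table with a per-subscriber translation cap (illustrative only)."""

    def __init__(self, max_per_subscriber):
        self.max_per_subscriber = max_per_subscriber
        self.sessions = defaultdict(set)  # subscriber IP -> set of active flows

    def open(self, subscriber_ip, flow):
        """Try to create a translation; return False if the subscriber is at its cap."""
        flows = self.sessions[subscriber_ip]
        if flow in flows:
            return True  # existing translation, nothing new to allocate
        if len(flows) >= self.max_per_subscriber:
            return False  # only this subscriber is refused; others are unaffected
        flows.add(flow)
        return True

    def close(self, subscriber_ip, flow):
        """Remove a translation, for example on session timeout."""
        if subscriber_ip in self.sessions:
            self.sessions[subscriber_ip].discard(flow)


table = TranslationTable(max_per_subscriber=3)
# A hypothetical abusive host tries to open five flows; the last two are refused.
abusive = [table.open("100.64.0.10", ("198.51.100.1", port)) for port in range(5)]
# A well-behaved neighbour can still get a translation.
neighbour = table.open("100.64.0.11", ("198.51.100.1", 443))
print(abusive, neighbour)
```

The cap protects the shared table, but the refused packets still have to be received and dropped, so it does not remove the load an abusive host puts on the box.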

> I am after a LSN/CGN/NAT444 solution to put about 1000 Residential
> profile NBN speeds (fastest 100/40) services behind.
>
> I am looking at a Cisco ASR1001/2, pfSense and am willing to consider
> other options, including open source.... Obviously the cheaper the
> better.

ASR1k NAT is known to be problematic (NAT overload specifically); I don't
know if they have fixed it yet. I recommend checking this with the vendor
first.

New Juniper MS-MIC/MS-MPC multiservices cards can be used but
feature-parity with MS-DPC isn't there yet. For example, you can have a
working CGN with most bells and whistles, but you can't use IDS. You can
(probably) use deterministic nat with max ports/sessions per user, but
sometimes it's not enough. Again, ask the vendor for
details/roadmaps/solutions.

Neither of those options is really cheap, though.

Cheaper would be something like Mikrotik but I wouldn't touch that sh*t
with a ten-foot pole. It might work but you'll pay for that with your
sanity and sleep hours.

Speaking of cheap and open-source, I know several relatively large
implementations using Linux boxes. One Linux NAT box can chew on at
least 1Gb/s of traffic, or even more with a careful selection of
hardware and even more careful tuning, and you can load-balance between
them, but it's much more effort and it isn't robust enough (which is the
reason why they all migrate to better solutions later).

BTW, I agree that you should speak in PPS and bandwidth instead of
number of users, those are much better as a metric.

> This solution is for v4 only, and needs to consider the profile of the
> typical residential users. Any pitfalls would be helpful to know -
> as in what will and, more importantly, won't work - or any
> work-arounds which may work.

Try to pair a user IP with a public IP; that way you'll work around most
websites/games/applications that expect the publicly visible user IP to be
the same for all connections.
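One way to get this pairing is a deterministic mapping, where each internal address always lands on the same public IP and port block. The sketch below is only an illustration: the function name, the 100.64.0.0/22 internal range, the 203.0.113.0/28 documentation pool and the 2000-port block size are all assumptions of mine, and real platforms have their own mapping algorithms.

```python
import ipaddress


def deterministic_mapping(internal_ip, internal_net, public_pool,
                          ports_per_user, first_port=1024):
    """Map an internal address to a fixed (public_ip, first_port, last_port) tuple."""
    internal_net = ipaddress.ip_network(internal_net)
    public_pool = ipaddress.ip_network(public_pool)

    # Position of this subscriber inside the internal range.
    index = int(ipaddress.ip_address(internal_ip)) - int(internal_net.network_address)
    if not 0 <= index < internal_net.num_addresses:
        raise ValueError("internal address is outside the internal range")

    # How many fixed-size port blocks fit on one public address.
    users_per_ip = (65536 - first_port) // ports_per_user
    ip_slot, block = divmod(index, users_per_ip)
    if ip_slot >= public_pool.num_addresses:
        raise ValueError("public pool too small for this subscriber index")

    public_ip = public_pool.network_address + ip_slot
    start = first_port + block * ports_per_user
    return str(public_ip), start, start + ports_per_user - 1


# Hypothetical example: subscriber 100.64.0.40 with 2000 ports per user.
print(deterministic_mapping("100.64.0.40", "100.64.0.0/22",
                            "203.0.113.0/28", ports_per_user=2000))
```

Because the mapping is a pure function of the internal address, every connection from a given subscriber appears from the same public IP, and abuse reports can be traced back without per-session logs.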

Start with a select few active customers and check how many connections
they use with different NAT settings. Double or triple that. Then do the
math of how many ports/IPs you need per X users; don't just guess it.
Then try to limit it and see if anything breaks.
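The math itself is simple enough to sketch. In the example below, the measured peak of 500 sessions per user is a made-up figure standing in for whatever your pilot customers actually show, and the 64512 usable ports per address (1024-65535) is a simplification that ignores the separate TCP and UDP port spaces and any port reuse across destinations.

```python
import math


def public_ips_needed(users, peak_sessions_per_user, safety_factor=3,
                      usable_ports=64512):
    """Return (ports_per_user, users_per_ip, public_ips) for a fixed port budget.

    peak_sessions_per_user should come from measurement; safety_factor is the
    "double/triple that" margin suggested above.
    """
    ports_per_user = peak_sessions_per_user * safety_factor
    users_per_ip = usable_ports // ports_per_user
    if users_per_ip == 0:
        raise ValueError("port budget per user exceeds one public address")
    return ports_per_user, users_per_ip, math.ceil(users / users_per_ip)


# Hypothetical: 1000 users, measured peak of 500 sessions each, tripled for headroom.
print(public_ips_needed(users=1000, peak_sessions_per_user=500))
# -> (1500, 43, 24)
```

Rerunning this with the limit you intend to enforce shows quickly how much address space each extra block of headroom costs.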

By working with them you can also work around some of the problems you
didn't think about before. Seriously. Fix it before you roll it out.

What anyone implementing CGN should expect is complaints from users for
any number of reasons: their IPsec or L2TP tunnel stopped working, some
application behaves strangely, and so on. Prepare your tech support for
that.

> This solution is not designed to be long lasting (maybe 6-9
> months)... it is to get the solution going for up to 1000 users, and
> once it reaches that point then funds will be freed up to roll out a
> more robust, carrier-grade and long term solution (which will include
> v6). So no criticism on not doing v6 straight up please.

Heh. Nothing lasts longer than temporary solutions. You should implement
it like you're going to live with it for years (probably true) or you'll
create a huge PITA for yourself very soon.

On 2014-06-30 06:12, Roland Dobbins wrote:

>> what is needed however is session timeouts.
>
> This can help, but it isn't a solution to the botted/abusive machine problem. They'll just keep right on pumping out packets and establishing new sessions, 'crowding out' legitimate users and filling up the state-table, maxing the CPU. Embryonic connection limits and all that stuff aren't enough, either.

Why? Because that (per-subscriber limits on ports and memory) is exactly what we recommend in RFC 6888...

Simon

<https://app.box.com/s/a3oqqlgwe15j8svojvzl>

I can't tell you how many times I've received frantic 4AM calls about NATted wireless networks going down due to this sort of thing. It's a real problem.

Also, there are horizontal behaviors which are undesirable, as well.

On 2014-06-30 09:05, Roland Dobbins wrote:

>> Why? Cause that (per-subscriber limits on ports and memory) is exactly what we recommend in RFC 6888...
>
> <https://app.box.com/s/a3oqqlgwe15j8svojvzl>
>
> I can't tell you how many times I've received frantic 4AM calls about NATted wireless networks going down due to this sort of thing. It's a real problem.

If you're saying "NAT is bad", then sure, ok, but that's beside the point.

Otherwise, then I don't know what your point is.

Oh, actually I think I get it. You're trying to sell something.

> Also, there are horizontal behaviors which are undesirable, as well.

Yeah, and let's not forget the diagonal ones either.

Simon

Pitfall 1: Make sure you have enough support desk to handle calls from
everybody who's doing something that doesn't play nice with CGN/NAT444.
And remember that unless "screw you, find another provider" is an acceptable
response to a customer, those calls are going to be major resource sinks to
resolve to the customer's satisfaction...

Pitfall 2: These sorts of short-term solutions often end up still in
use well after their sell-by date. If you're planning to deploy a
new solution in 6 months, maybe throwing resources at a short-term fix
is counterproductive and the resources should go towards making the current
solution hold together and deploying the long-term solution...

Yes, you've found me out - I'm 'selling' S/RTBH, which is built-in functionality of routers and layer-3 switches made by companies which don't employ me.

<http://tools.ietf.org/html/rfc5635>

I run ASR1k6s with ESP40/RP2, with 10-15k BNG clients on each, running full
CGNAT. Translations peak at about 250k per 10k users. The ESP40 can handle
2M translations, so there is plenty of room to run them up to 32k users
without concern (64k in an emergency). I have been running this
configuration for 2+ years in production and have never come anywhere near
a performance issue. Incoming DDoS attacks are another matter: they are a
lot more common and damaging with CGNAT, as you need to remove the
destination IP from your NAT pool for the duration.
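The headroom figures above can be checked with a few lines of arithmetic. This sketch uses only the numbers quoted (250k translations per 10k users, 2M table capacity); the assumption that translations scale linearly with user count is mine and would need confirming against real traffic.

```python
PEAK_TRANSLATIONS = 250_000   # observed peak, per the figures above
USERS_MEASURED = 10_000       # user count that peak was observed at
TABLE_CAPACITY = 2_000_000    # translations the ESP40 is stated to handle

# Peak translations per user, assuming linear scaling with user count.
per_user = PEAK_TRANSLATIONS / USERS_MEASURED


def table_utilisation(users):
    """Fraction of the translation table used at peak for a given user count."""
    return users * per_user / TABLE_CAPACITY


for users in (10_000, 32_000, 64_000):
    print(f"{users} users: {table_utilisation(users):.1%} of table")
```

On these numbers 32k users sit at well under half the table and 64k at roughly four fifths, which matches the "plenty of room" and "in an emergency" characterisation.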

If you were doing your CGNAT on an older 72xx or similar CPU-based box,
then all bets are off; I would expect the available NAT table resources to
be very easy to exhaust.

> I am after a LSN/CGN/NAT444 solution to put about 1000 Residential profile
> NBN speeds (fastest 100/40) services behind.

> This solution is for v4 only, and needs to consider the profile of the
> typical residential users. Any pitfalls would be helpful to know - as in
> what will and and more importantly wont work - or any work-arounds which
> may work.

> Pitfall 1: Make sure you have enough support desk to handle calls from
> everybody who's doing something that doesn't play nice with CGN/NAT444.
> And remember that unless "screw you, find another provider" is an acceptable
> response to a customer, those calls are going to be major resource sinks to
> resolve to the customer's satisfaction...

And this is where the entire industry worldwide is to blame. CGN,
DS-Lite, NAT64 are designed as end-of-transition products, not
start-of-transition products. They are designed around getting to a
legacy IPv4 network. CGN, DS-Lite, and NAT64 all reduce functionality
that is normally available on wired networks.

Just because there was not a fixed date, like 1/1/2000, for when it
would be too late didn't mean that there wasn't a problem coming
or that plain dual stack shouldn't have been ubiquitous before then.

As a consumer I don't want to be forced to lose functionality because
the industry as a whole was too f!@$!#!@ short sighted to do what
was best for the consumer well enough in advance so that everyone
could sort out the teething issues. Networks work because *everybody*
can speak the same protocol. I don't care which transport protocol
I use. I do care if I can't continue to do something because people
were too slow to react.

Hi Rob,

Interesting insights. I hadn't thought of an older 6500/7600... certainly
might be worth considering if I want to stay Cisco.

Yes, PPS is the key, but I thought someone might have some comments on the
metrics/pps I'd expect with that kind of user profile and speeds.

It doesn't need to not have v6, I'm just not using it at the moment.

The timeframes are my numbers based on the proof of concept for the larger
business model/design - which is modular as such.

...Skeeve

Roland, as always you remind me of the important things to remember.

...Skeeve

Roland, what methods are the easiest/cheapest way to deal with this?

...Skeeve

Hi Valdis,

Re 1.. completely understand. The environment is such that we will openly
state what does and doesn't work. It is a captive environment and the
users don't have a choice who they use. Think large university dorm (about
600) for part of the customer base.

Re 2.. The larger design is already approved and budgeted for... this is a
proof-of-concept cheap solution to see if the uptake happens as expected.
I agree with you that we should just build it the right way the first
time, but the people paying want to do it this way. And in the end, I am
just the designer; if they leave it in place, it is not really my concern,
they have my advice.

...Skeeve

With enough horsepower, iptables+Linux is adequate for this, depending on your
requirements.

I would want to put as little money as possible behind CGN in favor of moving as
much as possible towards IPv6 instead.

Owen

Great advice Stepan.

Re user support. It is a greenfield environment so we're in the position
to say 'this is how it is and what you get'.

Re usage profile. No idea what to expect from users as there is nothing to
measure. I've actually not designed a NAT444 solution for residential
profiles before so never had to worry about what they did.

...Skeeve

Greenfield or not, unless you can expect that 100% of the users have never
had internet access anywhere else before, you may be up against expectations
you are not meeting with NAT444.

Owen