Use of NPTv6 in a mobile service provider network

Amos_Rosenboim · February 2, 2025, 6:24pm

Hi,

We are implementing a CGNAT + IPv6 firewall project for a mobile service provider.

One of the project goals is to support a scale-out, all-active deployment of the stateful devices.

One of the challenges of inserting these stateful devices into the network is the requirement that all packets of the same flow be routed through the same device, while maintaining multihoming of the stateful device.
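As an illustration of the flow-affinity requirement, here is a minimal sketch of one common in-network technique, symmetric flow hashing. The thread does not say which techniques are under consideration, so the function name `pick_device` and the device list are hypothetical. The sketch also assumes the addresses are the same in both directions, so it applies to an untranslated IPv6 firewall path and not across a NAT44 boundary.

```python
import hashlib

def pick_device(src, sport, dst, dport, proto, devices):
    """Pick a stateful device so both directions of a flow map to the same one.

    Sorting the two endpoints before hashing makes the key direction-independent.
    """
    a, b = sorted([(src, sport), (dst, dport)])
    key = f"{proto}|{a[0]}|{a[1]}|{b[0]}|{b[1]}".encode()
    digest = hashlib.sha256(key).digest()
    return devices[int.from_bytes(digest[:8], "big") % len(devices)]

# Hypothetical device names and documentation-range addresses.
devices = ["fw1", "fw2", "fw3", "fw4"]
forward = pick_device("2001:db8:a::10", 40000, "2001:db8:ffff::1", 443, "tcp", devices)
reverse = pick_device("2001:db8:ffff::1", 443, "2001:db8:a::10", 40000, "tcp", devices)
```

Note that a plain modulo like this remaps most flows whenever the device list changes; a real deployment would more likely rely on whatever resilient or consistent hashing the routers offer.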

There are a few ways to achieve this in the network, but there is also an option to work around this requirement by using NPTv6 on each device or even NAPT66 on each device.

I’m trying to understand if this option is deployed anywhere.

I’m trying to get feedback on possible technical issues with this approach.

Please no “NAT is bad and should be avoided with IPv6” argument, but if you have solid technical objections I’m very interested.

Cheers,

Amos

Joshua_Miller · February 3, 2025, 3:46am

Hi Amos,

Assuming the network segments adjacent to these stateful devices use longest prefix match routing, NPTv6 is your best option. You’d assign a unique IPv6 prefix as the NPTv6 prefix to each firewall, ensuring the traffic returns to the correct firewall.

Keep in mind each stateful firewall is a single point of failure for the flows it handles. When it inevitably goes down (maintenance or failure), all those flows will have to be re-established through other firewalls. Also, depending on how the clients are configured with connection timeouts, the users could experience a noticeable amount of service disruption.
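To make the per-firewall prefix idea concrete, here is a minimal sketch of the outbound, checksum-neutral prefix rewrite described in RFC 6296, restricted to /48 prefixes. The firewall names and the prefixes are assumptions chosen from documentation and ULA ranges, not anything from an actual deployment, and the function names are hypothetical.

```python
import ipaddress

def ones_add(a, b):
    """16-bit one's complement addition with end-around carry."""
    s = a + b
    return (s & 0xFFFF) + (s >> 16)

def prefix_sum(net):
    """One's complement sum of the three 16-bit words of a /48 prefix."""
    top48 = int(net.network_address) >> 80
    total = 0
    for shift in (32, 16, 0):
        total = ones_add(total, (top48 >> shift) & 0xFFFF)
    return total

def translate_outbound(addr, internal, external):
    """Rewrite the /48 prefix and adjust the subnet word (bits 48-63)."""
    internal = ipaddress.IPv6Network(internal)
    external = ipaddress.IPv6Network(external)
    if internal.prefixlen != 48 or external.prefixlen != 48:
        raise ValueError("this sketch handles /48 prefixes only")
    address = ipaddress.IPv6Address(addr)
    if address not in internal:
        raise ValueError("address is not inside the internal prefix")
    # adjustment = internal sum minus external sum, in one's complement
    adjustment = ones_add(prefix_sum(internal), ~prefix_sum(external) & 0xFFFF)
    subnet = (int(address) >> 64) & 0xFFFF
    new_subnet = ones_add(subnet, adjustment)
    if new_subnet == 0xFFFF:  # RFC 6296 maps 0xFFFF to 0x0000
        new_subnet = 0
    low64 = int(address) & ((1 << 64) - 1)
    ext48 = int(external.network_address) >> 80
    return ipaddress.IPv6Address((ext48 << 80) | (new_subnet << 64) | low64)

# Hypothetical: one external /48 per firewall, one shared internal /48.
INTERNAL = "fd01:203:405::/48"
FIREWALL_PREFIXES = {"fw1": "2001:db8:1::/48", "fw2": "2001:db8:2::/48"}

via_fw1 = translate_outbound("fd01:203:405:1::1234", INTERNAL, FIREWALL_PREFIXES["fw1"])
via_fw2 = translate_outbound("fd01:203:405:1::1234", INTERNAL, FIREWALL_PREFIXES["fw2"])
print(via_fw1)  # 2001:db8:1:d550::1234 (matches the worked example in RFC 6296)
```

Because each firewall's external prefix is distinct, the destination address of the return packet identifies the firewall by ordinary longest prefix match, with no flow hashing needed in the surrounding network.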

It’s possible to have firewalls in a cluster sharing state, but I consider them to be a single logical device with its own failure profile. In that scenario I would be inclined to deploy multiple redundant clusters; without knowing your budget I don’t know how feasible this is. —“Shared state, shared fate.”

I wouldn’t use NAPT66 unless you need to do something really bespoke. Introducing port translation complicates end-to-end connectivity, and adds more latency and issues for applications like VoIP.

To dive a little deeper, I’d reevaluate the requirement for the firewalls to be stateful. Are there any specific threats or attack vectors you want to address with stateful flow tracking?

Best,
Josh

Amos_Rosenboim · February 3, 2025, 10:03am

Thank you Joshua for the quick and detailed response.

I agree with everything you mentioned below, and this is why we are considering it.

To your questions and comments below:

The requirement for stateful traffic flow is given by the customer.
The logic behind it is to avoid unnecessary paging procedures for idle mobile devices.
It protects both signaling resources of the network and also battery life of devices.
This was very relevant in the early 2000s, not sure if it’s relevant for today.
However it remains a customer requirement.

As for client recovery from flow interruption: judging from incidents we have had in the last few years, and from how fast connections ramp up on the alternate devices, clients seem to recover very quickly.

My main concern is that this customer has a pretty traditional mindset and never likes being the first deployment of any technology.

This is why I am looking for input on other deployments that use this technology.

Regards,

Amos

Dobbins_Roland1 · February 3, 2025, 12:41pm

The requirement for stateful traffic flow is given by the customer.

Organizations sometimes state that they’ve requirements in specialized contexts which are in fact counterproductive; in such cases, they can often benefit from education in order to make contextually optimal decisions.

The logic behind it is to avoid unnecessary paging procedures for idle mobile devices.

‘Paging procedures’?

It protects both signaling resources of the network and also battery life of devices.

There are other ways to accomplish this.

This was very relevant in the early 2000s, not sure if it’s relevant for today.

It was a huge mistake in the late 1990s and early 2000s, as the early GPRS and EDGE wireless broadband networks which were implemented in the same fashion as poorly-designed, state-ridden enterprise networks constantly experienced severe operational problems until they were remediated, one way or another.

However it remains a customer requirement.

See above.

As for client recovery from flow interruption: judging from incidents we have had in the last few years, and from how fast connections ramp up on the alternate devices, clients seem to recover very quickly.

Introducing stateful firewalls in front of a population of Internet broadband clients is a Very Bad Idea. DDoS attacks are attacks against capacity and/or state; and outbound/crossbound attacks can be just as disruptive as inbound attacks.

This precise scenario has played out many times, over the years. Networks which were suboptimally designed in this fashion were either completely re-designed in order to be scalable and resilient, removing unnecessary and harmful state; were acquired and their brittle, fragile, non-scalable state-ridden infrastructure was decommissioned; or went out of business.

The few holdouts in the present day inevitably experience the problems described above, and then proceed through the same evolution as other network operators with similar architectures.

My main concern is that this customer has a pretty traditional mindset and never likes being the first deployment of any technology.

NAT64/DNS64 with 464XLAT or something along these lines isn’t new technology; on the contrary, it’s quite mature, and deployed around the world. It isn’t stateless, but it’s much more scalable than sticking stateful firewalls everywhere, heh.

Designing and implementing a broadband access network with this sort of architecture isn’t going to end well. It isn’t beyond the realm of possibility that these ‘requirements’ are largely driven by a supplier of stateful firewalls, or an internal advocate for same.

Amos_Rosenboim · February 3, 2025, 8:14pm

Roland,

Thanks for your comments.

As much as I love to be a network purist who hates state maintenance in the core of the network, the sad reality is that these devices are there and will remain there for the foreseeable future.

Mobile operators need IPv4 address sharing and many of them choose to do it with CGNAT.

Even with IPv6, many of the operators I know of do not allow internet initiated traffic towards their subscribers.
Some of their reasons are even surprisingly valid, such as avoiding unnecessary paging in the network.

Regardless of this, my original message was looking to get some deployment feedback on NPTv6 in service provider networks.
Any such feedback is appreciated.

Cheers,

Amos

Ca_By · February 3, 2025, 8:21pm

Regardless of this, my original message was looking to get some deployment feedback on NPTv6 in service provider networks.
Any such feedback is appreciated.

I do not think any service providers have deployed NPTv6 at any scale.

That’s my feedback. You are building something quite bespoke, not best practice, and should anticipate novel problems.

Dobbins_Roland1 · February 3, 2025, 9:18pm

As much as I love to be a network purist who hates state maintenance in the core of the network, the sad reality is that these devices are there and will remain there for the foreseeable future.

Not on reliable, resilient networks of any significance, they aren’t.

Network operators who deploy them end up removing them, for the reasons previously described. This isn’t an abstract techno-philosophical stance; I’ve seen this happen repeatedly, after significant network outages which resulted from poor design choices.

Mobile operators need IPv4 address sharing

The way to accomplish this is with NAT64/DNS64 with 464XLAT. This approach is used by some of the largest wireless network operators in the world; if it’s good enough for them, it’s good enough for your customer.
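For readers unfamiliar with the addressing side of NAT64/DNS64, here is a minimal sketch of how an IPv4 destination is embedded in an IPv6 address, following the /96 mapping in RFC 6052. It assumes the well-known prefix `64:ff9b::/96`; an operator may use a network-specific prefix instead. The function names are hypothetical, and the sketch covers only the address mapping, not the translator's session state or the CLAT side of 464XLAT.

```python
import ipaddress

# Assumption: the RFC 6052 well-known prefix; operators may choose their own.
NAT64_PREFIX = ipaddress.IPv6Network("64:ff9b::/96")

def synthesize(ipv4):
    """Embed an IPv4 address in the low 32 bits of the NAT64 /96 prefix."""
    v4 = ipaddress.IPv4Address(ipv4)
    return ipaddress.IPv6Address(int(NAT64_PREFIX.network_address) | int(v4))

def extract(ipv6):
    """Recover the IPv4 destination from a synthesized address."""
    v6 = ipaddress.IPv6Address(ipv6)
    if v6 not in NAT64_PREFIX:
        raise ValueError("address is not inside the NAT64 prefix")
    return ipaddress.IPv4Address(int(v6) & 0xFFFFFFFF)

synthesized = synthesize("192.0.2.33")
print(synthesized)           # 64:ff9b::c000:221
print(extract(synthesized))  # 192.0.2.33
```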

NPTv6 is not a viable alternative for mobile operators because it disrupts end-to-end IPv6 connectivity, which can cause problems with IPsec and the like; lacks a built-in IPv4 transition mechanism; and has a significant negative impact on the stability and resiliency of the network. NPTv6 shouldn’t exist; and to the degree that it’s even remotely suitable for any network at all, it’s only for small enterprise endpoint networks which exercise a substantial degree of administrative control over the communications of the nodes on said networks.

Glenn_McGurrin · February 3, 2025, 9:24pm

I feel like you are conflating two things: stateful firewalls and NPTv6 (or any form of NAT). They are often done on the same box, but they are not inherently linked.

I dislike NAT in an IPv6 environment, as I've generally not found a use for it that isn't better served by something else. IPv6 software is also not used to NAT being present; I'd expect much more breakage, given that most IPv6 stacks are likely not well tested in the presence of NAT. IPv4 things have learned how to accept and handle NAT, and some of that I'm sure carries over to IPv6, but in IPv6 it's very much an edge case, where it's the norm in IPv4.

I do understand the desire for a stateful firewall in the IPv6 context, and I see them deployed at the home/business/enterprise network edge on pretty much every IPv6-enabled network, often with a carve-out for ICMP but otherwise blocking all inbound traffic that does not match an existing connection initiated by a device behind the firewall. You may have some reason you can't just employ a stateful firewall without NAT, but if so you haven't said so, and you seem to have linked them as if they were inseparable. A stateful firewall will block internet-initiated traffic, which seems to be your main goal, and it will not have the negative side effects of NAT. You do still need to force symmetric routing at the point of the firewall and carry state. That said, as long as routing symmetry is maintained, a user's traffic can freely use multiple firewalls for different traffic (say, one for traffic headed towards a content provider cache box and a different one for traffic heading onto the general public internet), which may help with some scaling considerations.
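The behaviour described here (allow what the inside initiates, allow matching return traffic, carve out ICMP, drop everything else inbound) can be sketched without any address translation. This is a toy model with hypothetical names; it has no timeouts, no TCP state tracking and no ICMPv6 type filtering, all of which a real firewall would need.

```python
class StatefulFilter:
    """Toy connection tracker: no NAT, just a table of flows started from inside."""

    def __init__(self):
        self.flows = set()

    def outbound(self, src, sport, dst, dport, proto):
        # Traffic from the subscriber side is always allowed and creates state.
        self.flows.add((proto, src, sport, dst, dport))
        return True

    def inbound(self, src, sport, dst, dport, proto):
        # Carve-out: ICMPv6 is passed regardless of state in this sketch.
        if proto == "icmpv6":
            return True
        # Otherwise allow only the exact reverse of a flow seen outbound.
        return (proto, dst, dport, src, sport) in self.flows

fw = StatefulFilter()
fw.outbound("2001:db8:a::10", 40000, "2001:db8:ffff::1", 443, "tcp")

reply_allowed = fw.inbound("2001:db8:ffff::1", 443, "2001:db8:a::10", 40000, "tcp")
unsolicited_allowed = fw.inbound("2001:db8:ffff::2", 443, "2001:db8:a::10", 40000, "tcp")
icmp_allowed = fw.inbound("2001:db8:ffff::2", 0, "2001:db8:a::10", 0, "icmpv6")
```

The state table is what forces symmetric routing: the reply is only recognised by the device that saw the outbound packet.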

Aaron · February 3, 2025, 10:14pm

My CGNAT domains for residential broadband (DSL, CM, FTTH) for IPv4 were created years ago as MPLS-based L3VPNs. I've tested and proven an architecture whereby I advertise another BGP RT and allow the IPv6 dual-stacked portion to "flow around" the CGNAT boundary and naturally route out to the Internet, un-NATted...the way God intended IP end-to-end communications to be.

-Aaron

Brandon_Martin · February 3, 2025, 11:39pm

Even with IPv6, many of the operators I know of do not allow internet initiated traffic towards their subscribers.

Address translation is not required for this function. A stateless ACL can do a lot to limit it, especially combined with assigning IPv6 addresses that are not easily guessed or otherwise probed (i.e. use all of that entropy in the least significant 64 bits of the address to your advantage).
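As rough arithmetic on the entropy point, the sketch below estimates how long a blind sweep of one /64 would take. The probe rate of one million packets per second is an arbitrary assumption, and the estimate only holds if interface identifiers really are random rather than sequential or derived from something guessable.

```python
SECONDS_PER_YEAR = 365 * 24 * 3600

def years_to_sweep(host_bits, probes_per_second):
    """Time to probe every address in a space of 2**host_bits, in years."""
    return (2 ** host_bits) / probes_per_second / SECONDS_PER_YEAR

# Assumed scanner rate: 1,000,000 probes per second.
ipv6_years = years_to_sweep(64, 1_000_000)  # one IPv6 /64
ipv4_years = years_to_sweep(32, 1_000_000)  # the entire IPv4 space, for contrast
print(f"/64 sweep: about {ipv6_years:,.0f} years")
print(f"IPv4 sweep: about {ipv4_years * SECONDS_PER_YEAR / 3600:.1f} hours")
```

At that rate a single /64 takes on the order of hundreds of thousands of years to sweep, while the whole IPv4 space takes a little over an hour.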

If you must fully inhibit all unsolicited inbound traffic including that which could, upon stateless inspection, be part of a valid flow, a stateful filter at the appropriate point can accomplish this again without address translation. I don't know if any major mobile networks actually do this on IPv6. I can't imagine it's really necessary on IPv6.

Of course, address (and port) translation is a fact of life on consumer access networks for IPv4, these days. There are ways to make much of the stateful part live in places where failure will result in limited damage, and I'm fond of using them where possible.

I really don't see a compelling argument for NPTv6 based on what you've described, and all the usual arguments against it still apply.

avoiding unnecessary paging in the network.

While a laudable goal, is this really THAT big of a deal these days?

I'd wager the legitimate non-interactive traffic on a typical consumer mobile device (social media, instant messaging services, etc.) probably causes plenty of "paging" anyway and likely quite a bit more than you'd get from unsolicited traffic on a high-entropy IPv6 address with absolutely zero filtering. I was of the impression that modern LTE/5G-NR networks had lots of mitigations for the handset-side power implications of this, too.

Amos_Rosenboim · February 4, 2025, 5:25am

Thank you.
I am not building it yet… still considering it.

The functional problems I am considering are in the area of ALGs.

What other problems do you anticipate?

Regards

Dobbins_Roland1 · February 4, 2025, 10:08am

All the issues mentioned earlier in this thread.

There are multiple techniques available to ameliorate the side-effects of aggressive scanning in a network using NAT64/DNS64 with 464XLAT.