[Story] When IPv6 Fixes IPv4 Peering Issues

Hi all,

So, fun story for you all, and a good lesson in why spending the time to set up IPv6 can save your ass in a pinch.

The players in this story are me (and the company I consult for when they have problems like this), Comcast (gig biz fiber), and CenturyLink (1/4 gig biz fiber).

Now, some of you who have Comcast and/or CenturyLink/Lumen probably remember issues last year regarding IPSec traffic getting heavily fouled up at peering points somewhere. And, if you were like me, you probably remember that it was, well, let's be honest, effectively impossible to get it looked into or dealt with.

We resolved the issue ourselves between the three offices by switching to WireGuard, which magically made the problems go away.

Things had been going great until last week, when we noticed that one of our WireGuard peers, between CL/Lumen in Cheyenne and Comcast in Denver, was down. Packets from Den -> Cys were going through, but not Cys -> Den. Cys -> Boise on CL was still working perfectly fine and was acting as a backup connection to the Den office.

I did my usual testing - changed ports, same behavior; changed IPs on the WireGuard endpoints on each end, same behavior. I even temporarily changed the destination of the tunnel on the Cys end to another off-network node, and packets went through, so we knew it had to be something related to traffic going CL/Lumen -> Comcast.

The weird thing was, I could dump iperf UDP traffic over the same ports from the same devices Cys -> Den, and the packets would go through perfectly fine... So it sounds like there's some sort of throttling or IDS in the way somewhere toying with things.
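(For the curious, the test was nothing fancy; roughly the following in iperf3 syntax, with a placeholder address and whatever port the tunnel was using at the time:)

    # Den side: listen for the UDP test on the tunnel's port
    iperf3 -s -p 51820

    # Cys side: push UDP at that same address/port and watch it arrive just fine
    iperf3 -u -c 203.0.113.10 -p 51820 -b 50M -t 30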

As expected, our first dealing with Comcast was less than spectacular: the tech tried to tell us that the live IPs they had assigned us wouldn't work for VPN traffic because they were a /27 (what?). I had to walk away from that call and let my partner finish it.

We went to dinner, and as we were returning home and pulling into the driveway, I remembered we had 'wasted' (as some of you would put it) a bunch of time setting up IPv6 on the outward facing devices at each office... including the WireGuard boxes.

I quickly reconfigured the Cys WireGuard node to connect to the Den node over IPv6 and, after WireGuard did its magic dynamically reconfiguring endpoints, suddenly the connection was back up and routing at full speed. Hell yeah!
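(For anyone who wants the gist of the change, it was just repointing the existing peer at the Den node's v6 address; something like the following, where the key, address, and port are all placeholders:)

    # Cys side: swap the peer's endpoint from the Den node's v4 address to its v6 address
    wg set wg0 peer 'DEN_NODE_PUBLIC_KEY=' endpoint '[2001:db8:100::10]:51820'

    # sanity check that the endpoint and latest handshake updated
    wg show wg0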

So, moral / TLDR of the story?

Don't discount taking the time to set up IPv6, even if it's just for your important devices. Also, WireGuard > IPsec.

-- Brie

This is great to hear. I know that many things will operate better in IPv6 land vs. IPv4 land, as in IPv4 land there's a lot of port-based filtering that happens in networks, which isn't the same in IPv6 land.

Super glad to hear that IPv6 saved the day for you!

- Jared

Curious to know if the 'saved the day' was really 'fell into a different rate-limit bucket' for UDP by address family :)

Who knows... Hard to know if I'm taking a different network path or just going through the same routers but bypassing whatever is blocking the packets.

I should probably clarify that I'm not having slowdowns Cys -> Den. I'm having complete and total loss of packets for that one stream/type of traffic.

I went so far as to even randomize source/dest ports on each end in case it was some sort of misguided filtering (i.e., port 12345 -> 54321 instead of 12345 -> 12345).
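(That was just shuffling ListenPort and the peer Endpoint around on both boxes; roughly the following, with made-up ports and a placeholder key/address:)

    # Cys side: move the local listen port and aim at a different remote port
    wg set wg0 listen-port 54321
    wg set wg0 peer 'DEN_NODE_PUBLIC_KEY=' endpoint '203.0.113.10:12345'

    # Den side mirrors it the other way around: listen on 12345, endpoint port 54321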

I have working tunnels on CL/Lumen's own network from Cys to multiple endpoints in Boi, so I know it's not something on our side in Cys.

> Who knows... Hard to know if I'm taking a different network path or just going through the same routers but bypassing whatever is blocking the packets.

You might be able to infer that from the hops that show up in traceroutes.
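Something along these lines, run back to back, might at least show whether the v4 and v6 paths diverge (addresses and port are placeholders; flags as on a typical Linux traceroute):

    # UDP traceroute toward the far end over v4, then v6, using the tunnel port
    traceroute -4 -U -p 51820 203.0.113.10
    traceroute -6 -U -p 51820 2001:db8:100::10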

> I went so far as to even randomize source/dest ports on each end in case it was some sort of misguided filtering (i.e., port 12345 -> 54321 instead of 12345 -> 12345).

That really makes it odd. If you tried changing the src/dst ports with v4, and your VPN traffic still would not pass, but iperf would, what kind of filtering/breakage could CL or Comcast have that would just stop your VPN traffic regardless of ports?

I wonder if this is a misguided/misbehaving filter rule, or something actually broken eating your packets. Having dealt with something far stranger many years ago, I totally get the aspects of "big telco can't help / won't let you even talk to someone who can understand your explanation of the issue" and "I don't care how I solve this, as long as I can make the issue go away."

so v6 is the new oob. love it. actually, have used it as such. saved
the day. of course, using v4 transport to help debug v6 is far more
common.

> This is great to hear. I know that many things will operate better in
> IPv6 land vs. IPv4 land, as in IPv4 land there's a lot of port-based
> filtering that happens in networks, which isn't the same in IPv6 land.

uh, is this good news? seeing that v6 often goes directly through to
end hosts, as opposed to being natted (think soho and home cpe), isn't
port checking even more desirable?

randy

>> Who knows... Hard to know if I'm taking a different network path or just going through the same routers but bypassing whatever is blocking the packets.

> You might be able to infer that from the hops that show up in traceroutes.

So, I was looking at that, and the lack of rDNS on many IPv6 hops is... aggravating. CL/Lumen does embed IPv4 addresses in some of their IPv6 router addresses, but not in all of them.

So, not really enough to determine whether I am, unfortunately.

>> I went so far as to even randomize source/dest ports on each end in case it was some sort of misguided filtering (i.e., port 12345 -> 54321 instead of 12345 -> 12345).

> That really makes it odd. If you tried changing the src/dst ports with v4, and your VPN traffic still would not pass, but iperf would, what kind of filtering/breakage could CL or Comcast have that would just stop your VPN traffic regardless of ports?

> I wonder if this is a misguided/misbehaving filter rule, or something actually broken eating your packets. Having dealt with something far stranger many years ago, I totally get the aspects of "big telco can't help / won't let you even talk to someone who can understand your explanation of the issue" and "I don't care how I solve this, as long as I can make the issue go away."

One thing I did not think about before reading your response was the size of the UDP packets. It's possible something's blocking UDP packets below a certain size.

iperf is going to send packets of a consistent size for testing, whereas WG is likely to have varying sizes depending on the contents of the packets.

Maybe I need to do some testing with varying packet sizes.
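(Probably something like sweeping the UDP payload length with iperf3 and noting which sizes never make it; placeholder address/port again:)

    # step through a few UDP payload sizes toward Den and see which ones disappear
    for len in 64 128 256 512 1024 1400; do
        iperf3 -u -c 203.0.113.10 -p 51820 -l $len -b 1M -t 5
    done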