Hello everyone,
Trying to get my head around a certain unexpected behaviour.
I am running two site-to-site VPNs (WireGuard now, OpenVPN earlier) between my home and a remote server over two different WAN links. Both WAN links are ordinary consumer connections: one with a public IP and one behind CGNAT.
Redundancy is handled by OSPF, running via FRR on both ends.
The unexpected behaviour is this: if I set the OSPF cost to prefer, say, link 1 for home -> server and link 2 for server -> home, connectivity between the routed pools breaks completely. The point-to-point IPs stay reachable (that traffic goes over the expected links, i.e. symmetric at both ends). As long as both ends prefer link 1, or both prefer link 2, it works fine. At first I thought it had something to do with NAT, but I still can’t see how. Since the VPN tunnels have a keep-alive timer (10 seconds), the tunnels are always up. Any idea why asymmetric packets are being dropped here?
I saw this exact behaviour with the earlier OpenVPN + bird + iBGP setup, and it is still the same now that I have moved everything to WireGuard for VPN + FRR for routing + OSPF.
Thanks.
Could it be as simple as a stateful firewall?
My guess is something is doing stateful filtering. If you send a SYN down one link and the SYN-ACK comes back on a different link, the receiving firewall will discard it as bogus. You should be able to test this by taking pcaps to confirm the traffic is arriving (though I’m not familiar with WireGuard, so maybe not), and you should be able to disable it by setting a rule or unchecking a box in your firewall.
Linux, as many distributions configure it by default (regardless of firewall rules), will not accept a packet on an interface when the source of that packet “should” be on another interface according to the current route table (in other words, when you’re doing asymmetric routing).
Easy fix:
# Controls source route verification (0 = no reverse-path filtering)
net.ipv4.conf.default.rp_filter = 0
# Accept source-routed packets (1 = accept)
net.ipv4.conf.default.accept_source_route = 1
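Before changing anything, it can help to see what is actually in effect on each interface. The kernel documentation states that the maximum of conf/all/rp_filter and conf/{interface}/rp_filter is used for source validation on that interface, and the `default` key only seeds interfaces created afterwards. Below is a minimal Python sketch that reads those values; the function name `effective_rp_filter` and its `conf_dir` parameter are made up for illustration.

```python
from pathlib import Path


def effective_rp_filter(conf_dir="/proc/sys/net/ipv4/conf"):
    """Return {interface: effective rp_filter mode} for every interface.

    The effective mode is max(conf/all, conf/<interface>), as described
    in the kernel's ip-sysctl documentation.
    """
    base = Path(conf_dir)

    def read(name):
        # Each rp_filter file holds a single integer: 0, 1 or 2.
        return int((base / name / "rp_filter").read_text().strip())

    all_value = read("all")
    result = {}
    for entry in sorted(base.iterdir()):
        # "all" and "default" are pseudo-entries, not real interfaces.
        if entry.name in ("all", "default"):
            continue
        if not (entry / "rp_filter").is_file():
            continue
        result[entry.name] = max(all_value, read(entry.name))
    return result
```

Calling `effective_rp_filter()` with no arguments on each endpoint lists the mode per interface. Any tunnel interface that reports 1 is in strict mode and is a candidate for dropping asymmetric traffic.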
> Hello everyone,
Hi,
> I am running two site-to-site VPNs (WireGuard now, OpenVPN earlier) between my home and a remote server over two different WAN links. Both WAN links are ordinary consumer connections: one with a public IP and one behind CGNAT.
Okay.
Is there any filtering of the traffic that flows through the VPNs? Or do things have full connectivity through them?
What OS is on each of the VPN endpoints?
> Redundancy is handled by OSPF, running via FRR on both ends.
Okay.
> The unexpected behaviour is this: if I set the OSPF cost to prefer, say, link 1 for home -> server and link 2 for server -> home, connectivity between the routed pools breaks completely.
O.o
> The point-to-point IPs stay reachable (that traffic goes over the expected links, i.e. symmetric at both ends).
Please clarify: are those IPs inside the VPN or outside the VPN?
> As long as both ends prefer link 1, or both prefer link 2, it works fine.
Okay.
> At first I thought it had something to do with NAT, but I still can't see how. Since the VPN tunnels have a keep-alive timer (10 seconds), the tunnels are always up.
Is NAT or SPI being applied to the traffic flowing through the VPN?
> Any idea why asymmetric packets are being dropped here?
Not enough data to speculate yet.
> I saw this exact behaviour with the earlier OpenVPN + bird + iBGP setup, and it is still the same now that I have moved everything to WireGuard for VPN + FRR for routing + OSPF.
Can I ask why you changed the VPN technology, routing daemon, and protocol all at the same time? Or was that a diagnostic step?
Hi
I did disable the firewall at both ends to test, and the result was the same. Please note that the firewall rules allow the UDP ports needed to establish the VPN links, and inside the links there aren’t any firewall restrictions.
However, as I said, I wonder whether the CGNAT device on my link 2 will allow inbound traffic on the established link.
This is probably enabled on one or both ends:
http://tldp.org/HOWTO/Adv-Routing-HOWTO/lartc.kernel.rpf.html
Do some distros enable this now?
I thought it was disabled by default.
Disable it.
Or make sure it's using loose (2) filtering.
rp_filter - INTEGER
    0 - No source validation.
    1 - Strict mode as defined in RFC3704 Strict Reverse Path.
        Each incoming packet is tested against the FIB and if the interface
        is not the best reverse path the packet check will fail.
        By default failed packets are discarded.
    2 - Loose mode as defined in RFC3704 Loose Reverse Path.
        Each incoming packet's source address is also tested against the FIB
        and if the source address is not reachable via any interface
        the packet check will fail.
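To make the three modes concrete, here is a small Python model of the check, applied to a setup like the one described. It is a simplified sketch and not the kernel's implementation: the interface names (`wg1`, `wg2`), the prefixes, and the helper names are all hypothetical, and the FIB is reduced to a longest-prefix-match list.

```python
import ipaddress


def best_route(fib, addr):
    """Return the interface of the longest-prefix route to addr, or None."""
    ip = ipaddress.ip_address(addr)
    matches = [(ipaddress.ip_network(prefix), iface)
               for prefix, iface in fib
               if ip in ipaddress.ip_network(prefix)]
    if not matches:
        return None
    # The most specific (longest) prefix wins.
    return max(matches, key=lambda m: m[0].prefixlen)[1]


def rp_filter_accepts(fib, mode, src, in_iface):
    """Model the rp_filter decision for a packet from src arriving on in_iface."""
    if mode == 0:              # no source validation
        return True
    best = best_route(fib, src)
    if mode == 1:              # strict: must arrive on the best reverse path
        return best == in_iface
    if mode == 2:              # loose: source must be reachable via some interface
        return best is not None
    raise ValueError(f"unknown rp_filter mode: {mode}")


# Hypothetical FIB on the server: OSPF has installed the home LAN via wg2
# (the server prefers link 2), while home sends its traffic over wg1.
SERVER_FIB = [
    ("10.0.1.0/30", "wg1"),      # point-to-point subnet of tunnel 1
    ("10.0.2.0/30", "wg2"),      # point-to-point subnet of tunnel 2
    ("192.168.10.0/24", "wg2"),  # home LAN, best path via link 2
]
```

In this model, a packet from a home LAN host arriving on `wg1` fails the strict check, because the best route back to it points at `wg2`, whereas a packet from the tunnel's own point-to-point address passes, because its connected route is on the arriving interface. That would be consistent with the symptoms reported: routed pools unreachable, point-to-point IPs fine. Modes 0 and 2 accept both packets.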
The issue is resolved by tweaking the route validation.
I added the following to my Ansible playbook for both ends:

- name: Disable source route verification (reverse-path filtering)
  sysctl:
    name: net.ipv4.conf.default.rp_filter
    value: '0'
    sysctl_set: yes

- name: Accept source-routed packets
  sysctl:
    name: net.ipv4.conf.default.accept_source_route
    value: '1'
    sysctl_set: yes

and it works fine now.
Thanks, everyone for the inputs.