tunnel PMTUD with mss adjustment

Joe_Maimon1 · July 13, 2004, 12:10pm

Hello All,

I have been talking to "Company C" TAC, trying to understand whether this is a problem.

(
For reference to some things mentioned here see

)
1) C has a command to adjust the TCP MSS option downward on packets that traverse an interface.
2) C has a command to set the IP MTU on an interface.
3) C has a command that enables an IPSEC/GRE tunnel to conduct PMTUD on its path (by copying the DF bit from the encapsulated packet into the resulting packet).

I have been trying to convince TAC that 1 and 3 might not work properly together, and that this is a problem.

I gave them this scenario.

Host A ================= HOST D

MTU 1500

Router A
( || MTU 1492 )
( ISP A ) IPSEC/GRE Tunnel A
( || MTU 1492 ) Initial MTU 1432
(ISP B )
( || MTU 1492)
Router B

Router C ============= HOST C (PMTUD works)

Router D

Firewall A (Breaks PMTUD)

Host B

Router A and Router B are both configured with:

int tunnel 0
tunnel path-mtu-discover
! Physical MTU (pppoe) - GRE - IPSEC transport mode
ip mtu 1432
ip tcp adjust-mss 1392

Now let's assume that ISP B lowers the MTU on its link to ISP A to 1476 bytes, that TUNNEL A detects this, and that both ROUTER A and ROUTER B lower their tunnel MTU during an exchange of packets between HOST D and HOST C (which are configured for PMTUD and have the DF bit set).

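Now the tunnel MTU is effectively 1416: the path lost 16 bytes (1492 down to 1476), so the tunnel loses the same 16 bytes (1432 down to 1416).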
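
To make the arithmetic explicit, here is a rough Python sketch. The 60-byte overhead is simply inferred from the figures in this post (1492 - 1432), not from a header-by-header breakdown, and the function name is made up for illustration.

```python
# Figures taken from the scenario in this post.
PHYSICAL_MTU = 1492            # PPPoE path MTU before ISP B changes anything
CONFIGURED_TUNNEL_MTU = 1432   # "ip mtu 1432" on the tunnel interface

# Assumption: the encapsulation overhead is whatever separates the two values.
TUNNEL_OVERHEAD = PHYSICAL_MTU - CONFIGURED_TUNNEL_MTU  # 60 bytes


def effective_tunnel_mtu(path_mtu: int) -> int:
    """Largest inner packet that still fits once the tunnel overhead is added."""
    return min(CONFIGURED_TUNNEL_MTU, path_mtu - TUNNEL_OVERHEAD)


print(effective_tunnel_mtu(1492))  # 1432, the configured value
print(effective_tunnel_mtu(1476))  # 1416, after tunnel PMTUD learns the new path MTU
```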

HOST A sends a packet to HOST B with the MSS option adjusted to 1392, and HOST B sends an IP packet of length 1432 back to HOST A. Router B drops that packet, because it has DF set (HOST B is configured to do PMTUD) and it is 16 bytes larger than the current tunnel MTU. Router B then sends an ICMP unreachable, which gets blocked by FIREWALL A. At that point HOST A finds itself unable to communicate with HOST B because of a PMTUD blackhole.
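
The failure can be modelled in a few lines of Python. This is only a toy model of the decision Router B makes at the tunnel, not anything taken from the vendor's implementation. The `Packet` class, the `tunnel_ingress` function, and the 40-byte header figure (IPv4 plus TCP, no options) are assumptions made for the sketch.

```python
from dataclasses import dataclass

IP_TCP_HEADERS = 40  # assumption: 20-byte IPv4 header + 20-byte TCP header


@dataclass
class Packet:
    length: int  # total IP packet length in bytes
    df: bool     # Don't Fragment bit


def tunnel_ingress(packet: Packet, tunnel_mtu: int, icmp_reaches_sender: bool) -> str:
    """Toy model of what happens to a packet entering the tunnel."""
    if packet.length <= tunnel_mtu:
        return "forwarded"
    if not packet.df:
        return "fragmented"
    # DF is set and the packet is too big: the router drops it and sends an
    # ICMP unreachable. Whether the sender ever sees it depends on the path.
    if icmp_reaches_sender:
        return "dropped, sender notified"
    return "dropped silently (blackhole)"


clamped_mss = 1392  # from "ip tcp adjust-mss 1392"
reply = Packet(length=clamped_mss + IP_TCP_HEADERS, df=True)  # 1432 bytes

# FIREWALL A blocks the ICMP, so icmp_reaches_sender is False for HOST B.
print(tunnel_ingress(reply, tunnel_mtu=1432, icmp_reaches_sender=False))
# forwarded
print(tunnel_ingress(reply, tunnel_mtu=1416, icmp_reaches_sender=False))
# dropped silently (blackhole)
```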

So in this scenario, ip tcp adjust-mss fails to achieve its stated goal of minimizing PMTUD blackholes by aggressively limiting the PMTU to a known interface MTU size. It would be reasonable to expect the tunnel layer to inform the mss-adjust layer that the original assumption about the interface MTU is no longer valid, so that it can behave accordingly.

Had the adjustment of the MSS option in the packet from HOST A to HOST B taken into account the now 16 bytes lower tunnel mtu, and adjusted to 1376 instead of 1392, the packet from HOST B would have been sized at 1416 and would have traversed (hopefully) to HOST A safely.
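
As a sketch of the behaviour I would have expected, here is the clamp computed from the current tunnel MTU rather than from a fixed configured value. The function name `clamp_mss` is hypothetical, and both the 40-byte header figure and HOST A's advertised MSS of 1460 (from its 1500-byte MTU) are assumptions.

```python
IP_TCP_HEADERS = 40  # assumption: IPv4 + TCP headers without options


def clamp_mss(advertised_mss: int, current_tunnel_mtu: int) -> int:
    """Lower the MSS option so a full-sized segment fits the tunnel MTU as it is now."""
    return min(advertised_mss, current_tunnel_mtu - IP_TCP_HEADERS)


# HOST A sits on a 1500-byte link, so it would typically advertise 1460.
print(clamp_mss(1460, 1432))  # 1392, matches the static adjust-mss value
print(clamp_mss(1460, 1416))  # 1376, what the clamp should become after PMTUD

# With an MSS of 1376, HOST B's largest packet is 1376 + 40 = 1416 bytes,
# which fits the lowered tunnel MTU without relying on ICMP getting through.
```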

At this point I am just a tad confused, so I was wondering if any NANOGers had some light they could shed on this.

Thanks,
Joe