SLAAC in renumbering events

Fernando_Gont2 · March 8, 2019, 11:32am

Folks,

If you follow the 6man working group of the IETF you may have seen a
bunch of emails on this topic, on a thread resulting from an IETF
Internet-Draft we published with Jan Žorž about "Reaction of Stateless
Address Autoconfiguration (SLAAC) to Renumbering Events" (Available at:
https://github.com/fgont/draft-slaac-renum/raw/master/draft-gont-6man-slaac-renum-02.txt
)

Short version of the story:

There are a number of scenarios where SLAAC hosts may end up using stale
configuration information.

For example, a typical IPv6 deployment scenario is one in which a CPE
router requests an IPv6 prefix from an ISP via DHCPv6-PD, and advertises a
sub-prefix of the leased prefix on the LAN side, via SLAAC. In such
scenarios, if the CPE router crashes and reboots, it may lose all
information about the previously-leased prefix. Upon reboot, the CPE
router may be leased a new prefix that will result in a new sub-prefix
being advertised on the LAN side of the CPE router.

As a result, hosts will normally configure addresses for the
newly-advertised prefix, but will normally also keep (and use) the
previously-configured (and now stale!) IPv6 addresses, leading to
interoperability problems.
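
To make the timing concrete, here is a minimal sketch (not taken from the draft) of how long a host would keep the old address if the prefix is simply never advertised again. The `SlaacAddress` class and its field names are hypothetical; the lifetimes are the RFC 4861 router defaults (7 days preferred, 30 days valid), and the prefixes are documentation prefixes chosen for illustration.

```python
from dataclasses import dataclass

# Default lifetimes from RFC 4861 (router configuration variables), in seconds.
ADV_PREFERRED_LIFETIME = 7 * 24 * 3600   # AdvPreferredLifetime: 7 days
ADV_VALID_LIFETIME = 30 * 24 * 3600      # AdvValidLifetime: 30 days


@dataclass
class SlaacAddress:
    """Hypothetical host-side record for one autoconfigured address."""
    prefix: str
    last_refreshed: float  # time (s) of the last RA that advertised the prefix
    preferred_lifetime: int = ADV_PREFERRED_LIFETIME
    valid_lifetime: int = ADV_VALID_LIFETIME

    def state(self, now: float) -> str:
        """Return the address state at time `now`, given no further RAs."""
        age = now - self.last_refreshed
        if age >= self.valid_lifetime:
            return "invalid"
        if age >= self.preferred_lifetime:
            return "deprecated"
        return "preferred"


HOUR, DAY = 3600, 24 * 3600

# Prefix advertised before the crash; never refreshed or deprecated afterwards.
old = SlaacAddress("2001:db8:1:1::/64", last_refreshed=0)
# Prefix advertised after the reboot, one hour later.
new = SlaacAddress("2001:db8:2:1::/64", last_refreshed=1 * HOUR)

for label, now in [("2 hours", 2 * HOUR), ("8 days", 8 * DAY), ("31 days", 31 * DAY)]:
    print(label, "after the crash, old address is", old.state(now))
```

Under these assumptions the stale address stays "preferred" alongside the new one for a week, and remains usable (as "deprecated") for up to 30 days.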

The RIPE-690 BCOP document had originally tried to address this problem
by recommending that operators lease stable IPv6 prefixes to CPE routers.
However, for a variety of reasons ISPs may not be able (or may not want)
to lease stable prefixes, and may instead lease dynamic prefixes.

Most of the voices on the 6man wg mailing-list fell into one of the
following camps:

* "ISPs should be leasing stable prefixes -- if they don't, they are
asking for trouble!"

* "CPE routers should record leased prefixes on stable storage, such
that they can 'deprecate' such prefixes upon restart -- if they
don't, they are asking for trouble!"

* "No matter whose fault this is (if there is any single party to blame
in the first place), we should improve the robustness of IPv6
deployments"

Our Internet-Draft tries to improve the current state of affairs via the
following improvements:

* Allow hosts to gracefully recover from stale network configuration
information -- i.e., detect and discard stale network configuration
information

* Have SLAAC routers employ more appropriate timers, such that
information is phased-out in a timelier manner -- unless it is
actively refreshed by Router Advertisement messages

* Specify the interaction between DHCPv6-PD and SLAAC -- which was
rather under-specified

* Require CPE routers to store leased prefixes on stable storage, and
deprecate stale prefixes (if necessary) upon restart
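
As a rough illustration of the last item, the sketch below shows one way a CPE could record the leased prefix and, after a restart, build the set of prefixes to advertise. This is an assumption-laden sketch, not the procedure specified in the draft: the JSON file layout, the function names, and the choice of zero lifetimes for the stale prefix are all hypothetical.

```python
import json
import os
import tempfile


def load_recorded_prefix(path):
    """Return the prefix recorded before the last restart, or None."""
    try:
        with open(path) as f:
            return json.load(f)["prefix"]
    except (FileNotFoundError, KeyError, TypeError, ValueError):
        return None


def record_prefix(path, prefix):
    """Write the prefix to stable storage, replacing the old file atomically."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w") as f:
        json.dump({"prefix": prefix}, f)
        f.flush()
        os.fsync(f.fileno())
    os.replace(tmp, path)


def prefixes_to_advertise(path, leased_prefix,
                          preferred_lifetime=604800, valid_lifetime=2592000):
    """Build the Prefix Information entries to send after (re)start."""
    previous = load_recorded_prefix(path)
    entries = []
    if previous is not None and previous != leased_prefix:
        # Stale prefix: advertise it once more with zero lifetimes so that
        # hosts stop preferring it (illustrative values, assumed here).
        entries.append({"prefix": previous,
                        "preferred_lifetime": 0,
                        "valid_lifetime": 0})
    entries.append({"prefix": leased_prefix,
                    "preferred_lifetime": preferred_lifetime,
                    "valid_lifetime": valid_lifetime})
    record_prefix(path, leased_prefix)
    return entries
```

For example, calling `prefixes_to_advertise(path, "2001:db8:1:1::/64")` and later, after a simulated reboot, `prefixes_to_advertise(path, "2001:db8:2:1::/64")` yields a second result whose first entry deprecates the old prefix.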

We are looking forward to more input on the document (or any comments on
the issue being discussed), particularly from operators.

So feel free to send your comments on or off list, as you prefer.

Thanks!

Cheers,

William_Allen_Simps3 · March 9, 2019, 2:51pm

Folks,

If you follow the 6man working group of the IETF you may have seen a
bunch of emails on this topic, on a thread resulting from an IETF
Internet-Draft we published with Jan Žorž about "Reaction of Stateless
Address Autoconfiguration (SLAAC) to Renumbering Events" (Available at:
https://github.com/fgont/draft-slaac-renum/raw/master/draft-gont-6man-slaac-renum-02.txt
)

[...]

We are looking forward to more input on the document (or any comments on
the issue being discussed), particularly from operators.

So feel free to send your comments on/off list as you prefer

Thanks for bringing this to the attention of operators. Too few IETF
documents have operational considerations.

Masataka_Ohta · March 9, 2019, 11:02pm

Fernando Gont wrote:

There are a number of scenarios where SLAAC hosts may end up using stale
configuration information.

That's because SLAAC maintains address configuration state in
a fully distributed manner without any authority, which is the
worst possible way to do so.

The only reasonable solution is to ban SLAAC.

Masataka Ohta

William_Herrin · March 10, 2019, 4:54pm

Hi Fernando,

I’m a little confused here. I can certainly see why the default timeout of 30 days is a problem, but doesn’t the host lose the route from the RA sooner? Why would an IPv6 host originate connections from an address for which it has no corresponding route? Isn’t that broken source address selection?

I’d love to see that addressed in your draft.

Obviously having the router always explicitly expire the old addresses is a non-starter. There’s no certainty that the router knows what the old addresses were, that it’s even the same piece of equipment or that all the hosts will see the packet if it does manage to send one.

Regards,
Bill Herrin

Fernando_Gont2 · March 10, 2019, 8:53pm

Hi, Bill,

Thanks for the feedback! In-line....

    If you follow the 6man working group of the IETF you may have seen a
    bunch of emails on this topic, on a thread resulting from an IETF
    Internet-Draft we published with Jan Žorž about "Reaction of Stateless
    Address Autoconfiguration (SLAAC) to Renumbering Events" (Available at:
    https://github.com/fgont/draft-slaac-renum/raw/master/draft-gont-6man-slaac-renum-02.txt
     )

Hi Fernando,

I'm a little confused here. I can certainly see why the default timeout
of 30 days is a problem, but doesn't the host lose the route from the RA
sooner?

Which route?

Configuration of addresses is mostly a different business than acquiring
routes. So, in the typical scenario where the CPE crashes and reboots,
hosts will even have a default route -- advertised by the router that
crashed and rebooted.

If you are referring to the "on-link" route -- i.e., the route
introduced because the Prefix Information Option had the "L" bit set --
then I don't think there's anything in the standard to actually
garbage-collect such routes.

Why would an IPv6 host originate connections from an address for
which it has no corresponding route? Isn't that broken source address
selection?

Please see above.

The mechanism we specified in Section 5.1.3 of our draft tries to do
exactly that: Try to detect when a previously-advertised prefix has
become stale... and when it's inferred to be stale, just remove all the
corresponding information.
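
To give a feel for what such host-side detection could look like, here is a minimal sketch. It is not the algorithm from Section 5.1.3; it assumes a simple heuristic in which a prefix is inferred to be stale once the router that used to advertise it has sent a number of consecutive RAs that omit it. The class name, the method name, and the threshold of 2 are all hypothetical.

```python
from collections import defaultdict


class StalePrefixDetector:
    """Hypothetical heuristic: flag prefixes a router has stopped advertising."""

    def __init__(self, threshold=2):
        self.threshold = threshold
        # router address -> {prefix: consecutive RAs that omitted the prefix}
        self.missed = defaultdict(dict)

    def on_router_advertisement(self, router, advertised_prefixes):
        """Process one RA and return the prefixes now inferred to be stale."""
        advertised = set(advertised_prefixes)
        known = self.missed[router]
        stale = []
        for prefix in list(known):
            if prefix in advertised:
                known[prefix] = 0          # refreshed: reset the counter
            else:
                known[prefix] += 1         # omitted from this RA
                if known[prefix] >= self.threshold:
                    stale.append(prefix)
                    del known[prefix]      # caller removes addresses/routes
        for prefix in advertised:
            known.setdefault(prefix, 0)    # start tracking new prefixes
        return stale


detector = StalePrefixDetector(threshold=2)
detector.on_router_advertisement("fe80::1", ["2001:db8:1:1::/64"])  # before crash
detector.on_router_advertisement("fe80::1", ["2001:db8:2:1::/64"])  # after reboot
print(detector.on_router_advertisement("fe80::1", ["2001:db8:2:1::/64"]))
```

On the third RA in this example the old prefix is reported as stale, at which point the host would remove the corresponding addresses and on-link route.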

Regarding source address selection: some have suggested that this issue
should be addressed there. However, there are a number of problems with
this.

First, if you prioritize addresses from the prefix that was last advertised,
then source addresses are guaranteed to flap -- and in the case of
multi-prefix networks, this would become a troubleshooting nightmare.
Secondly, if you don't remove the on-link route for the stale prefix,
then packets meant for the new "owners" of that prefix will be assumed to
be on-link, and hence communication will fail. This should probably be
an indication that the solution is not to avoid using the stale
information, but rather to discard it in a timelier manner.
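
The on-link problem can be illustrated with a small sketch of next-hop determination. This is a simplification under stated assumptions: the `next_hop` function and the addresses are hypothetical, and a real host consults its prefix list and neighbor cache rather than a plain list.

```python
import ipaddress


def next_hop(destination, on_link_prefixes, default_router):
    """Return "on-link" if the destination matches an on-link prefix,
    otherwise the default router (simplified next-hop determination)."""
    dst = ipaddress.ip_address(destination)
    for prefix in on_link_prefixes:
        if dst in ipaddress.ip_network(prefix):
            return "on-link"  # host tries to resolve the destination locally
    return default_router


STALE = "2001:db8:1:1::/64"   # prefix from before the reboot
NEW = "2001:db8:2:1::/64"     # prefix advertised after the reboot
remote_host = "2001:db8:1:1::53"  # now belongs to another customer

# Stale on-link route still present: the packet never reaches the router.
print(next_hop(remote_host, [STALE, NEW], "fe80::1"))
# Stale route discarded: the packet is sent to the default router.
print(next_hop(remote_host, [NEW], "fe80::1"))
```

With the stale prefix still in the list, traffic to the prefix's new owner is treated as local and fails; once the stale route is discarded, it is forwarded via the router as expected.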

Please do let me know if I've missed anything.

Thanks!

Cheers,