Failover IPv6 with multiple PA prefixes (Was: IPv6 fc00::/7 - Unique local addresses)

Tim_Franklin · November 2, 2010, 10:51am

>> Your home gateway that talks to your internet connection can either
>> get it via DHCP-PD or static configuration. Either way, it could
>> (should?) be set up to hold the prefix until it gets told something
>> different, possibly even past the advertised valid time.
>
> That breaks the IPv6 spec. Preferred and valid lifetimes are there
> for a reason.

And end-users want things to Just Work. The CPE vendor that finds a hack that lets the LAN carry on working while the WAN goes away and manages to slap the "With Home Network Resilience!" label on the box correctly will presumably do quite nicely out of it.

For this kind of site, I can't see what is *actually* going to break if the CPE keeps sending RAs for the prefix beyond the valid lifetime while the WAN is down. As long as it advertises a short valid lifetime itself, such that if the real prefix changes[0] when the WAN comes back up it can renumber everything on the LAN quickly, it looks a lot like a "Just Works" scenario to me...

Regards,
Tim.

[0] Which it won't, of course, because residential users are going to get proper static connections by default, rather than another round of "business class" price-gouging

Karl_Auer · November 2, 2010, 11:55am

But - preferred and valid lifetimes do *exactly that*. The address is
fully usable up to the end of the preferred lifetime. It is then
deprecated (but not unusable) until the end of the valid lifetime. Only
after the valid lifetime does it become unusable. DHCPv6 lifetimes are
exactly the same as RA lifetimes - and of course there is nothing that
says the RA lifetimes have to be the same as the DHCPv6 lifetimes
(though some sensible relationship would be advisable).
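
The state rules described above can be sketched in a few lines of Python. This is only an illustration, not code from any real stack: the names `AddressState` and `address_state` are made up here, and the lifetimes are assumed to be counted in seconds from the moment they were last set.

```python
from enum import Enum


class AddressState(Enum):
    PREFERRED = "preferred"    # fully usable
    DEPRECATED = "deprecated"  # still usable, but avoided for new sessions
    INVALID = "invalid"        # no longer usable


def address_state(age_s: float, preferred_s: float, valid_s: float) -> AddressState:
    """State of an address age_s seconds after its lifetimes were last set.

    Hypothetical helper; assumes preferred_s <= valid_s.
    """
    if age_s < preferred_s:
        return AddressState.PREFERRED
    if age_s < valid_s:
        return AddressState.DEPRECATED
    return AddressState.INVALID
```

With a one-hour preferred lifetime and a two-hour valid lifetime, for example, an address that was last refreshed 90 minutes ago would come out as deprecated but still usable.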

So loss of connectivity to the upstream is not going to blow away a home
network. It will keep working fine, even if the upstream goes away for a
while. It's up to the upstream to use lifetimes that are a good
compromise between flexibility and stability.

About the only hack I can see that *might* make sense would be that home
CPE does NOT honour the upstream lifetimes if upstream connectivity is
lost, but instead keeps the prefix alive on very short lifetimes until
upstream connectivity returns.
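
A rough sketch of that hack, assuming a CPE that knows whether its WAN link is up. The function name `lan_ra_lifetimes` and the 300-second holdover value are invented for illustration and do not come from any vendor or specification.

```python
def lan_ra_lifetimes(wan_up: bool,
                     upstream_preferred_s: int,
                     upstream_valid_s: int,
                     holdover_s: int = 300) -> tuple[int, int]:
    """Return the (preferred, valid) lifetimes in seconds to advertise on the LAN.

    Hypothetical policy: pass the upstream lifetimes through while the WAN
    is up; while it is down, ignore them and keep re-advertising the prefix
    with short lifetimes so it stays alive until connectivity returns.
    """
    if wan_up:
        return upstream_preferred_s, upstream_valid_s
    # WAN down: keep the prefix alive on short lifetimes (assumed value).
    return holdover_s, holdover_s
```

The CPE would call this each time it builds an RA, so the short lifetimes are refreshed for as long as the outage lasts.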

Regards, K.

Mark_Smith1 · November 2, 2010, 12:53pm

>>> Your home gateway that talks to your internet connection can either
>>> get it via DHCP-PD or static configuration. Either way, it could
>>> (should?) be set up to hold the prefix until it gets told something
>>> different, possibly even past the advertised valid time.
>>
>> That breaks the IPv6 spec. Preferred and valid lifetimes are there
>> for a reason.
>
> And end-users want things to Just Work.

And I want their networks to work so well that I don't even want them
to rely on an ISP's addressing being available, valid, or even having an
ISP - which could easily be the case if they go and sign up for a new
broadband service, bring home brand new CPE, yet don't get ISP service
connectivity for 5 to 10 business days. Surely they should be able to
hook up their internal network and have their TV talking to their
computer or NAS during this period without an Internet service. The ISP
in question may not be prepared to give them a permanent GUA address at
the time of sign up, because the ISP may wish to have static addressing
as a product distinguisher for SOHO/SME products, or have the
flexibility of phasing semi-dynamic addressing in and out over time to
suit their IPv6 address management requirements.

> The CPE vendor that finds a hack that lets the LAN carry on working
> while the WAN goes away and manages to slap the "With Home Network
> Resilience!" label on the box correctly will presumably do quite
> nicely out of it.
>
> For this kind of site, I can't see what is *actually* going to break if the CPE keeps sending RAs for the prefix beyond the valid lifetime while the WAN is down. As long as it advertises a short valid lifetime itself, such that if the real prefix changes[0] when the WAN comes back up it can renumber everything on the LAN quickly, it looks a lot like a "Just Works" scenario to me...

Prefix lifetimes don't work that way - there is no such thing as a
"flash" renumbering. The goal was to be able to phase new
addressing in, transition to it as either older communications
sessions cease (e.g. TCP connections), or new ones are established, then
phase out the old addressing over a more significant time period than
one measured in minutes or seconds.

Regards,
Mark.

Karl_Auer · November 2, 2010, 1:25pm

The lifetimes are reset with every RA the nodes see. If I reconfigure my
router to start sending out RAs every N seconds, it will take a
maximum of N seconds for a new preferred lifetime to be established on
all active nodes on the link. If the new preferred lifetime is zero, any
addresses in the prefix will be deprecated immediately, causing other
prefixes on the link to be preferred.

The new valid lifetime will be the remaining valid lifetime (if less
than two hours), the newly advertised valid lifetime (if using SEND), or
two hours in all other cases.
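
For reference, the valid-lifetime update rule for SLAAC addresses, as I read RFC 4862 section 5.5.3, can be sketched like this. The function name and argument names are made up for illustration, and `authenticated` stands in for "the RA was received via SEND".

```python
TWO_HOURS = 2 * 60 * 60  # seconds


def updated_valid_lifetime(remaining_s: int,
                           advertised_s: int,
                           authenticated: bool = False) -> int:
    """New valid lifetime for an existing SLAAC address after an RA arrives.

    Illustrative sketch only; all values are in seconds.
    """
    # A long or lengthening advertisement is always accepted.
    if advertised_s > TWO_HOURS or advertised_s > remaining_s:
        return advertised_s
    # Less than two hours left: a shorter value is ignored unless the RA
    # was authenticated (e.g. SEND).
    if remaining_s <= TWO_HOURS:
        return advertised_s if authenticated else remaining_s
    # Otherwise an unauthenticated RA can only cut the lifetime to two hours.
    return TWO_HOURS
```

So an unauthenticated RA advertising a valid lifetime of zero against an address with a day left gives `updated_valid_lifetime(86400, 0)`, which is two hours.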

That seems pretty close to "flash renumbering"... at least for SLAAC.
DHCPv6 needs more planning.

Regards, K.

Owen_DeLong · November 2, 2010, 4:03pm

Which is exactly what was being proposed when Tim responded that it
would break the IPv6 spec.

Owen

Mark_Smith1 · November 2, 2010, 9:20pm

>> Prefix lifetimes don't work that way - there is no such thing as a
>> "flash" renumbering.
>
> The lifetimes are reset with every RA the nodes see. If I reconfigure my
> router to start sending out RAs every N seconds, it will take a
> maximum of N seconds for a new preferred lifetime to be established on
> all active nodes on the link. If the new preferred lifetime is zero, any
> addresses in the prefix will be deprecated immediately, causing other
> prefixes on the link to be preferred.
>
> The new valid lifetime will be the remaining valid lifetime (if less
> than two hours), the newly advertised valid lifetime (if using SEND), or
> two hours in all other cases.
>
> That seems pretty close to "flash renumbering"...

I consider "flash renumbering" to mean that addressing can be changed
without disrupting established and ongoing communications sessions e.g.
doesn't break TCP connection or UDP streams.

I know that renumbering without disrupting transport protocols is
fundamentally impossible to achieve regardless of what the IPv6
preferred and valid lifetimes are because transport protocols are using
locators as identifiers. However, the goal should be to make transient
network faults, such as a broadband service link flap, have as minimal
impact as possible. Changing addresses every time that type of fault
occurs makes the consequences higher for transient faults than they
need to be.

I've had a recent experience of this. Some IPv6 CPE I was
testing had a fault where it dropped out and recovered every 2 minutes
- a transient network fault. I was watching a YouTube video over IPv6.
Because of the amount of video buffering that took place, and because
the same IPv6 prefixes were assigned to the connection once it
recovered, the YouTube video kept playing. That was a great end-user
experience and it was somewhat addictive to watch the PPP light
go off and come back on while the video kept playing faultlessly.

Some people argue that applications should be built to deal with this
type of situation. I think that is again asking application developers
to expend effort overcoming what are networking layer faults, as it has
been with NAT. I think problems are best solved where they're caused,
not necessarily where their effects are worst felt. I think it's better
to hide transient network faults from applications than have to make
each and every application include code to deal with them. The time
spent writing that code could be better spent on bug fixing, improving
the application functions themselves, or writing a different one.

Regards,
Mark.

Karl_Auer · November 2, 2010, 10:26pm

Yes it does. But as long as there is no upstream connectivity, it
doesn't matter. Personally I don't think it makes a *lot* of sense, but
it does make some.

Regards, K.

Sven_Olaf_Kamphuis · November 3, 2010, 4:14am

> I've had a recent experience of this. Some IPv6 CPE I was
> testing had a fault where it dropped out and recovered every 2 minutes
> - a transient network fault. I was watching a YouTube video over IPv6.
> Because of the amount of video buffering that took place, and because
> the same IPv6 prefixes were assigned to the connection once it
> recovered, the YouTube video kept playing. That was a great end-user
> experience and it was somewhat addictive to watch the PPP light
> go off and come back on while the video kept playing faultlessly.

That's primarily due to "partial HTTP downloads", a.k.a. HTTP status 206 (Partial Content) rather than 200 (OK), where you can specify the offset in the file at which you want the httpd to start reading the file to you. Most Flash movie players, however, don't support this: connection lost = movie has to be fully reloaded.
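
For anyone unfamiliar with the mechanism, a resumed download is just an ordinary GET with a `Range` header. A minimal Python sketch, where the function name `resume_request` and the example URL are invented for illustration, and no claim is made about what any particular video player actually sends:

```python
import urllib.request


def resume_request(url: str, offset: int) -> urllib.request.Request:
    """Build a GET request asking the server to start at byte `offset`.

    A server that supports range requests answers 206 Partial Content;
    one that does not will typically send the whole file with 200 OK.
    """
    req = urllib.request.Request(url)
    req.add_header("Range", f"bytes={offset}-")
    return req


# Build (but do not send) a request that resumes 1 MiB into a file.
req = resume_request("http://example.com/video.flv", 1024 * 1024)
```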

Mark_Smith1 · November 3, 2010, 8:52am

There's a whole lot of speculation and no evidence in that
statement ... as I said, it was faultless, so I very strongly doubt
there was any restarting the stream.

Owen_DeLong · November 3, 2010, 9:47am

Sounds like we're in violent agreement.

Owen