NIST IPv6 document

NIST has released SP800-119, "Guidelines for the Secure Deployment of
IPv6". While I don't agree with everything in it, it is an excellent
overview of IPv6, differences from IPv4, and security advice. While the
title sounds like a security document, the security implications are
only a part of it.

I've not finished reading it, but my first reaction is that this is a
good source of information. Well written, fairly detailed (at 188 pages)
with lots of references.

The PDF is available at:
http://csrc.nist.gov/publications/nistpubs/800-119/sp800-119.pdf

I notice that this document, in its nearly 200 pages, makes only
casual mention of ARP/NDP table overflow attacks, which may be among
the first real DoS challenges production IPv6 networks, and equipment
vendors, have to resolve. Some platforms have far worse failure modes
than others when subjected to such an attack, and information on this
subject is not widely available.

Unless operators press their vendors for information, and more knobs,
to deal with this problem, we may all be waiting for some group like
"Anonymous" to take advantage of this vulnerability in IPv6 networks
with large /64 subnets configured on LANs; at which point we may all
find ourselves scrambling to request knobs, or worse, redesigning and
renumbering our LANs.

RFC5157 does not touch on this topic at all, and that is the sole
reference I see in the NIST publication to scanning attacks.

I continue to believe that a heck of a lot of folks are missing the
boat on this issue, including some major equipment vendors. It has
been pointed out to me that I should have been more vocal when IPv6
was still called IPng, but in 16 years, there has been nothing done
about this problem other than water-cooler talk. I suspect that will
continue to be the case until those of us who have configured our
networks carefully are having a laugh at the networks who haven't.
However, until that time, it's also been pointed out to me that
customers will expect /64 LANs, and not offering it may put networks
at a competitive disadvantage.

Vendor solutions are needed before scanning IPv6 LANs becomes a
popular way to inconvenience (at best) or disable (at worst) service
providers and their customers.

Dear Jeff,
   In my opinion, the real challenges already present in IPv6 networks are the following: spam and attacks over IPv6; DoS; and tracing back hosts that use privacy-enhanced addresses.
   Do you have some methods in mind to resolve the ARP/ND overflow problem? I think limiting MAC addresses per port on switches is efficient for both IPv4 and IPv6. Equivalents of DHCP snooping and Dynamic ARP Inspection should be implemented by the switch vendors... But remember that DHCP snooping et al. were implemented in IPv4 only after the first serious attacks... Put pressure on your switch vendors...

Janos Mohacsi
Head of HBONE+ project
Network Engineer, Deputy Director of Network Planning and Projects
NIIF/HUNGARNET, HUNGARY
Key 70EF9882: DEC2 C685 1ED4 C95A 145F 4300 6F64 7B00 70EF 9882

I notice that this document, in its nearly 200 pages, makes only casual mention of ARP/NDP table overflow attacks, which may be among
the first real DoS challenges production IPv6 networks, and equipment vendors, have to resolve.

They also only make small mention of DNS- and broadcast-hinted scanning, and none at all of routing-hinted scanning.

It has been pointed out to me that I should have been more vocal when IPv6 was still called IPng, but in 16 years, there has been nothing done
about this problem other than water-cooler talk.

Likewise. I never in my wildest dreams thought that such a bag of hurt, with all the problems of IPv4 *plus* its own inherent problems - in *hex*, no less - would end up being adopted. I was sure that the adults would step in, at some point, and get things back on a more sensible footing.

Obviously, I'm the biggest idiot on the Internet, and have only my own misplaced faith in the IAB/IETF process to blame, heh.

The authors of the document also make only small mention of the dangers of extension header-driven DoS for infrastructure, but at least they mention it, which puts them ahead of most folks in this regard.

They also fail to mention the dangers represented by the consonance of the English letters 'B', 'C', 'D', and 'E'. My guess is that billions of USD in outages, misconfigurations, and avoidable security incidents will result from verbal miscommunication of these letters, yet another reason why adopting a hexadecimal numbering scheme was foolish in the extreme. Ah, well, no use crying over spilt milk.

The document itself is a good tutorial on IPv6, and it's great that the authors did indeed touch upon these security concerns, but the security aspect as a whole is seemingly deliberately understated, which does a disservice to the lay reader. One can only imagine that there were non-technical considerations which came into play.

I meant to include, ' . . . and the strain that this hinted scanning will place on the DNS and routing/switching infrastructure.'

That almost sounds like a conspiracy theory; let me know when it shows
up on Wikileaks. :)

I think it's better to show what is broken and let vendors fix it than
to look the other way.

The only people I know of who are actively and openly working on creating
tests to find and report bugs in IPv6 protocols and software are those
behind the "THC-IPV6" project by "van Hauser".

Here is an old presentation from 2005 from him:

http://media.ccc.de/browse/congress/2005/22C3-772-en-attacking_ipv6.html

http://events.ccc.de/congress/2005/fahrplan/attachments/642-vh_thcipv6_attackccc05presentation.pdf

Most of it is still possible and has not been fixed to this date.

And his site:

http://www.thc.org/thc-ipv6

He did a new presentation at 27c3 in December 2010:

http://events.ccc.de/congress/2010/Fahrplan/events/3957.en.html

A video and slides should show up on the list soon:

http://media.ccc.de/tags/27c3.html

(because of audio transcoding issues some videos are not online right
now, if you ask me nicely I could mail a link for the video from before
they took it down)

Have a nice day,
    Leen Besselink.

That talk is available on Youtube by the official account
http://www.youtube.com/watch?v=c7hq2q4jQYw

Equipment vendors, and most operators, seem to be silent on this
issue, not wishing to admit it is a serious problem, which would seem
to be a required step before addressing it.

Without more knobs on switches or routers, I believe there are only
two possible solutions for production networks today:
1) do not configure /64 LANs, and instead, configure smaller subnets
which will reduce the impact of scanning attacks
This is not desirable, as customers may be driven to expect a /64, or
even believe it is necessary for proper functioning. I brought this
up with a colleague recently, who simply pointed to the RFC and said,
"that's the way you have to do it." Unfortunately, configuring the
network the way the standard says, and accepting the potential DoS
consequences, will likely be less acceptable to customers than not
offering them /64 LAN subnets. This is a foolish position and will
not last long once reality sets in, unless vendors provide more knobs.

2) use link-local addressing on LANs, and static addressing to end
hosts. This prevents a subset of attacks originated from "the
Internet," by making it impossible for NDP to be initiated by scanning
activity; but again, is not what customers will expect. It may have
operational disadvantages with broken user-space software, is not easy
for customers to configure, and does not permit easy porting of
addresses among host machines. It requires much greater configuration
effort, and is likely not possible by way of DHCP. It also does not solve
NDP table overflow attacks initiated by a compromised host on the LAN,
which makes it a half-way solution.

The knobs/features required to somewhat-mitigate the impact of an NDP
table overflow attack are, at minimum:
* keep NDP/ARP entry alive based on normal traffic flow, do not expire
a host that is exchanging traffic
  + this is not the case with some major platforms, it surprised me to
learn who does not do this
  + may require data plane changes on some boxes to inform control
plane of on-going traffic from active addresses
* have configurable per-interface limits for NDP/ARP resource
consumption, to prevent attack on one interface/LAN from impacting all
interfaces on a router
  + basically no one has this capability
  + typically requires only control plane modifications
* have configurable minimum per-interface NDP/ARP resource reservation
  + typically requires only control plane modifications
* have per-interface policer for NDP/ARP traffic to prevent control
plane from becoming overwhelmed
  + because huge subnets may increase the frequency of scanning
attacks, and breaking one interface by reaching a policer limit is
much better than breaking the whole box if it runs out of CPU, or
breaking NDP/ARP function on the whole box if whole-box policer is
reached
* learn new ARP/NDP entry when new transit traffic comes from a host on the LAN
  + even if NDP function is impaired on the LAN due to on-going scan attack
  + again, per-interface limitations must be honored to protect whole
box from breaking from one misconfigured / malicious LAN host
* have sane defaults for all and allow all to be modified as-needed
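
As a rough illustration of the per-interface limit described in the list above, here is a minimal sketch in Python. It is not any vendor's implementation; the class name, the method names, the interface names "lan1" and "lan2", and the limits are all hypothetical, chosen only to show how a per-interface cap keeps a scan against one LAN from exhausting a shared table.

```python
from collections import OrderedDict


class NeighborCache:
    """Toy neighbor cache with a per-interface cap and a whole-box cap.

    Hypothetical model for illustration only; real routers keep this
    state in platform-specific control-plane and hardware tables.
    """

    def __init__(self, box_limit, per_interface_limit):
        self.box_limit = box_limit
        self.per_interface_limit = per_interface_limit
        self.entries = {}  # interface name -> OrderedDict(address -> state)

    def total(self):
        """Number of entries held across all interfaces."""
        return sum(len(table) for table in self.entries.values())

    def learn(self, interface, address, state="INCOMPLETE"):
        """Store or update an entry; return False if a limit refuses it."""
        table = self.entries.setdefault(interface, OrderedDict())
        if address in table:
            table[address] = state  # updating never consumes a new slot
            return True
        if len(table) >= self.per_interface_limit:
            return False  # only this interface is affected
        if self.total() >= self.box_limit:
            return False  # whole-box exhaustion, the case to avoid
        table[address] = state
        return True


cache = NeighborCache(box_limit=1000, per_interface_limit=100)
# Simulate a scan that touches 500 distinct addresses on one LAN.
accepted = sum(cache.learn("lan1", f"2001:db8:1::{i:x}") for i in range(500))
# A neighbor on a different LAN can still be learned afterwards.
other_lan_ok = cache.learn("lan2", "2001:db8:2::1", "REACHABLE")
```

With these assumed limits, the scan fills only the 100 slots reserved for "lan1", and "lan2" keeps working. A minimum per-interface reservation could be modeled the same way by subtracting reserved slots from the shared pool.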

I am sure we can all agree that, as IPv6 deployment increases, many
unimagined security issues may appear and be dealt with. This is one
that a lot of smart people agree is a serious design flaw in any IPv6
network where /64 LANs are used, and yet, vendors are not doing
anything about it. If customers don't express this concern to them,
they won't do anything about it until it becomes a popular DoS attack.

In addition, if you design your network around /64 LANs, and
especially if you take misguided security-by-obscurity advice and
randomize your host addresses so they can't be found in a practical
time by scanning, you may have a very difficult time if the ultimate
solution to this must be to change the typical subnet size from /64 to
something that can fit within a practical NDP/ARP table.
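
To put a number on "a practical time," here is a small Python sketch of the arithmetic. The probe rate of one million packets per second is an assumption picked for illustration, not a figure from this thread, and the function name is hypothetical.

```python
SECONDS_PER_YEAR = 365 * 24 * 60 * 60


def years_to_scan(prefix_length, probes_per_second):
    """Years needed to probe every address in an IPv6 subnet once."""
    addresses = 2 ** (128 - prefix_length)
    return addresses / probes_per_second / SECONDS_PER_YEAR


# Assumed probe rate: 1,000,000 packets per second.
years_for_slash_64 = years_to_scan(64, 1_000_000)
```

At that assumed rate an exhaustive sweep of a /64 takes on the order of half a million years, which is why randomized host addresses are hard to find by scanning. The same sparseness is what lets a scan generate far more neighbor lookups than any table can hold.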

Deploying /64 networks because customers demand it and your
competitors are doing it is understandable. Doing it "because it's
the standard" is very stupid. Anyone doing that on, for example,
SONET interfaces or point-to-point Ethernet VLANs in their
infrastructure, is making a bad mistake. Doing it toward CE routers,
the sort that have an IPv4 /30, is even more foolish; and many major
ISPs already know this and are using small subnets such as /126 or
/124.
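
For a sense of scale, the subnet sizes mentioned above can be compared with Python's standard ipaddress module. The 2001:db8:: documentation prefix and the helper name are only stand-ins for illustration.

```python
import ipaddress


def subnet_size(prefix_length):
    """Number of addresses in an IPv6 subnet of the given prefix length."""
    network = ipaddress.ip_network(f"2001:db8::/{prefix_length}")
    return network.num_addresses


for length in (126, 124, 64):
    print(f"/{length}: {subnet_size(length)} addresses")
```

A /126 holds 4 addresses and a /124 holds 16, so neither can produce more neighbor entries than a router can comfortably store, while a /64 holds 2**64.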

If you are still reading, but do not have any idea what I'm talking
about, ask yourself these questions:
1) do I know what happens when my router's ARP table gets 100% full?
2) do I know what happens to my ARP/NDP functionality if my router
receives a 20k PPS random scan towards an attached IPv6 subnet? will
it eat all my CPU and drop my BGP, or just make it impossible to learn
new ARP/NDP entries? will it eventually allow old entries to expire
such that they perhaps cannot be re-learnt?
3) am I deploying IPv6 in a way that is vulnerable to a trivial attack method?
4) will my network design need fundamental change if my equipment
vendor does not add necessary knobs?

This is a very serious problem which our industry is actively
ignoring, hoping it will just go away. If you are in the group who
believes it is a non-issue, I urge you to take your head out of the
sand. If you are waiting for your vendor to add more knobs or come up
with a magic solution, stop clapping your hands and saying "I believe
in fairies," and express your concern to your vendor sales channel.
If you are a black-hat script kiddie, please go ahead and start
scanning attacks now, while IPv6 largely does not matter and
dual-stack infrastructure is somewhat limited (although there will be
some spill-over to IPv4 on dual-stack boxes) to motivate change.

Finally, if you operate a major IXP with a /64 peering LAN, please
explain why this is in any way better than operating the same LAN with
a subnet similar in size to its existing IPv4 subnets, e.g. a /120.

Using /64s is insane because a) it's unnecessarily wasteful (no lectures on how large the space is, I know, and reject that argument out of hand) and b) it turns the routers/switches into sinkholes.

customers may be driven to expect a /64, or
even believe it is necessary for proper functioning.

RFC 3513 says:

   For all unicast addresses, except those that start with binary value
   000, Interface IDs are required to be 64 bits long and to be
   constructed in Modified EUI-64 format.

Nobody has been able or willing to tell why that's in there, though.
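
For readers who have not seen it, the Modified EUI-64 construction that the quoted RFC text refers to can be sketched as follows: take the 48-bit MAC address, invert the universal/local bit, and insert ff:fe in the middle. The function name and the example MAC address are made up for illustration.

```python
def modified_eui64(mac):
    """Build a Modified EUI-64 interface ID from a 48-bit MAC address."""
    octets = bytearray(int(part, 16) for part in mac.split(":"))
    if len(octets) != 6:
        raise ValueError("expected a 48-bit MAC such as 00:1b:21:3c:4d:5e")
    octets[0] ^= 0x02  # invert the universal/local bit
    eui = octets[:3] + b"\xff\xfe" + octets[3:]  # insert ff:fe in the middle
    # Render the 8 bytes as four 16-bit hex groups, IPv6 style.
    return ":".join(f"{(eui[i] << 8) | eui[i + 1]:x}" for i in range(0, 8, 2))


print(modified_eui64("00:1b:21:3c:4d:5e"))  # prints 21b:21ff:fe3c:4d5e
```

The resulting 64 bits are appended to the /64 prefix, which is why stateless autoconfiguration assumes that prefix length.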

All the same, beware of the anycast addresses if you want to use a smaller block for point-to-point links. And for LANs, you break stateless autoconfig and very likely terminally confuse DHCPv6 if your prefix length isn't /64.

This is one
that a lot of smart people agree is a serious design flaw in any IPv6
network where /64 LANs are used

It's not a design flaw, it's an implementation flaw. The same one that's in ARP (or maybe RFC 894 wasn't published on April first by accident after all). And the internet managed to survive.

A (relatively) easy way to avoid this problem is to either use a stateful firewall that only allows internally initiated sessions, or a filter that lists only addresses that are known to be in use.

and yet, vendors are not doing anything about it.

Then don't give them your business.

And maybe a nice demonstration on stage at a NANOG meeting will help a bit?

In addition, if you design your network around /64 LANs, and
especially if you take misguided security-by-obscurity advice and
randomize your host addresses so they can't be found in a practical
time by scanning, you may have a very difficult time if the ultimate
solution to this must be to change the typical subnet size from /64 to
something that can fit within a practical NDP/ARP table.

Sparse subnets in IPv6 are a feature, not a bug. They're not going to go away.

Except someone was kind enough to develop a protocol that requires /64 to work. So then there is the SLAAC question. When might it be used?

With routers, I usually don't use SLAAC. The exception is end user networks, which makes using SLAAC + DHCPv6-PD extremely dangerous for my edge routers. DHCPv6 IA_TA + DHCPv6-PD would be more sane, predictable, and filterable (and support longer than /64), though my current edge layout can't support this (darn legacy IOS).

I would love a dynamic renumbering scheme for routers, but until all routing protocols (especially iBGP) support shifting from one prefix to the next without a problem, it's a lost cause and manual renumbering is still required. Things like abstracting the router id from the transport protocol would be nice. I could be wrong, but I think ISIS is about it for protocols that won't complain.

All that said, routers should be /126 or similar for links, with special circumstances and layouts for customer edge.

For server subnets, I actually prefer leaving it /64 and using SLAAC with token assignments. This is easily mitigated with ACLs to filter any packets that don't fall within the range I generally use for the tokens, with localized exceptions for non-token devices which haven't been fully initialized yet (i.e., stay behind a stateful firewall until I've changed my IP to prefix::0-2FF). I haven't tried it, and I highly suspect it would fail, but it would be nice to use SLAAC with prefixes longer than /64.

Jack
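
The token-range filter described above (only prefix::0 through prefix::2FF is allowed to reach the servers) can be expressed as a simple membership test. This is only a sketch of the matching logic, not router ACL syntax; the function name and the example prefix are hypothetical.

```python
import ipaddress


def in_token_range(address, prefix, low=0x0, high=0x2FF):
    """True if address is inside prefix and its host part is in [low, high]."""
    network = ipaddress.ip_network(prefix)
    addr = ipaddress.ip_address(address)
    if addr not in network:
        return False
    # Keep only the bits to the right of the prefix (the token).
    host_part = int(addr) & int(network.hostmask)
    return low <= host_part <= high


SERVER_LAN = "2001:db8:0:1::/64"  # assumed example prefix
print(in_token_range("2001:db8:0:1::2a", SERVER_LAN))                 # True
print(in_token_range("2001:db8:0:1:21b:21ff:fe3c:4d5e", SERVER_LAN))  # False
```

Because scan traffic aimed at random addresses in the /64 almost never lands in the 768-address token range, it is dropped before it can trigger neighbor discovery.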

that a lot of smart people agree is a serious design flaw in any IPv6
network where /64 LANs are used

It's not a design flaw, it's an implementation flaw. The same one that's in ARP (or maybe RFC 894 wasn't published on April first by accident after all). And the internet managed to survive.

It appears you want to have a semantic argument. I could grant that,
and every point in my message would still stand. However, given that
the necessary knobs to protect the network with /64 LANs do not exist
on any platform today, vendors are not talking about whether or not
they may in the future, and that no implementation with /64 LANs
connected to the Internet, or any other routed network which may have
malicious or compromised hosts, can be protected today, "design flaw"
is correct.

This is a much smaller issue with IPv4 ARP, because routers generally
have very generous hardware ARP tables in comparison to the typical
size of an IPv4 subnet. You seem to think the issue is generating NDP
NS. While that is a part of the problem, even if a router can
generate NS at an unlimited rate (say, by implementing it in hardware)
it cannot store an unlimited number of entries. The failure modes of
routers that have a full ARP or NDP table obviously vary, but it is
never a good thing. In addition, the high-rate NS inquiries will be
received by some or all of the hosts on the LAN, consuming their
resources and potentially congesting the LAN. Further, if the
router's NDP implementation depends on tracking the status of
"incomplete" on-going inquiries, the available resource for this can
very easily be used up, preventing the router from learning about new
neighbors (or worse.) If it does not depend on that, and blindly
learns any entry heard from the LAN, then its NDP table can be totally
filled by any compromised / malicious host on the LAN, again, breaking
the router. Either way is bad.

This is a fundamentally different and much larger problem than those
experienced with ARP precisely because the typical subnet size is now,
quite literally, seventy-quadrillion times as large as the typical
IPv4 subnet.
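
The "seventy-quadrillion" figure can be checked with a few lines of Python, assuming a /24 as the typical IPv4 subnet (the variable names are illustrative only):

```python
# Addresses in a /64 IPv6 LAN versus an assumed typical /24 IPv4 LAN.
ipv6_lan_addresses = 2 ** (128 - 64)
ipv4_lan_addresses = 2 ** (32 - 24)

ratio = ipv6_lan_addresses // ipv4_lan_addresses
print(ratio)  # prints 72057594037927936, i.e. 2**56, about 7.2e16
```

With a different assumption for the typical IPv4 subnet size, the ratio changes by the corresponding power of two.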

A (relatively) easy way to avoid this problem is to either use a stateful firewall that only allows internally initiated sessions, or a filter that lists only addresses that are known to be in use.

It would certainly be nice to have a stateful firewall on every single
LAN connection. Were there high-speed, stateful firewalls in 1994?
Perhaps the IPng folks had this solution in mind, but left it out of
the standards process. No doubt they all own stock in SonicWall and
are eagerly awaiting the day when "Anonymous" takes down a major ISP
every day with a simple attack that has been known to exist, but not
addressed, for many years.

You must also realize that the stateful firewall has the same problems
as the router. It must include a list of allocated IPv6 addresses on
each subnet in order to be able to ignore other traffic. While this
can certainly be accomplished, it would be much easier to simply list
those addresses in the router, which would avoid the expense of any
product typically called a "stateful firewall." In either case, you
are now maintaining a list of valid addresses for every subnet on the
router, and disabling NDP for any other addresses. I agree with you,
this knob should be offered by vendors in addition to my list of
possible vendor solutions.

that a lot of smart people agree is a serious design flaw in any IPv6
network where /64 LANs are used

It's not a design flaw, it's an implementation flaw. The same one that's in ARP (or maybe RFC 894 wasn't published on April first by accident after all). And the internet managed to survive.

It appears you want to have a semantic argument. I could grant that,
and every point in my message would still stand. However, given that
the necessary knobs to protect the network with /64 LANs do not exist
on any platform today, vendors are not talking about whether or not
they may in the future, and that no implementation with /64 LANs
connected to the Internet, or any other routed network which may have
malicious or compromised hosts, can be protected today, "design flaw"
is correct.

This is a much smaller issue with IPv4 ARP, because routers generally
have very generous hardware ARP tables in comparison to the typical
size of an IPv4 subnet.

No, it isn't. If you've ever had your Juniper router become unavailable
because the ARP policer caused it to start ignoring updates, or seen
systems become unavailable due to an ARP storm, you'd know that you can
abuse ARP on a rather small subnet.

These conditions can only be triggered by malicious hosts on the LAN.
With IPv6, it can be triggered by scanning attacks originated from
"the Internet." No misconfiguration or compromised machine on your
network is necessary.

This is why it is a fundamentally different, and much larger, problem.
Since you seem confused about the basic nature of this issue, I will
explain it for you in more detail:

IPv4) I can scan your v4 subnet, let's say it's a /24, and your router
might send 250 ARP requests and may even add 250 "incomplete" entries
to its ARP table. This is not a disaster for that LAN, or any others.
No big deal. I can also intentionally send a large amount of traffic
to unused v4 IPs on the LAN, which will be handled as unknown-unicast
and sent to all hosts on the LAN via broadcasting, but many boxes
already have knobs for this, as do many switches. Not good, but also
does not affect any other interfaces on the router.

IPv6) I can scan your v6 /64 subnet, and your router will have to send
out NDP NS for every host I scan. If it requires "incomplete" entries
in its table, I will use them all up, and NDP learning will be broken.
Typically, this breaks not just on that interface, but on the entire
router. This is much worse than the v4/ARP situation.

I trust you will understand the depth of this problem once you realize
that no device has enough memory to prevent these attacks without
knobs that make various compromises available via configuration.

Jeff Wheeler (jsw) writes:

IPv4)

  [...]

Not good, but also does not affect any other interfaces on the router.

  You're assuming that all routing devices have per-interface ARP tables.

IPv6)
Typically, this breaks not just on that interface, but on the entire
router. This is much worse than the v4/ARP situation.

  Inverse assumption here.

  It doesn't change much about the scenario you've put forward
  as the cause of the problem, but I still wanted to point it out.

  Cheers,
  Phil

I haven't checked of late for v6, but I'd expect the same NDP security we have for ARP these days, which reduces the need to even send unsolicited ND requests.

In this day and age, sending unsolicited neighbor requests from a router seems terribly broken. Even with SLAAC, one could quickly design a model that doesn't require unsolicited ND from the router to find the remote computer. This could possibly utilize DAD checks or even await the first packet from the node (similar to how we fill our MAC forwarding tables in switches, and not all switches will broadcast when a MAC is unknown).

Jack

IPv6) I can scan your v6 /64 subnet, and your router will have to send
out NDP NS for every host I scan. If it requires "incomplete" entries
in its table, I will use them all up, and NDP learning will be broken.
Typically, this breaks not just on that interface, but on the entire
router. This is much worse than the v4/ARP situation.

I'm guessing you're referring to this paragraph of RFC 4861:
"
   When a node has a unicast packet to send to a neighbor, but does not
   know the neighbor's link-layer address, it performs address
   resolution. For multicast-capable interfaces, this entails creating
   a Neighbor Cache entry in the INCOMPLETE state and transmitting a
   Neighbor Solicitation message targeted at the neighbor. The
   solicitation is sent to the solicited-node multicast address
   corresponding to the target address.
"
<http://tools.ietf.org/html/rfc4861#section-7.2.2>
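
For reference, the solicited-node multicast address mentioned in the quoted paragraph is formed by appending the low-order 24 bits of the target address to the prefix ff02::1:ff00:0/104. A small Python sketch follows; the function name and the example target address are made up for illustration.

```python
import ipaddress

SOLICITED_NODE_BASE = int(ipaddress.IPv6Address("ff02::1:ff00:0"))


def solicited_node_multicast(target):
    """Return the solicited-node multicast group for a target address."""
    low_24_bits = int(ipaddress.IPv6Address(target)) & 0xFFFFFF
    return ipaddress.IPv6Address(SOLICITED_NODE_BASE | low_24_bits)


print(solicited_node_multicast("2001:db8::21b:21ff:fe3c:4d5e"))
# prints ff02::1:ff3c:4d5e
```

Every scanned address that is not already in the neighbor cache causes one such solicitation, so a random scan spreads its solicitations across many multicast groups on the LAN.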

It's worth noting that nothing in this paragraph is normative (there's
no RFC 2119 language), so implementations are free to ignore it. I
haven't read the NIST document, but it wouldn't conflict with the RFC
if they recommended ignoring this paragraph and just relying on the ND
cache they already have when a packet arrives.

--Richard

No, Phil, I am assuming that the routing device has a larger ARP table
than 250 entries. To be more correct, I am assuming that the routing
device has a large enough ARP table that any one subnet could go from
0 ARP entries to 100% ARP entries without using up all the remaining
ARP resources on the box. This is usually true. Further, routing
devices usually have enough ARP table space that every subnet attached
to them could be 100% full of active ARP entries without using up all
the ARP resources. This is also often true.

To give some figures, a Cisco 3750 "pizza box" layer-3 switch has room
for several thousand ARP entries, and I have several with 3000 - 5000
active ARPs. Most people probably do not have more than a /20 worth
of subnets for ARPing to a pizza box switch like this, but it does
basically work.

As we all know, a /64 is a lot more than a few thousand possible
addresses. It is more addresses than anyone can store in memory, so
state information for "incomplete" can't be tracked in software
without creating problems there. Being fully stateless for new
neighbor learning is possible and desirable, but a malicious host on
the LAN can badly impact the router. This is why per-interface knobs
are badly needed. The largest current routing devices have room for
about 100,000 ARP/NDP entries, which can be used up in a fraction of a
second with a gigabit of malicious traffic flow. What happens after
that is the problem, and we need to tell our vendors what knobs we
want so we can "choose our own failure mode" and limit damage to one
interface/LAN.
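
The "fraction of a second" estimate can be reproduced with simple arithmetic. The sketch below assumes minimum-size 64-byte Ethernet frames plus 20 bytes of preamble and inter-frame gap per packet, and assumes every packet targets a new address; the names are illustrative only.

```python
def seconds_to_fill(table_entries, packets_per_second):
    """Time for a scan to create table_entries distinct neighbor lookups."""
    return table_entries / packets_per_second


# Assumption: 64-byte frames plus 20 bytes of preamble and inter-frame gap.
WIRE_BITS_PER_MIN_FRAME = (64 + 20) * 8
gigabit_pps = 1_000_000_000 / WIRE_BITS_PER_MIN_FRAME  # about 1.49 million

fill_time = seconds_to_fill(100_000, gigabit_pps)
print(f"{fill_time:.3f} seconds")  # prints 0.067 seconds
```

By the same arithmetic, the 20k PPS scan mentioned earlier in the thread would take 5 seconds to generate 100,000 lookups.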

Jeff Wheeler (jsw) writes:

are badly needed. The largest current routing devices have room for
about 100,000 ARP/NDP entries, which can be used up in a fraction of a
second with a gigabit of malicious traffic flow. What happens after
that is the problem, and we need to tell our vendors what knobs we
want so we can "choose our own failure mode" and limit damage to one
interface/LAN.

  Well there are *some* knobs:

  http://www.cisco.com/en/US/docs/ios/ipv6/configuration/guide/ip6-addrg_bsc_con.html#wp1369018

  Not very smart, as it just controls how fast you run out of entries.

  I haven't read all entries in this thread yet, but I wonder if
  http://tools.ietf.org/html/draft-jiang-v6ops-nc-protection-01 has been
  mentioned ?

  It also seems that this topic was brought up here a year ago, give
  or take a couple of weeks:

  http://www.mail-archive.com/nanog@nanog.org/msg18841.html

  Cheers,
  Phil

IPv4) I can scan your v4 subnet, let's say it's a /24, and your router
might send 250 ARP requests and may even add 250 "incomplete" entries
to its ARP table. This is not a disaster for that LAN, or any others.
No big deal. I can also intentionally send a large amount of traffic
to unused v4 IPs on the LAN, which will be handled as unknown-unicast
and sent to all hosts on the LAN via broadcasting, but many boxes
already have knobs for this, as do many switches. Not good, but also
does not affect any other interfaces on the router.

IPv6) I can scan your v6 /64 subnet, and your router will have to send
out NDP NS for every host I scan. If it requires "incomplete" entries
in its table, I will use them all up, and NDP learning will be broken.
Typically, this breaks not just on that interface, but on the entire
router. This is much worse than the v4/ARP situation.

Many would argue that the version of IP is irrelevant: if you are permitting
external hosts to scan your internal network in an unrestricted fashion (no
stateful filtering or rate limiting), you have already lost; you just might
not know it yet.

Even granting that, for the sake of argument - it seems like it would not be
hard for $vendor to have some sort of "emergency garbage collection"
routines within their NDP implementations ... ?

/TJ
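
One way to picture the "emergency garbage collection" idea is a cache that, when full, sacrifices its oldest unresolved (INCOMPLETE) entry to make room for a confirmed neighbor, and refuses new unresolved entries outright. This is a hypothetical sketch of such a policy in Python, not a description of any shipping NDP implementation; the class and method names are invented.

```python
from collections import OrderedDict


class GcNeighborCache:
    """Toy neighbor cache that evicts INCOMPLETE entries under pressure."""

    def __init__(self, limit):
        self.limit = limit
        self.entries = OrderedDict()  # address -> state, oldest first

    def _evict_one_incomplete(self):
        """Drop the oldest INCOMPLETE entry; return False if none exists."""
        for address, state in self.entries.items():
            if state == "INCOMPLETE":
                del self.entries[address]
                return True  # return at once, so iteration never resumes
        return False

    def add(self, address, state):
        """Store or update an entry; return False if it had to be refused."""
        if address in self.entries:
            self.entries[address] = state
            return True
        if len(self.entries) >= self.limit:
            # Under pressure, never admit new unresolved entries, and admit
            # a confirmed neighbor only if an unresolved one can be dropped.
            if state == "INCOMPLETE" or not self._evict_one_incomplete():
                return False
        self.entries[address] = state
        return True
```

Under this policy a scan can still fill the table with unresolved entries, but it cannot stop hosts that actually answer from being learned. Whether that is the right failure mode is exactly the kind of choice a configuration knob would expose.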