Dyn DDoS this AM?

amen.

anyone who relies on a single dns provider is just asking for stuff
such as this.

part of the problem is that we think of it as attack surface when, in
fact, it usually has more than two dimensions.

randy

> anyone who relies on a single dns provider is just asking for stuff such
> as this.

> I'd love to hear how others are handling the overhead of managing two dns
> providers.

good question. staying in-band, hidden primary comes to mind. but i am
sure clever minds can come up with more clever schemes.

randy

Patrick W. Gilmore wrote:

> Our biggest problem is people thinking they cannot or do not want to
> help.

Our biggest problem is that if the Internet community does not handle
problems like this, governments and regulators may decide to intervene.
If they do this in the wrong way, it will turn one major headache into two.

Nick

with the usual caveats - and I don't have any projects that currently need
this but have in the past - pretty much every major dns provider allows you
to ship them a full zone in some form or fashion. The effort to pull and
ship a zone should be fairly minimal in and of itself.
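
(rough sketch of the pull-and-ship half, assuming dnspython, a primary that
permits AXFR, and a provider that accepts a zone file - the zone name,
address and TSIG key below are made up:)

    # requires dnspython; all names, addresses and key material are illustrative
    import dns.query
    import dns.tsigkeyring
    import dns.zone

    ZONE = "example.com"
    HIDDEN_PRIMARY = "192.0.2.53"   # hypothetical address of the zone's primary

    # hypothetical TSIG key shared with the primary for transfers
    keyring = dns.tsigkeyring.from_text({"xfer-key": "bWFkZSB1cCBrZXkgbWF0ZXJpYWw="})

    # pull the full zone over AXFR...
    xfr = dns.query.xfr(HIDDEN_PRIMARY, ZONE, keyring=keyring, keyname="xfer-key")
    zone = dns.zone.from_xfr(xfr)

    # ...and write it out as a master file you can ship to a second provider
    zone.to_file(ZONE + ".zone")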

mixing your public zone providers in your authoritative NS records is also
easy - and, depending on your registrar of choice, it should be easy to manage
changing those (including having non-public mirrors maintained that you can
switch to..). setting TTLs that make sense for a design that supports
change is also easy.
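
(checking that the published delegation actually lists both providers is
cheap to automate too - sketch below assumes dnspython, with made-up
provider domains:)

    # requires dnspython; provider name fragments are illustrative only
    import dns.resolver

    ZONE = "example.com"
    PROVIDERS = ["provider-a.example", "provider-b.example"]   # hypothetical

    answer = dns.resolver.resolve(ZONE, "NS")   # dnspython >= 2.0; older releases use query()
    nameservers = {rr.target.to_text().rstrip(".") for rr in answer}

    for provider in PROVIDERS:
        found = any(ns.endswith(provider) for ns in nameservers)
        print(provider, "present" if found else "MISSING", "in", sorted(nameservers))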

the real developmental and architectural challenges are around what to do
if the APIs you use to talk to your "primary" disappear and you still need
what they provided (creating new host entries, updating loadbalancer pools,
whatever - we all have different and sometimes very diverse use cases for
dns).

one approach - as randy suggested - is to switch to a purely hidden and
self-managed primary - which might mean running your own API stack in front
of it to control whatever you need to control and change. this doesn't
need to be a "real" dns server in today's world - the days of BIND style
zone transfers are generally long gone anyway when you hit these scales and
levels of infra complexity. then your zone-replication components that
ship zone updates to your various external providers are shipping from the
same place.
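
(very rough sketch of what such a fan-out replication component could look
like - the adapter classes and their push() method are entirely made up,
since every real provider API differs:)

    # all provider adapters here are hypothetical stand-ins; real ones would
    # call each provider's own API (and handle auth, rate limits, etc.)
    from dataclasses import dataclass

    @dataclass(frozen=True)
    class Record:
        name: str    # e.g. "www.example.com."
        rtype: str   # e.g. "A", "CNAME"
        ttl: int
        value: str

    class ProviderClient:
        """interface each provider-specific adapter implements."""
        def push(self, zone: str, records: list) -> None:
            raise NotImplementedError

    class LoggingProvider(ProviderClient):
        """stand-in adapter; a real one would talk to the provider's API here."""
        def __init__(self, label: str):
            self.label = label

        def push(self, zone: str, records: list) -> None:
            print(f"[{self.label}] would publish {len(records)} records for {zone}")

    def replicate(zone: str, records: list, providers: list) -> None:
        # ship the same view of the zone to every external provider so they
        # all converge on what the hidden primary says; one failure is not fatal
        failures = 0
        for provider in providers:
            try:
                provider.push(zone, records)
            except Exception:
                failures += 1
        if failures:
            print(f"{failures} provider push(es) failed; zone still served by the rest")

    if __name__ == "__main__":
        desired = [Record("www.example.com.", "A", 300, "192.0.2.10")]
        replicate("example.com", desired,
                  [LoggingProvider("provider-a"), LoggingProvider("provider-b")])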

at least in that case it's fully within your control - but dev time and
complexity definitely come into play.

if your infra can survive internally without dns change control for the
extent of an outage, that could be much easier to manage.

anyway, random and incomplete thoughts - time ran out, work calls.

...david

Not all the ones you might choose based on scale support axfr... That's
a bit of a problem for the most traditional approach to this. Of those
that do, it's straightforward to use one as the master for another, or
use a hidden master. Your own master may have demonstrably lower
availability than one or the other of your providers. Getting two well-
considered choices to play nice with each other isn't that hard.

> anyone who relies on a single dns provider is just asking for stuff such as this.

> I'd love to hear how others are handling the overhead of managing two dns providers.

* randy@psg.com (Randy Bush) [Sat 22 Oct 2016, 00:28 CEST]:

> good question. staying in-band, hidden primary comes to mind. but i am sure clever minds can come up with more clever schemes.

The point of outsourcing DNS isn't just availability of static hostnames, it's the added services delivered, like returning different answers based on the source of the question, or even monitoring your infrastructure (or it reporting load into the DNS management system).

That is very hard to replicate with two DNS providers.

  -- Niels.

> anyone who relies on a single dns provider is just asking for stuff such
> as this.
>
> randy

I'd love to hear how others are handling the overhead of managing two dns
providers. Every time we brainstorm on it, we see it as a black hole of eng
effort WRT keeping them in sync and then waiting for TTLs to cut an
entire delegation over.
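
For what it's worth, the "in sync" half can at least be verified mechanically.
A rough sketch, assuming dnspython and made-up addresses for each provider's
authoritative servers:

    # requires dnspython; server addresses and names are illustrative only
    import dns.message
    import dns.query

    PROVIDER_SERVERS = {
        "provider-a": "192.0.2.1",     # hypothetical
        "provider-b": "198.51.100.1",  # hypothetical
    }
    CHECKS = [("www.example.com.", "A"), ("example.com.", "MX")]

    for name, rtype in CHECKS:
        seen = {}
        for label, server in PROVIDER_SERVERS.items():
            query = dns.message.make_query(name, rtype)
            response = dns.query.udp(query, server, timeout=3)
            seen[label] = sorted(rdata.to_text() for rrset in response.answer for rdata in rrset)
        in_sync = len({tuple(v) for v in seen.values()}) == 1
        print(name, rtype, "in sync" if in_sync else "OUT OF SYNC", seen)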

The fault is giving up the primary for an API connection. Sure, it is
tempting. We do, however, need to push the "application-integrated"
DNS vendors harder. They need to give their customers more choice in
how the DNS is populated.

They also very much need to let people with above-mentioned
"application-integrated" needs add third party DNS providers in the mix.
This diversity capability is what makes DNS resilient. Monocultures have
suboptimal survivability in the long run.

Adding DNS providers when you control the primary is completely
painless. With EDNS0 there's lots of room for insanely large NS RRSETs.
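
A rough sketch of how one might eyeball that with dnspython; the zone and
server below are placeholders:

    # requires dnspython; zone and server are placeholders
    import dns.flags
    import dns.message
    import dns.query

    ZONE = "example.com."
    SERVER = "192.0.2.53"   # hypothetical authoritative server

    plain = dns.message.make_query(ZONE, "NS")                       # classic 512-byte UDP limit
    edns = dns.message.make_query(ZONE, "NS", use_edns=0, payload=4096)

    for label, query in (("plain", plain), ("edns0", edns)):
        response = dns.query.udp(query, SERVER, timeout=3)
        truncated = bool(response.flags & dns.flags.TC)
        print(f"{label}: {len(response.to_wire())} bytes on the wire, truncated={truncated}")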

Also, do not fall in the "short TTL for service agility" trap.

Besides, what Randy wrote.

> The point of outsourcing DNS isn't just availability of static hostnames,
> it's the added services delivered, like returning different answers based on
> the source of the question, or even monitoring your infrastructure (or it
> reporting load into the DNS management system).
>
> That is very hard to replicate with two DNS providers.

Surely, it must be better to use a singular service that is provably
easy to take out. The advantages are overwhelming.

I don't have a horse in this race, and haven't used it in anger, but Netflix released denominator to attempt to deal with some of these issues:

https://github.com/Netflix/denominator

Their goal is to support the highest common denominator of features among the supported providers.

Maybe that helps someone.

Keenan

Ansible would be a decent start.

* mansaxel@besserwisser.org (Måns Nilsson) [Sat 22 Oct 2016, 01:27 CEST]:
>Also, do not fall in the "short TTL for service agility" trap.

Several CDNs, Akamai among them, do use short TTLs for this exact reason.
Server load is constantly monitored and taken into account when crafting DNS
replies.

But the problem is that this trashes caching, and DNS does not work
without caches. At least not if you want it to survive when the going
gets tough.

If we're going to solve this we need to innovate beyond the pathetic
CNAME chains that today's managed DNS services make us use, and get truly
distributed load-balancing decision-making (which will only work if you
give it sensible data; a single CNAME is not sensible data) all the way
out in the client application.

Ah, disregard. I see what you're saying now.

Yes, I can see how that would be problematic.

Given the scale of these attacks, whether having two providers does any
good may be a crap shoot.

That is, what if the target happens to share the same providers you do?
Given the whole asymmetry of resources that makes this a problem in the
first place, the attackers probably have the resources to take out multiple
providers.

Having multiple providers may reduce your chance of being collateral damage
(and I'd also still worry more about the more mundane risks of a single
provider - maintenance or upgrade gone bad, business risks, etc. - than these
sensational ones), but multiple providers likely won't save you if you are
the actual target of the attack.

Good, perfect, enemy, etc.

How many sites were down today? How many were the intended target?

     -- Brett

Cuts both ways. Had Twitter had TTLs of, say, 7 days, the vast majority
wouldn't have noticed an outage of a few hours because their local cache was
still valid.

It does prevent one from reacting quickly to emergencies.

In practice TTLs tend to be ignored on the public internet. In past
research I've been involved with, browser[0] behavior was effectively
random despite the TTL set.

[0] more specifically, the chain of DNS resolution and caching down to
the browser.

>
> [...]
>
> In practice TTLs tend to be ignored on the public internet. In past
> research I've been involved with browser[0] behavior was effectively
> random despite the TTL set.
>
> [0] more specifically, the chain of DNS resolution and caching down to
> the browser.

Yes, but the fact that it can be both better and worse than your TTLs does
not mean that you can ignore properly working implementations.

If the other end's device chain breaks you, that's their fault and out of
your control. If your own settings break you, that's your fault.

+1 to what George wrote: we should make efforts to improve our part of
the network. There are ISPs that ignore TTL settings and only update their
cached records every two to three days or even more (particularly the
smaller ones). OTOH, this results in your DNS data being inconsistent,
but it’s very common to cache DNS records at multiple levels. It's an effort
that everyone needs to contribute to.

Sadly, it looks like the project is stalled: <https://github.com/Netflix/denominator/issues/374>.

And AS15135?