Exploits start against flaw that could hamstring huge swaths of Internet | Ars Technica

Everyone got BIND updated?

http://arstechnica.com/security/2015/08/exploits-start-against-flaw-that-could-hamstring-huge-swaths-of-internet/

a message of 6 lines which said:

> Everyone got BIND updated?

For instance by replacing it with NSD or Unbound?

Always great to jump ship from one platform to another ... under
stress and without knowing its scaling, management, etc. properties.

Also, it's not like the alternatives have clean shorts when it comes
to code mistakes, right?

Or doing something better like not just replacing one evil with another,
and instead moving to a heterogeneous environment where possible.

... JG

So, do you guys recommend replacing BIND with another option?

-----Original Message-----

The humorous thing is that the security researcher who showed the
recent bind9 error (note: it isn't a vulnerability or a hack, it's
just a way to remotely crash named), well, he criticized bind9 for
doing more than simple basic name services. So, it's very easy to
find bind9 alternatives if you are only looking for basic minimal DNS
functionality. But once you start looking for features... well there
aren't many options.

-Jim P.

The *good* recommendation is to get some onboard security clue, and
learn procedures to mitigate the inevitable exploits against flaws in
infrastructure software.

> So, do you guys recommend replacing BIND with another option?

No. Replacing one occasionally faulty product with another occasionally
faulty product is foolish. There's no particular reason to think that
another product will be impervious to code bugs. What I was suggesting
was to use several different devices, much as some networks prefer to
buy some Cisco gear and some Juniper gear and make them redundant, or
as a well-built ZFS storage array consists of drives from different
manufacturers.

Heterogeneous environments tend to be more resilient because they are
less likely to all suffer the same defect at once. Problems still result
in some pain and trouble, but it usually doesn't result in a service
outage.

This doesn't seem like a horribly catastrophic bug in any case. Anyone
who is reliant on a critical bit like a DNS server probably has it set
up to automatically restart if it doesn't exit cleanly. If you don't,
you should!
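
A minimal sketch of that kind of restart-on-unclean-exit wrapper, in Python. The function name `supervise` and its `max_restarts` and `delay` parameters are made up for illustration; in practice most people would lean on their init system or process supervisor rather than a script like this.

```python
import subprocess
import time


def supervise(cmd, max_restarts=5, delay=1.0):
    """Run cmd in the foreground and restart it after an unclean exit.

    Returns the number of restarts performed once cmd exits cleanly
    (status 0). Raises RuntimeError if it keeps dying, so a crash loop
    gets noticed instead of spinning forever.
    """
    restarts = 0
    while True:
        # A negative return code means the child was killed by a signal;
        # any non-zero value is treated as an unclean exit.
        status = subprocess.call(cmd)
        if status == 0:
            return restarts
        if restarts >= max_restarts:
            raise RuntimeError(
                f"{cmd[0]} exited with status {status} after {restarts} restarts"
            )
        restarts += 1
        time.sleep(delay)
```

For BIND this would be called along the lines of `supervise(["named", "-f"])`, since `-f` keeps named in the foreground. Note that a restart loop only limits the outage; it does nothing about whatever is crashing the daemon.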

So if it matters to you, I suggest that you instead use a combination
of different products, and you'll be more resilient. If you have two
recursers for your customers, one can be BIND and one can be Unbound.
And when some critical vuln comes along and knocks out Unbound, you'll
still be resolving names. Ditto BIND. You're not likely to see both
happen at the same time.
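
To make the "one BIND, one Unbound" idea concrete, here is a rough Python sketch of what a client-side check could look like: send the same query to each recurser in turn and take the first answer. Everything named here (`build_query`, `udp_query`, `resolve_with_fallback`, the fixed transaction ID) is an illustrative assumption, not how any particular stub resolver is implemented, and the packet builder only handles plain ASCII names.

```python
import socket
import struct


def build_query(name, qtype=1, txid=0x1234):
    """Build a minimal DNS query packet (default qtype 1 = A, class IN)."""
    # Header: ID, flags (only RD set), QDCOUNT=1, ANCOUNT=NSCOUNT=ARCOUNT=0.
    header = struct.pack("!HHHHHH", txid, 0x0100, 1, 0, 0, 0)
    labels = name.rstrip(".").split(".")
    qname = b"".join(bytes([len(l)]) + l.encode("ascii") for l in labels) + b"\x00"
    return header + qname + struct.pack("!HH", qtype, 1)


def udp_query(server, packet, timeout=2.0):
    """Send packet to server on UDP port 53 and return the raw reply."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        sock.sendto(packet, (server, 53))
        reply, _ = sock.recvfrom(4096)
        return reply


def resolve_with_fallback(name, servers, query_fn=udp_query):
    """Try each recurser in order; return (server, reply) for the first answer."""
    packet = build_query(name)
    for server in servers:
        try:
            reply = query_fn(server, packet)
        except OSError:
            # Timeout or network error: this recurser is down, try the next.
            continue
        if reply[:2] == packet[:2]:  # transaction ID matches our query
            return server, reply
    raise RuntimeError("no resolver answered")
```

With the BIND box at one address and the Unbound box at another, `resolve_with_fallback("example.com", [bind_ip, unbound_ip])` keeps returning answers as long as either implementation is still up.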

However, at least here, we actually *use* TSIG updates, and other
functionality that'd be hard to replace (BIND9 is pretty much THE only
option for some functionality).

... JG

With the (large) caveat that heterogeneous networks are more subject to
human error in many cases.

> With the (large) caveat that heterogeneous networks are more subject to
> human error in many cases.

<cough>automate!</cough>

Automation just means your mistake goes many more places more quickly.

I recommend using dnsdist to balance traffic at a protocol level, as you can have implementation diversity on the backside.

I can send an example config out later for people. You can balance to BIND, NSD, and others all at the same time :) just move your SPoF

Jared Mauch

> Automation just means your mistake goes many more places more quickly.

And letting people keep poking at things that computers should be
doing is... much worse. People are not reliable or repeatable over
time.

If you fear 'many more places' problems, improve your testing.

> I recommend using dnsdist to balance traffic at a protocol level, as you can
> have implementation diversity on the backside.
>
> I can send an example config out later for people. You can balance to BIND,
> NSD, and others all at the same time :) just move your SPoF
>
> Jared Mauch

Unless the same client hits the same server all the time this is a
bad idea.

Resolvers actually track the capabilities of servers, as that is the
only way to get answers given firewalls dropping legitimate packets
and protocol misimplementations. Add to that different vendors /
versions supporting different extensions: randomly flipping between
vendors / versions is fraught with danger unless you take extreme
care.
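
One way to get "same client, same server" without hand-maintained pinning is to hash the client address against each backend and pick the highest score (rendezvous hashing). The sketch below is Python, not a balancer config, and the backend labels and `pick_backend` name are hypothetical. Whether a given balancer offers an equivalent policy is something to check in its own documentation.

```python
import hashlib


def pick_backend(client_ip, backends, healthy):
    """Pick a backend for client_ip using rendezvous hashing.

    A client always lands on the same backend while it is healthy. If a
    backend drops out, only the clients that were on it are moved;
    everyone else keeps talking to the implementation they already know.
    """
    candidates = [b for b in backends if b in healthy]
    if not candidates:
        raise RuntimeError("no healthy backend")

    def score(backend):
        # Deterministic per (client, backend) pair; the highest digest wins.
        return hashlib.sha256(f"{client_ip}|{backend}".encode()).digest()

    return max(candidates, key=score)
```

This only keeps the mapping stable. When a client does get moved to a different implementation, whatever capabilities it learned about the old one may no longer hold, so the concern above still applies at failover time.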

> With the (large) caveat that heterogeneous networks are more subject to
> human error in many cases.

Indeed. Everything comes with tradeoffs. More intimate familiarity
with the product and a uniformity of deployment strategy has made it
more practical here to stick with BIND; an update is a simple matter
of a tarball and running a script that manages the dirty work.

However, the original point was that switching from BIND to Unbound
or other options is silly, because you're just trading one codebase
for another, and they all have bugs. However, collectively, two
different products cooperatively providing a service are likely to
have a higher uptime in a well-designed environment.
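
As a back-of-the-envelope illustration of the uptime claim, here is the arithmetic in Python. It leans on two big assumptions that are not from this thread: the products fail independently of each other, and either one can carry the full load alone. The function name is made up, and any availability figures fed to it are hypothetical.

```python
def combined_availability(*availabilities):
    """Availability of a service that is up if ANY of its servers is up.

    Assumes failures are independent, so the chance that everything is
    down at once is the product of the individual downtimes.
    """
    all_down = 1.0
    for a in availabilities:
        all_down *= 1.0 - a
    return 1.0 - all_down
```

Two servers at a hypothetical 99% each come out at roughly 99.99% combined. The independence assumption is the whole point of mixing implementations: two copies of the same codebase share the same bugs, so a single crafted packet can take both down and the multiplication no longer applies.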

... JG

> I recommend using dnsdist to balance traffic at a protocol level, as you can
> have implementation diversity on the backside.
>
> I can send an example config out later for people. You can balance to BIND,
> NSD, and others all at the same time :) just move your SPoF

> Unless the same client hits the same server all the time this is a
> bad idea.

But tying a set of clients to the same backend puts them all in the same
failure domain....

> Resolvers actually track the capabilities of servers, as that is the
> only way to get answers given firewalls dropping legitimate packets
> and protocol misimplementations. Add to that different vendors /
> versions supporting different extensions: randomly flipping between
> vendors / versions is fraught with danger unless you take extreme
> care.

Out of curiosity, do any resolvers other than BIND do this? I ask because
BIND has a reputation for having "too many" features, and I wonder if this
is one of them.

Damian

It is equally silly to assume that all codebases are of the same quality
and have equally many bugs. Maybe we should be looking at the track record
of those two products, and maybe we should let someone do a code review,
and then choose based on that.

Regards,

Baldur

Not necessarily.

The sort of failure you're talking about, Scott, is "user did the wrong
thing", and sure, automation makes it easier for that to spread.

Chris was, though, I think, suggesting automating around "user tries to do
the right thing on disjoint devices, and fails *because they're disjoint*";
that is, clearly, a problem automation can help with.

Cheers,
-- jra

I don't disagree, but automation usually protects against typing errors; it
doesn't protect against incorrect configurations. Using multiple vendors
or server software means that your people have to know all of the systems.
There are many cases where, for example, a Cisco-like CLI will make a
network engineer think that a command works exactly the same way on another
vendor's system when in fact the under-the-hood implementation is very
different.

It's not always feasible to have people with the needed skill levels,
and automation does not help with that at all.

Of the accidental route hijacks where I learned the actual details of what happened, I've personally never come across one that wasn't the result of someone manually typing at the enable prompt.