maybe a dumb idea on how to fix the dns problems i don't know....

Paul,

Sorry if this is real stupid for some reason because I don't think about DNS all day (I'm the ldap dude) but since we have faster networks and faster cpus today, what would be the harm in switching to use TCP for DNS clients? The latency on the web isn't dns anymore ever it seems to me.....

Wouldn't that eliminate the ability to poison clients' caches?

and why wouldn't persistent client-server dns connections make sense? any stupid little bsd or linux box can handle several thousand connections today pretty easily if tuned correctly by some reasonably competent primate

CP

TCP would work, but it makes it more difficult to do Anycast, which
works well with UDP and DNS.

Chuck

TCP works pretty well with anycast too, if you're careful. It's helpful if your transactions are short-lived.

I've seen concern expressed that a server which can handle 100,000 qps over UDP might well fare substantially more poorly if every query arrives using TCP transport. The business of large-scale HTTP is a fairly well-understood problem, however, and has some similar characteristics, so perhaps this is less of a long-term problem. I don't know, I haven't run any experiments to figure out the practical impact on performance of using TCP exclusively.

There is, however, the practical consideration that a generation of firewall "administrators" seem to believe that 53/tcp is only ever used for zone transfers, and can safely be closed for all other use.

I suspect that a route with better practical implications will be for resolvers to pad queries with additional entropy as EDNS0 options, and to fall back to TCP if EDNS0 is unsupported. That requires some confidence that EDNS0 support in authority servers is widespread, however.

draft-vixie-dnsext-dns0x20 describes a shorter-term option for introducing additional entropy into queries using UDP transport, with or without EDNS0.

Joe
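For readers who haven't looked at the 0x20 idea, here is a minimal sketch of the mechanism: the resolver randomizes the case of the letters in the query name and only accepts a response that echoes the name back with identical case. The function names are made up for illustration and are not taken from the draft or from any resolver.

```python
import random


def encode_0x20(name: str, rng: random.Random) -> str:
    """Randomly flip the case of each ASCII letter in a query name."""
    out = []
    for ch in name:
        if ch.isascii() and ch.isalpha():
            out.append(ch.upper() if rng.getrandbits(1) else ch.lower())
        else:
            out.append(ch)  # digits, dots and hyphens carry no extra entropy
    return "".join(out)


def entropy_bits(name: str) -> int:
    """Each ASCII letter contributes one bit (upper or lower case)."""
    return sum(1 for ch in name if ch.isascii() and ch.isalpha())


def response_matches(sent_name: str, echoed_name: str) -> bool:
    """Accept a response only if the question name is echoed case for case."""
    return sent_name == echoed_name


# A real resolver would want an unpredictable source such as random.SystemRandom().
query_name = encode_0x20("www.example.com", random.SystemRandom())
```

Short names gain very little from this, since the added entropy is one bit per letter in the name.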

Why not just require TCP for a lookup if a response with an incorrect TXID is received? You could require TCP for just the one lookup or for some configured interval, say 1 hour. That should slow attackers down substantially.

Joe Abley wrote:

That sounds like a good way for a remote attacker to make a resolver disable UDP transport for a server, more or less at will. I'm not sure I like the sound of that.

Joe
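To make the objection concrete, here is a rough sketch of the proposed policy (switch a server to TCP for a configured interval after one response with a wrong TXID). The class and method names are hypothetical, and the one-hour default is simply the interval suggested above.

```python
from dataclasses import dataclass, field


@dataclass
class TransportPolicy:
    """Hypothetical per-server policy: go TCP-only after a TXID mismatch."""
    tcp_only_seconds: float = 3600.0
    _tcp_until: dict = field(default_factory=dict)  # server -> expiry time

    def on_response(self, server: str, expected_txid: int,
                    got_txid: int, now: float) -> bool:
        """Return True if the response is accepted; a mismatch triggers TCP-only mode."""
        if got_txid == expected_txid:
            return True
        self._tcp_until[server] = now + self.tcp_only_seconds
        return False

    def transport_for(self, server: str, now: float) -> str:
        """Pick the transport for the next query to this server."""
        return "tcp" if now < self._tcp_until.get(server, 0.0) else "udp"


policy = TransportPolicy()
# One forged packet with a wrong TXID is enough to flip the transport.
policy.on_response("192.0.2.53", expected_txid=0x1234, got_txid=0x9999, now=10.0)
print(policy.transport_for("192.0.2.53", now=11.0))
```

Nothing in `on_response` can tell a spoofed mismatch from a genuine one, which is the "more or less at will" part of the concern.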

matt@credibleinstitution.org (Matt F) writes:

> Why not just require TCP for a lookup if a response with an incorrect
> TXID is received? You could require TCP for just the one lookup or for
> some configured interval, say 1 hour. That should slow attackers down
> substantially.

because TCP is considered optional by many authority DNS server operators.
it's only required if you expect AXFR or if you ever emit a TC bit. if you
don't want to do TCP then you can rule out the TC bit and AXFR and just not
do TCP, and you'll be dead-to-rights within the various DNS protocol RFCs.
anyone who insists on reaching such a server by TCP will be shit-outta-luck.

however, this suggestion and dozens of others are being workshopped all day
every day by actual DNS experts. you may not know about those discussions
because they are not occurring on nanog@, where they would be off-topic,
like this thread here. please join namedroppers@ops.ietf.org and perhaps
dns-operations@lists.oarci.net if you want to discuss DNS protocol matters.

please, please, please don't open this can of, um, worms on nanog@ again.
not even on a sunday afternoon when just about anything goes.

Paul Vixie wrote:

> they are not occurring on nanog@, where they would be off-topic,
> like this thread here

you may want to read the aup. by my read they are not off topic.

randy

Paul Vixie wrote:

> because TCP is considered optional by many authority DNS server operators.
  

Hey authority DNS server operators. Can you make a change to your servers to always allow TCP client connections? Would this be difficult? What would be the harm?

> it's only required if you expect AXFR or if you ever emit a TC bit. if you
> don't want to do TCP then you can rule out the TC bit and AXFR and just not
> do TCP, and you'll be dead-to-rights within the various DNS protocol RFCs.
  

what RFCs forbid TCP for clients? I thought TCP was an option for clients. I'm not spending the rest of my sunday though reading rfcs....... and sure as hell not joining another list because to tell you the truth, I don't really care as much about the typical angry Sunday list poster (talk about redundant statement....)

thanks for the thoughts, though, Paul. I'll leave the rest of this discussion (should it exist) to others in their forum of choice.... I'm thinking of a nice insalata caprese with true mozzarella di bufala right now.... now That's A Sunday!

CP

Randy Bush wrote:

> Paul Vixie wrote:
>
>> they are not occurring on nanog@, where they would be off-topic,
>> like this thread here
>
> you may want to read the aup. by my read they are not off topic.
  

Also: given how serious the problem is, I'd think that a far and wide
perspective on this is appropriate. Knowing IETF, there is usually not nearly
as much operations perspective as you'd like, and the various working group
lists can be pretty daunting for somebody whose day job is to "merely" keep
the net running.

       Mike, delurking

SYN flooding?

Paul Vixie wrote:

> because TCP is considered optional by many authority DNS server
> operators.

> Hey authority DNS server operators. Can you make a change to your
> servers to always allow TCP client connections? Would this be
> difficult? What would be the harm?

brett@the-watsons.org (brett watson) writes:

> SYN flooding?

SYN flooding is a specific instance of "have to hold too much state" whereas
the reason for not considering TCP mandatory is the general form of "have to
hold too much state". also note, the operators of those nameservers aren't
reading nanog@, or indeed any other mailing list where they could all be
reached. the installed base is, as usual, an impediment to righteous change.

It may be worth clarifying that "not considering TCP mandatory" above is an implementation/operational choice, and not something that seems to be clearly endorsed by RFC 1035, such as it is.

There are a lot of people who insist that TCP transport is used for nothing other than zone transfers in the DNS, and they do so not out of concern over potential TCP state explosion on their servers but instead because "that's what the last guy told me". That kind of reasoning doesn't need a bigger posse.

Joe

4.2. Transport

The DNS assumes that messages will be transmitted as datagrams or in a
byte stream carried by a virtual circuit. While virtual circuits can be
used for any DNS activity, datagrams are preferred for queries due to
their lower overhead and better performance. Zone refresh activities
must use virtual circuits because of the need for reliable transfer.

The Internet supports name server access using TCP [RFC-793] on server
port 53 (decimal) as well as datagram access using UDP [RFC-768] on UDP
port 53 (decimal).
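As an illustration of the two transports described in 4.2, here is a small sketch that builds one query message and shows the only wire-level difference between sending it over UDP and over TCP: over TCP the message is prefixed with a two-byte length field (RFC 1035 section 4.2.2). The function names are my own, and the query is deliberately minimal (one question, no EDNS0).

```python
import struct


def build_query(name: str, txid: int, qtype: int = 1) -> bytes:
    """Build a minimal DNS query message (header plus one question)."""
    # Header: ID, flags (RD set, as a stub resolver would), QDCOUNT=1,
    # then ANCOUNT, NSCOUNT and ARCOUNT all zero.
    header = struct.pack("!HHHHHH", txid, 0x0100, 1, 0, 0, 0)
    # QNAME: each label is preceded by its length; a zero byte ends the name.
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in name.rstrip(".").split(".")
    ) + b"\x00"
    question = qname + struct.pack("!HH", qtype, 1)  # QTYPE, QCLASS=IN
    return header + question


def frame_for_tcp(message: bytes) -> bytes:
    """Prefix the message with its two-byte length, as used on a TCP stream."""
    return struct.pack("!H", len(message)) + message


message = build_query("example.com", txid=0x1234)
udp_payload = message                 # sent as a single datagram
tcp_payload = frame_for_tcp(message)  # written to the byte stream
```

The message itself is the same either way; the cost difference people argue about comes from connection setup and state, not from the framing.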

(here we are discussing dns protocol details on nanog@ again. must be sunday.)

From: Joe Abley <jabley@ca.afilias.info>

> It may be worth clarifying that "not considering TCP mandatory" above is
> an implementation/operational choice, and not something that seems to be
> clearly endorsed by RFC 1035, such as it is.
>
> There are a lot of people who insist that TCP transport is used for
> nothing other than zone transfers in the DNS, and they do so not out of
> concern over potential TCP state explosion on their servers but instead
> because "that's what the last guy told me". That kind of reasoning
> doesn't need a bigger posse.
>
> Joe
>
> 4.2. Transport
> ...

actually, it does (need a bigger posse). a little further on in RFC 1035 we
find this gem:

4629 (3.64 KB)

> (here we are discussing dns protocol details on nanog@ again. must be sunday.)

(Or alternatively we could just be discussing DNS operations, something that is entirely on-topic for this list, and conceivably of interest to the many hundreds of people who are subscribed here but not to other dns-specific lists. That was certainly my intent, even if it wasn't yours.)

>> From: Joe Abley <jabley@ca.afilias.info>
>>
>> It may be worth clarifying that "not considering TCP mandatory" above is
>> an implementation/operational choice, and not something that seems to be
>> clearly endorsed by RFC 1035, such as it is.
>>
>> There are a lot of people who insist that TCP transport is used for
>> nothing other than zone transfers in the DNS, and they do so not out of
>> concern over potential TCP state explosion on their servers but instead
>> because "that's what the last guy told me". That kind of reasoning
>> doesn't need a bigger posse.
>>
>> Joe
>>
>> 4.2. Transport
>> ...
>
> actually, it does (need a bigger posse).

Rhetoric aside, no it doesn't.

Choosing not to implement (or permit, as an operational decision) TCP because of concerns about state is what you go on to talk about; what you were actually replying to was the wholesale denial of 53/tcp out of simple ignorance, which I would be surprised to hear you endorse, even if it happens to coincide on this instance with the results of your analysis.

Joe

>> actually, it does (need a bigger posse).
>
> Rhetoric aside, no it doesn't.
>
> Choosing not to implement (or permit, as an operational decision) TCP
> because of concerns about state is what you go on to talk about; what you
> were actually replying to was the wholesale denial of 53/tcp out of
> simple ignorance, which I would be surprised to hear you endorse, even if
> it happens to coincide on this instance with the results of your
> analysis.

not doing tcp/53 because the last guy didn't do it is the first step toward
not doing tcp/53 because it's amazingly fragile. sorry to cross the streams
without a diagram.

brett watson wrote:

>> Hey authority DNS server operators. Can you make a change to your servers to always allow TCP client connections? Would this be difficult? What would be the harm?
>
> SYN flooding?

from your clients? We have ways of knowing when people on our local network are doing this type of thing, and we turn them off at the switch today. Why are you doing dns recursion for people outside your network?

CP

The question isn't whether to offer TCP/53 up at the recursive
server. The issue is that for you to use TCP/53 from your recursive
server, it has to be offered up at the authoritative end.

The authoritative server operators have to offer TCP/53 and the
firewall administrators between the recursive server and the
authoritative servers have to allow the traffic.

         -rob

> The question isn't whether to offer TCP/53 up at the recursive
> server. The issue is that for you to use TCP/53 from your recursive
> server, it has to be offered up at the authoritative end.
>
> The authoritative server operators have to offer TCP/53 and the
> firewall administrators between the recursive server and the
> authoritative servers have to allow the traffic.
>
>          -rob
  

Yes. This is true. But with a caching resolver being used for most interactive clients (web surfers), this doesn't cause any problem, other than the initial caching.

OK I guess the question is this: How many milliseconds now on average does it take for my local dns server to obtain an address which is uncached using recursion up to the authoritative end using UDP?
And I guess the second question is: How many milliseconds on average would it take for my local dns server to obtain an address which is uncached using recursion up to the authoritative end using TCP?

Once it is cached on my local caching server, it's a non-issue if I am using some sort of persistent connection to my (non-authoritative) dns caching server.

CP

> Sorry if this is real stupid for some reason because I don't think about DNS
> all day (I'm the ldap dude) but since we have faster networks and faster
> cpus today, what would be the harm in switching to use TCP for DNS clients?
> The latency on the web isn't dns anymore ever it seems to me.....

Latency on in-addr lookups where you typically traverse multiple
forward trees to find the NS servers would seriously suck. At best, a
TCP-based lookup performs at about a third of the speed of a UDP
lookup. Worse unless your implementation is carefully optimized and
you make sure that the OS isn't adding options to the front of the
handshake. You have at least the whole syn/synack/ack handshake before
you can even ask the question.

Then there's the server cost associated with keeping that much state...
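A back-of-the-envelope model of the handshake cost, as a sketch only: it counts round trips until the answer arrives, assuming one round trip per UDP query and two per TCP query on a fresh connection (SYN/SYN-ACK, then ACK plus query and the response). Teardown, retransmits and server processing are ignored, so this is a floor on the TCP penalty rather than a measurement. The function name and the 40 ms figure are invented for the example.

```python
def lookup_latency_ms(rtt_ms: float, steps: int, transport: str) -> float:
    """Estimate time to the final answer for a chain of sequential lookups.

    Assumes 1 round trip per UDP query and 2 per TCP query (handshake,
    then query/response), with a new connection for every step.
    """
    round_trips = {"udp": 1, "tcp": 2}[transport]
    return rtt_ms * round_trips * steps


# Example: four sequential queries (say, walking down a delegation chain)
# at a 40 ms round-trip time to each server.
udp_ms = lookup_latency_ms(40.0, steps=4, transport="udp")
tcp_ms = lookup_latency_ms(40.0, steps=4, transport="tcp")
print(udp_ms, tcp_ms)
```

The penalty multiplies with the number of sequential steps, which is why lookups that chase several delegations hurt the most.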

> Why not just require TCP for a lookup if a response with an incorrect TXID
> is received? You could require TCP for just the one lookup or for some
> configured interval, say 1 hour. That should slow attackers down
> substantially.

Because the attacker is using a sequence of lookups in order to hit
one that lets him poison the cache. That is, he looks up a.google.com,
then he looks up b.google.com, then c.google.com, etc. until he gets
one where the server accepts his fake DNS server record for
google.com.

To be an effective defense, you'd have to do TCP lookups for the whole
scope ({anything}.google.com) for some period of time following the
bad ID. That in turn would open up a potential DOS where an attacker
could force the DNS server to fall back on TCP for essentially
everything, overwhelming it.
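The arithmetic behind the "sequence of lookups" point can be sketched as follows. This is a simplified model with assumptions of my own: every triggered lookup is an independent race, the attacker lands a fixed number of distinct forged responses before the real answer arrives, and the only thing to guess is a field of `entropy_bits` bits (16 for a bare TXID with a predictable source port).

```python
def spoof_success_probability(entropy_bits: int, forged_per_query: int,
                              queries: int) -> float:
    """Chance that at least one forged response matches across several races."""
    space = 2 ** entropy_bits
    per_query = min(1.0, forged_per_query / space)
    # Probability that every race fails, subtracted from one.
    return 1.0 - (1.0 - per_query) ** queries


# One race is a long shot, but the attacker can trigger as many as he likes
# (a.google.com, b.google.com, ...), so the odds add up quickly.
single = spoof_success_probability(16, forged_per_query=100, queries=1)
many = spoof_success_probability(16, forged_per_query=100, queries=1000)
# Extra entropy in each query shrinks the per-race odds.
hardened = spoof_success_probability(32, forged_per_query=100, queries=1000)
print(single, many, hardened)
```

Under this model, punishing a single bad ID does little, because each new name is a fresh race; only a larger guessing space or a wider fallback scope changes the odds.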

>> Hey authority DNS server operators. Can you make a change to your servers
>> to always allow TCP client connections? Would this be difficult? What would
>> be the harm?
>
> SYN flooding?

SYN flooding is a solved problem.

> TCP works pretty well with anycast too, if you're careful. It's helpful if
> your transactions are short-lived.

Define "careful." It's always possible for someone to find themselves
with an equal cost path to two different servers in the anycast set.
Add per-packet load balancing at the fork (which is outside the
control of the server operator) and what happens is the request times
out and the resolver fails over to the other NS record that isn't
anycasted.

Though the protocol is simple enough that it might be possible to fake
it. Build yourself a DNS-only stateless TCP stack for the anycasted
address. Have the server send the syn-ack without creating any state.
The request will almost certainly be entirely contained in one packet,
so when it arrives reply to it without creating any state. Ship off as
many packets as you need to reply followed by a Fin. Blindly ack any
packet that looks like it needs it. If any packets are lost, there
won't be any retransmit (you haven't really established a TCP
connection) so the query will time out and retry.
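The decision logic of such a stateless responder might look roughly like the sketch below. It is a pure simulation: `Segment`, `stateless_reply` and `answer_for` are hypothetical names, no packets are sent, the reply is assumed to fit in one segment, and the fixed initial sequence number is a placeholder (a real stack would want something unpredictable, along the lines of a SYN cookie). The point it illustrates is that the client's ACK number already says which sequence number the server should use next, so nothing has to be remembered.

```python
from dataclasses import dataclass
from typing import Callable, List

FIN, SYN, PSH, ACK = 0x01, 0x02, 0x08, 0x10
MOD = 2 ** 32  # TCP sequence numbers wrap at 32 bits


@dataclass
class Segment:
    seq: int
    ack: int
    flags: int
    payload: bytes = b""


def stateless_reply(seg: Segment, answer_for: Callable[[bytes], bytes],
                    isn: int = 0) -> List[Segment]:
    """Decide what to send for one incoming segment, keeping no state."""
    if seg.flags & SYN:
        # Answer the SYN; nothing about this client is stored.
        return [Segment(seq=isn, ack=(seg.seq + 1) % MOD, flags=SYN | ACK)]
    if seg.payload:
        # Assume the whole query is in this one segment: strip the two-byte
        # length prefix, answer, then close with a FIN.
        response = answer_for(seg.payload[2:])
        framed = len(response).to_bytes(2, "big") + response
        next_ack = (seg.seq + len(seg.payload)) % MOD
        return [
            Segment(seq=seg.ack, ack=next_ack, flags=ACK | PSH, payload=framed),
            Segment(seq=(seg.ack + len(framed)) % MOD, ack=next_ack,
                    flags=ACK | FIN),
        ]
    if seg.flags & FIN:
        # Blindly acknowledge the client's FIN.
        return [Segment(seq=seg.ack, ack=(seg.seq + 1) % MOD, flags=ACK)]
    return []  # a bare ACK needs no reply
```

If a reply segment is lost there is no retransmission, since nothing remembers it was sent; as noted above, the client's query simply times out and retries.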

Regards,
Bill Herrin