DNS Anycast as traffic optimizer?

I'm sure there is research out there, but I can't find it, so does anyone know of any research showing how good/bad using DNS anycast is as a kludgey traffic optimiser?
(i.e. having multiple datacenters, all anycasting the authoritative name server for a domain, but each datacenters' DNS server resolving the domain name to an IP local to that datacenter, under the assumption that if the end user hit that DNS server first, there is "some" relationship between that datacenter and good performance for that user.)
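
For concreteness, a minimal sketch of the answer logic being proposed; the datacenter names and addresses below are made up, and a real deployment would serve this from a proper authoritative daemon rather than a script:

```python
# Sketch of the proposed scheme: each datacenter's anycast DNS instance
# answers "www.example.com" with an address local to that datacenter.
# Datacenter names and IPs are invented for illustration.

LOCAL_ANSWER = {
    "dc-east": "192.0.2.10",     # web farm VIP in the eastern datacenter
    "dc-west": "198.51.100.10",  # web farm VIP in the western datacenter
    "dc-eu":   "203.0.113.10",   # web farm VIP in the European datacenter
}

def answer_a_query(datacenter_id: str, qname: str) -> str:
    """Return the A record this datacenter's auth server would hand out.

    The assumption under test: a resolver that reached this instance via
    anycast is "close" to this datacenter, so the local VIP is a good answer.
    """
    if qname != "www.example.com.":
        raise KeyError(f"not authoritative for {qname}")
    return LOCAL_ANSWER[datacenter_id]

if __name__ == "__main__":
    for dc in LOCAL_ANSWER:
        print(dc, "->", answer_a_query(dc, "www.example.com."))
```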

The question is, what is that "some" relationship? 80% as good as Akamai? Terrible?
TIA

I'm sure there is research out there...

Why? :slight_smile:

    > ...how good/bad using DNS anycast is as a kludgey traffic optimiser?

I'd hardly call it a kludge. It's been standard best-practice for over a
decade.

    > The question is, what is that "some" relationship? 80% as good as
    > Akamai? Terrible?

Should be much higher than Akamai, since that's not what they're
optimizing for. If you want nearest server, anycast will give you that
essentially 100% of the time. Akamai tries to get queries to servers that
have enough available capacity to handle the load, since they're handling
bursty, high-bandwidth applications rather than DNS.

                                -Bill

Bill Woodcock wrote:

   > I'm sure there is research out there...

Why? :slight_smile:

Usual - if I build it myself, will it work well enough, or should I pony up for a CDN?

   > ...how good/bad using DNS anycast is as a kludgey traffic optimiser?

I'd hardly call it a kludge. It's been standard best-practice for over a
decade.

I thought it was standard best practice for availability, like for root name servers. I thought it was not a good "closest server" selection mechanism, as you'll be going to the closest server as determined by BGP - which may have little relationship to the server with lowest RTT.
It'd be nice to see some metrics either way....

(Caution: Chris is a chemical engineer, not an anycast engineer)

Bill Woodcock wrote:

> > ...how good/bad using DNS anycast is as a kludgey traffic optimiser?
>
>I'd hardly call it a kludge. It's been standard best-practice for over a
>decade.
>

If I read your original request correctly you were planning on:
1) having presence in multiple datacenters (assume multiple providers as
well)
2) having an 'authoritative' DNS server in each facility (or 2/3/4
whatever per center)
3) return datacenter-1-host-1 from datacenter-1-authserver-1,
datacenter-2-host-2 from datacenter-2-authserver-1, and so forth.

This isn't really 'anycast' so much as 'different A records depending on
server which was asked'
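
One way to see that behaviour from a single vantage point is to query each authoritative server directly over a unicast address and compare the answers; a rough sketch, assuming the dnspython package and placeholder server addresses:

```python
# Quick check of "different A records depending on which server was asked":
# query each authoritative server directly and compare what it hands back.
# Assumes dnspython is installed (pip install dnspython); the addresses
# below are placeholders, not anyone's real servers.
import dns.resolver

AUTH_SERVERS = {
    "datacenter-1-authserver-1": "192.0.2.53",
    "datacenter-2-authserver-1": "198.51.100.53",
}

for name, addr in AUTH_SERVERS.items():
    r = dns.resolver.Resolver(configure=False)
    r.nameservers = [addr]   # ask this specific auth server only
    r.lifetime = 3.0         # don't hang on an unreachable instance
    try:
        answer = r.resolve("www.example.com", "A")
        print(name, "returned", [rr.address for rr in answer])
    except Exception as exc:
        print(name, "failed:", exc)
```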

So, you'd be dependent on:
1) order of DNS requests made to AUTH NS servers for your domain/host
2) speed of network(s) between requestor and responder
3) effects of using caching DNS servers along the route

You are not, now, making your decision on 'network closeness' so much as
'application swiftness'. I suspect you'd really also introduce some major
troubleshooting headaches with this setup, not just for you, but for your
users as well.

I think in the end you probably want to obtain PI space from ARIN and use
that as the 'home' for your DNS and application servers, or at least the
application servers. There was some mention, and research I believe(?),
about the value of having a partial Anycast deployment, so 3/4ths of your
capacity on Anycast servers and 1/4th on 'normal' hosts to guard against
route flaps and dampening of prefixes...

I'm sure that some of the existing anycast users could provide much more
relevant real-world experiences though.

-chris

I can give you one data point: VeriSign anycasts j.root-servers.net
from all the same locations (minus one) where the com/net
authoritative servers (i.e., *.gtld-servers.net) are located. An
informal examination of query rates among all the J root instances
(traffic distribution via BGP) vs. query rates among all the com/net
servers (traffic distribution via iterative resolver algorithms, which
means round trip time in the case of BIND and Microsoft) shows much
more even distribution when the iterative resolvers get to pick
vs. BGP. Note that we're not using the no-export community, so all J
root routes are global. When examining queries per second, there is a
factor of ten separating the busiest J root instance from the least
busy, whereas for com/net it's more like a factor of 2.5. Of course,
I'm sure a lot of that has to do with server placement, especially in
the BGP case.
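
As a back-of-the-envelope illustration of the comparison (the per-instance query rates below are invented, not VeriSign's numbers), the metric is just the ratio between the busiest and least-busy instance:

```python
# Illustrative only: made-up queries-per-second figures for a set of
# anycast instances, showing the busiest/least-busy spread Matt describes.
bgp_distributed_qps = [22000, 18000, 9000, 4000, 2500, 2200]  # skewed by BGP
rtt_distributed_qps = [9500, 9000, 8000, 6500, 5000, 3800]    # picked by resolver RTT

def spread(qps):
    return max(qps) / min(qps)

print("BGP-steered spread:      %.1fx" % spread(bgp_distributed_qps))  # ~10x in the J root data
print("resolver-steered spread: %.1fx" % spread(rtt_distributed_qps))  # ~2.5x for com/net
```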

For what it's worth,

Matt

Christopher L. Morrow wrote:

If I read your original request correctly you were planning on:
1) having presence in multiple datacenters (assume multiple providers as
well)
2) having an 'authoritative' DNS server in each facility (or 2/3/4
whatever per center)
3) return datacenter-1-host-1 from datacenter-1-authserver-1,
datacenter-2-host-2 from datacenter-2-authserver-1, and so forth.

This isn't really 'anycast' so much as 'different A records depending on
server which was asked'

Well, there'd be one NS record returned for the zone in question. That NS record would be an IP address that is anycasted from all the datacenters.
So end users (or their DNS servers) would all query the same IP address as the NS for that zone, but would end up at different datacenters depending on the whims of the anycasted BGP space.

Once they reached a name server, then yes, it changes to 'different A records depending on server which was asked'

So, you'd be dependent on:
1) order of DNS requests made to AUTH NS servers for your domain/host

As there'd only be one NS server address returned, that negates this point.

2) speed of network(s) between requestor and responder

Or the closeness (in a BGP sense) between the requester and the anycasted DNS server.

3) effects of using caching DNS servers along the route

True. But I'm not trying to cope with instantly changing dynamic conditions.

I suspect you'd really also introduce some major
troubleshooting headaches with this setup, not just for you, but for your
users as well.

I don't doubt that. :slight_smile:

Christopher L. Morrow wrote:

>If I read your original request correctly you were planning on:
>1) having presence in multiple datacenters (assume multiple providers as
>well)
>2) having an 'authoritative' DNS server in each facility (or 2/3/4
>whatever per center)
>3) return datacenter-1-host-1 from datacenter-1-authserver-1,
>datacenter-2-host-2 from datacenter-2-authserver-1, and so forth.
>
>This isn't really 'anycast' so much as 'different A records depending on
>server which was asked'
>
>
Well, there'd be one NS record returned for the zone in question. That
NS record would be an IP address that is anycasted from all the datacenters.
So end users (or their DNS servers) would all query the same IP address
as the NS for that zone, but would end up at different datacenters
depending on the whims of the anycasted BGP space.

Hmm, why not anycast the service/application ips? Having inconsistent DNS
info seems like a problem waiting to bite your behind.

> I suspect you'd really also introduce some major
>troubleshooting headaches with this setup, not just for you, but for your
>users as well.
>
>
I don't doubt that. :slight_smile:

which I'd think you'd want to minimize as much as possible, right?

> This isn't really 'anycast' so much as 'different A records depending on
> server which was asked'

right.

Well, there'd be one NS record returned for the zone in question. That
NS record would be an IP address that is anycasted from all the
datacenters. So end users (or their DNS servers) would all query the
same IP address as the NS for that zone, but would end up at different
datacenters depending on the whims of the anycasted BGP space.

that's generic dns anycast. it's safe if your routing team is very strong.

Once they reached a name server, then yes, it changes to 'different A
records depending on server which was asked'

that's incoherent dns. when i first began castigating people in public
for this, i coined the term "stupid dns tricks" to describe this behaviour.
cisco now has products that will do this for you. many web hosting companies
offer this incoherence as though it were some kind of feature. akamai at
one time depended on it, speedera at one time did not, i don't know what's
happening currently, perhaps they've flipflopped.

dns is not a redirection service, and incoherence is bad. when you make a
query you're asking for a mapping of <name,class,type,time> to an rrset.
offering back a different rrset based on criteria like source ip address,
bgp path length, ping rtt, or the phase of the moon, is a protocol violation,
and you shouldn't do it. the only way to make this not be a protocol
violation is to use zero TTLs to prohibit caching/reuse, which is also bad
but for a different reason.
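
A toy simulation of why that incoherence bites once caching is involved (hypothetical addresses, nobody's real resolver code): an authority that tailors answers per client, behind a shared cache that, as the protocol intends, reuses the first rrset it saw for the full TTL:

```python
import time

TTL = 300  # seconds

def tailored_authority(client_ip: str) -> str:
    """A toy 'incoherent' authority: the answer depends on who asked."""
    return "192.0.2.10" if client_ip.startswith("10.") else "198.51.100.10"

class SharedCache:
    """A protocol-respecting cache: one rrset per <name,class,type> until TTL expiry."""
    def __init__(self):
        self.entry = None  # (answer, expires_at)

    def lookup(self, client_ip: str) -> str:
        now = time.time()
        if self.entry is None or now >= self.entry[1]:
            # Cache miss: the *first* client's tailored answer gets cached...
            self.entry = (tailored_authority(client_ip), now + TTL)
        # ...and is reused for every other client until the TTL runs out.
        return self.entry[0]

cache = SharedCache()
print(cache.lookup("10.1.1.1"))    # 192.0.2.10 (answer meant for 10/8 clients)
print(cache.lookup("172.16.5.5"))  # still 192.0.2.10, not the answer the
                                   # authority would have tailored for this client
```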

> I suspect you'd really also introduce some major troubleshooting
> headaches with this setup, not just for you, but for your users as
> well.

I don't doubt that. :slight_smile:

not only is it bad dns, it's bad web service. the fact that a current
routing table gives a client's query to a particular anycasted DNS server
does not mean that the web services mirror co-located with that DNS server
is the one that would give you the best performance. for one thing, the
client's dns forwarding/caching resolver might have a different position in
the connectivity graph than the web client. for another thing, as-path
length doesn't tell you anything about current congestion or bandwidth --
BGP is not IGRP (and thank goodness!).

if you want a web client to get its web data from the best possible web
services host/mirror out of a distributed cluster, then you will have to
do something a hell of a lot smarter than incoherent dns. there are open
source packages to help you do this. they involve sending back an HTTP
redirect to clients who would be best served by some other member of the
distributed mirror cluster.
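
The packages themselves aren't named here, but the idea reduces to something like the following sketch (standard library only, with a made-up prefix-to-mirror table; a real deployment would pick mirrors from measurement or routing data, not a literal prefix match):

```python
# Minimal sketch of the HTTP-redirect approach (not any particular package):
# answer DNS consistently, then 302 clients that would be better served by
# another mirror. Mirror names and prefixes are invented for illustration.
from http.server import BaseHTTPRequestHandler, HTTPServer

MIRRORS = {
    "192.0.2.":    "http://east.mirrors.example.com",
    "198.51.100.": "http://west.mirrors.example.com",
}
DEFAULT_MIRROR = "http://www.example.com"

def pick_mirror(client_ip: str) -> str:
    # A literal prefix match only keeps the sketch short; real systems use
    # routing data, measurements, or load feedback to choose the mirror.
    for prefix, mirror in MIRRORS.items():
        if client_ip.startswith(prefix):
            return mirror
    return DEFAULT_MIRROR

class Redirector(BaseHTTPRequestHandler):
    def do_GET(self):
        target = pick_mirror(self.client_address[0]) + self.path
        self.send_response(302)  # temporary redirect to the chosen mirror
        self.send_header("Location", target)
        self.end_headers()

if __name__ == "__main__":
    HTTPServer(("", 8080), Redirector).serve_forever()
```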

Which begs the question.. is anyone doing this right now? I've been wondering
about the potential issues wrt anycasting tcp applications.. TCP sessions would
be affected negatively during a route change..

-J

short-lived tcp is probably ok though (like static webpages or something
of that sort). You'll also have to watch out for maintaining
state for distributed application servers (I suppose).

TCP anycast has many more complicated implications than UDP/DNS things, or
so it seems to my untrained/educated eye.

Just to clarify this slightly, since I've known people to misinterpret this point: a clear, contextual understanding of the word "nearest" is important in understanding this sentence.

Here's an example: France Telecom was an early supporter of F-root's anycast deployment in Hong Kong. Due to the peering between OpenTransit and F at the HKIX, the nearest F-root server to OT customers in Paris was in Asia, despite the fact that there were other F-root nodes deployed in Europe. Those OT customers were indeed reaching the nearest F-root node, or maybe they weren't, depending on what you understand by the word "near".

Another one: where anycast nodes are deployed within the scope of an IGP, topological nearness does not necessarily indicate best performance (since not all circuits will have the same loading, in general, and maybe a short, congested hop is not as "near" as several uncongested hops).

For F, we don't worry too much about which flavour of "near" we achieve for every potential client: redundancy/diversity/reliability/availability is more important than minimising the time to do a lookup, and the fact that the "near" we achieve in many cases corresponds to what human users expect it to mean is really just a bonus.

However, in the general case it's important to understand what kind of "near" you need, and to deploy accordingly.

Joe

http://www.caida.org/outreach/papers/2002/Distance/

this paper would be somewhat on-topic, as you can infer the performance
characteristics that anycast would have. no direct comparisons made to
akamai,etc but maybe you can infer those as well.

-dre

Paul Vixie wrote:

not only is it bad dns, it's bad web service. the fact that a current
routing table gives a client's query to a particular anycasted DNS server
does not mean that the web services mirror co-located with that DNS server
is the one that would give you the best performance. for one thing, the
client's dns forwarding/caching resolver might have a different position in
the connectivity graph than the web client. for another thing, as-path
length doesn't tell you anything about current congestion or bandwidth --
BGP is not IGRP (and thank goodness!).

I'm aware that web clients are not colocated with the client's name server, and that BGP does not attempt to optimise performance.

However, I suspect that in most cases, the client is close enough to the name server, and the BGP best path is close enough to the best path if it were based on latency, that most clients would be happy with the result most of the time. I'm not aiming for 100%, just Good Enough.

I'd be interested in seeing any data refuting either of those points, but it looks like I may have to do it, see what I find, and go write my own research paper. :slight_smile:

(I have found data that clients' name servers are incorrect indicators of RTT between two web locations and clients 21% of the time, but not how incorrect...
http://www.ieee-infocom.org/2001/paper/806.pdf)

This is not always a good assumption:
1) dial clients sometimes get their DNS info from their radius profile (I
believe); sometimes that DNS server isn't on the same ASN as the dialup
link.
2) many people have hardcoded DNS servers over the years, ones that have
drifted from 'close' to 'far'
3) corporations with multiple exit points and larger internal networks
might have DNS servers that exit in one country but are queried internally
from other countries/states/locations.

I think Paul's partly pointing out that you are using DNS for the wrong
thing here, and partly pointing out that you are going to increase your
troubleshooting overhead/complexity... Users on network X that you expect
to use datacenter Y are really accessing datacenter Z because their dns
cache server is located on network U :frowning:

I'm glad to see Joe/Paul/Bill jump in though... they do know quite a bit
more about the practice of anycasting services on large networks.

I don't know of any papers, but I have seen real-world examples where a well-peered network was adjacent to 5 or more anycasted servers: 3 in the US, one in Europe, and one in Asia. The network was going to the Asian server, because that router had the lowest Router ID.

Not exactly sure how that makes it "much higher than Akamai", but that's what I've seen.

    >>> I'm sure there is research out there...
    >> Why? :slight_smile:
    > Usual - if I build it myself, will it work well enough, or should I pony
    > up for a CDN?

Uh, what about that makes you sure that there's research out there?

    > I thought it was standard best practice for availability, like for root
    > name servers. I thought it was not a good "closest server" selection
    > mechanism, as you'll be going to the closest server as determined by BGP
    > - which may have little relationship to the server with lowest RTT.

And the lowest RTT doesn't necessarily have much to do with what's
closest. If you want lowest RTT, that's what the DNS client already does
for you, so you don't need to do anything at all.

                                -Bill

    >> Hmm, why not anycast the service/application ips? Having
    >> inconsistent DNS info seems like a problem waiting to bite your
    >> behind.
    > Which begs the question.. is anyone doing this right now?

Yes, lots of people. Akamai is the largest provider of services based on
inconsistent DNS that I know of, and they've been doing it for quite a
while. They were by no means a pioneer. Many others came before them; they
might just be the one you've heard of.

    > I've been wondering about the potential issues wrt anycasting tcp
    > applications. TCP sessions would be affected negatively during a
    > route change.

Yup, which happens about one hundredth as often as TCP sessions being
dropped for other reasons, so it's not worth worrying about. You'll never
measure it, unless your network is already too unstable to carry TCP flows
anyway. This is also ancient history. I and I assume plenty of other
people were doing this with long-lived FTP sessions prior to the advent of
the World Wide Web. This is the objection clever people who don't
actually bother to try it normally come up with, after they've thought
about it for a few (but fewer than, say, ten) minutes.

                                -Bill

Bill Woodcock wrote:

   >>> I'm sure there is research out there...
   >> Why? :slight_smile:
   > Usual - if I build it myself, will it work well enough, or should I pony
   > up for a CDN?

Uh, what about that makes you sure that there's research out there?

Oops, sorry, misread the question. I should have said "I expect there is research..." I was answering why I wanted to know, not why I expect there is research...

   > I thought it was standard best practice for availability, like for root
   > name servers. I thought it was not a good "closest server" selection
   > mechanism, as you'll be going to the closest server as determined by BGP
   > - which may have little relationship to the server with lowest RTT.

And the lowest RTT doesn't necessarily have much to do with what's
closest. If you want lowest RTT, that's what the DNS client already does
for you, so you don't need to do anything at all.

Excellent point, thanks.
So there is no need to anycast the DNS servers and rely on BGP topology for selection.
Instead, use BIND's behaviour so that each resolving nameserver will query the authoritative nameserver that responds the fastest.
If I have inconsistent replies from each authoritative name server, where each replies with the virtual IP of a cluster colocated with it, I will have reasonably optimised the client's-nameserver-to-web-farm RTT.
Whether that is good for the client remains to be seen, but it seems to be all that (most) commercial CDNs do.

That just makes it too easy....

Am I missing something else, or is it really that simple to replicate a simple CDN?
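
A toy sketch of the smoothed-RTT selection this plan leans on (BIND's real algorithm is more involved, and the latencies below are invented): after a short warm-up, nearly all queries drift to the fastest-responding authoritative server, and hence to its colocated farm.

```python
import random

# Toy sketch of smoothed-RTT nameserver selection. BIND's real algorithm is
# more involved (and periodically re-probes the others); djbdns just picks
# at random. The per-server latencies below are invented for illustration.

TRUE_RTT_MS = {          # what each authoritative server "really" costs this resolver
    "dc-east-auth":  12.0,
    "dc-west-auth":  65.0,
    "dc-eu-auth":   140.0,
}

srtt = {ns: 1.0 for ns in TRUE_RTT_MS}  # optimistic start so every server gets probed once
ALPHA = 0.3                             # smoothing factor

def pick_server():
    return min(srtt, key=srtt.get)      # prefer the lowest smoothed RTT

def query_once():
    ns = pick_server()
    sample = random.gauss(TRUE_RTT_MS[ns], 5.0)         # one measured query time
    srtt[ns] = (1 - ALPHA) * srtt[ns] + ALPHA * sample  # exponential smoothing
    return ns

counts = {ns: 0 for ns in TRUE_RTT_MS}
for _ in range(1000):
    counts[query_once()] += 1

print(counts)  # after warm-up, almost all queries go to the nearest auth server
```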

So there is no need to anycast the DNS servers and rely on BGP topology for selection.
Instead, use BIND's behaviour so that each resolving nameserver will query the authoritative nameserver that responds the fastest.

However, note that only BIND does this. djbdns always selects
nameservers randomly and the Windows selection algorithm is somewhat
of a mystery. See http://www.nanog.org/mtg-0310/wessels.html

Duane W.

For anycast within an organisation, it will be as determined by the IGP, not BGP.

regards,