Slashdot: Providers Ignoring DNS TTL? (fwd)

BTW, while it looks like you've shown it to be traditional load balancing,
I ought to explain that this is also not a very good idea. The load
balancer is usually a single point of failure. Load balancers are a good
idea for stateful, high-work-per-request servers such as web servers
running web apps. They let you apply many servers to what then appears
to be a single service. That is well worth the cost of a single point of
failure, since what you really wanted was a single, bigger, and far more
expensive server. The multiple small cheap servers behind the load
balancer give you that view. When the load balancer detects a failure
and drops the failed server, it isn't really offering "higher
availability" than a single server; it is compensating for the fact
that multiple small cheap servers will collectively crash more
frequently.
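
To put a rough number on "collectively crash more frequently", here is a
small sketch. It assumes each server fails independently with the same
probability over some period; the 1% figure and the function name are
made up for illustration, not measurements.

```python
def p_any_failure(p_single: float, n_servers: int) -> float:
    """Chance that at least one of n independent servers fails in the period."""
    return 1 - (1 - p_single) ** n_servers

# Assumed figure: each small server has a 1% chance of failing in the period.
one_server = p_any_failure(0.01, 1)
ten_servers = p_any_failure(0.01, 10)

print(f"one server:  {one_server:.4f}")
print(f"ten servers: {ten_servers:.4f}")
```

With more servers, some failure somewhere in the pool becomes much more
likely, which is the gap the load balancer's health checks are covering.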

However, DNS service is comparatively low-work-per-request and
low-latency. Generally, people are seeking high availability and load
distribution for DNS caches. But the work a traditional load balancer
does per query is probably comparable to the work the DNS server does to
answer it. So the benefits of multiple DNS servers behind a single load
balancer are probably negligible.

    --Dean

This can also be done with stateless hash-based load balancing, which produces exactly the results discussed below (single TCP sessions remain on the same server, while repeated UDP queries go to different servers). A single address is advertised by the DNS servers via OSPF. Each POP has multiple servers advertising that address from the same location. Per-session load balancing distributes the requests evenly across the several servers without the complexity of a stateful load balancer. The hashing ensures that TCP sessions remain on the same server - as long as there isn't a change in the state of any of the servers in that pool.
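
A minimal sketch of the stateless hashing idea, for illustration only. Real routers do this in the forwarding path with their own hash functions; the SHA-256 hash, the server names, and the addresses here are all assumptions, not what any particular vendor does.

```python
import hashlib

def pick_server(flow, servers):
    """Map a flow 5-tuple to one server with a stateless hash (no session table)."""
    key = "|".join(str(field) for field in flow).encode()
    digest = hashlib.sha256(key).digest()
    return servers[int.from_bytes(digest[:8], "big") % len(servers)]

# Hypothetical servers in one POP, all advertising the same service address.
servers = ["dns-a", "dns-b", "dns-c"]

# A TCP session keeps the same 5-tuple for its lifetime, so every packet
# of that session hashes to the same server.
tcp_flow = ("tcp", "198.51.100.7", 40000, "192.0.2.53", 53)
print(pick_server(tcp_flow, servers))

# Repeated UDP queries usually come from fresh source ports, so the
# 5-tuple changes and the queries spread across the pool.
udp_servers = {
    pick_server(("udp", "198.51.100.7", port, "192.0.2.53", 53), servers)
    for port in range(40000, 40100)
}
print(sorted(udp_servers))
```

The point is that the mapping is a pure function of the packet header and the current pool, so nothing has to be remembered per session.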

In theory, a state change of any one of the servers could cause TCP sessions in that pool to reset. In practice, this hasn't been a problem and the benefits from the DDoS robustness of this solution have been substantial. I'll take the hit of very rare TCP DNS resets if it means I avoid outages of the DNS service during DDoS attacks.
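
Why a state change can reset sessions: with a plain hash-modulo mapping, shrinking the pool changes the mapping even for flows that were on servers that are still up. The sketch below assumes a CRC32-modulo scheme and made-up names; routers with consistent or resilient hashing would move fewer flows.

```python
import zlib

def pick_server(flow_key, servers):
    """Plain hash-modulo selection; the result depends on the pool size."""
    return servers[zlib.crc32(flow_key.encode()) % len(servers)]

pool = ["dns-a", "dns-b", "dns-c", "dns-d"]          # hypothetical pool
after_failure = [s for s in pool if s != "dns-d"]    # dns-d withdraws its route

# 1000 hypothetical TCP flows that differ only in source port.
flows = [f"tcp|198.51.100.7|{port}|192.0.2.53|53" for port in range(40000, 41000)]

# Flows that were on a surviving server but land on a different one afterwards.
moved = [
    f for f in flows
    if pick_server(f, pool) != "dns-d"
    and pick_server(f, pool) != pick_server(f, after_failure)
]
print(f"{len(moved)} of {len(flows)} flows moved off a server that never failed")
```

Any TCP session in the moved set would arrive mid-stream at a server that has never seen it, and get reset. DNS TCP sessions are short, which fits with this being rare in practice.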