Memory leak cause of Comcast DNS problems

Perhaps your DNS software also has a memory leak? Anyone know which
software Comcast was using? Should other ISPs be concerned they might
have the same latent problem in their systems?

http://www.twincities.com/mld/twincities/business/11407913.htm

Comcast said the intermittent but often-severe nighttime outages -- which
also occurred April 7, Tuesday and Wednesday across the country and in the
St. Paul area -- are related to software malfunctions on company computers
responsible for receiving, interpreting and routing subscribers' Web-site
requests. E-mail also was affected in some cases.

A company spokeswoman wouldn't elaborate on the nature of the software
problems, identifying them only as a "memory leak." But she said steps
meant to end them roughly coincided with Thursday's erratic outage, which
may have been less severe than the earlier ones, and added the fixes will
likely avert future service blackouts of this kind.

* Sean Donelan:

> Perhaps your DNS software also has a memory leak? Anyone know which
> software Comcast was using? Should other ISPs be concerned they might
> have the same latent problem in their systems?

Probably yes, especially if they don't read documentation of their DNS
software.

The maximum amount of memory to use for the server's cache, in
bytes. [...] The default is unlimited, meaning that records are
purged from the cache only when their TTLs expire.

The number of complaints I've heard that "DNS resolvers eat *so* much
memory" suggests that few people tweak the default configuration. 8-(

However, it's unlikely that this was the cause of Comcast's problems
because DNS cache overflows would have an impact on a much larger
scale.

That was my first guess too.

Most DNS servers don't even have this switch.

"ps v -C <server-process-name>" will tell you how badly you're hurting.

Anybody that does a bunch of lookups -- whether this is forward lookups
for customers or blacklist lookups on mail or whatever -- is probably
using more than they think. I don't know of many directly related crash
scenarios but there are other penalties like shallow caching which aren't
entirely trivial.

The churning strategy employed is the question that doesn't get answered.
Some servers do FIFO, some do random discards, some use swap space... This
whole area is treated like the embarrassing aunt in the cellar.
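
To make the churning question concrete, here is a minimal sketch of a
size-capped cache that purges on TTL expiry and, when full, discards
either the oldest entry (FIFO) or a random one. It is not how any
particular server does it: the class name BoundedDnsCache, its
parameters, and the choice to count entries rather than bytes are all
assumptions made for illustration.

```python
import random
import time
from collections import OrderedDict


class BoundedDnsCache:
    """Toy resolver cache (hypothetical): TTL expiry plus a cap on entry count."""

    def __init__(self, max_entries, policy="fifo", clock=time.monotonic):
        if max_entries < 1:
            raise ValueError("max_entries must be at least 1")
        if policy not in ("fifo", "random"):
            raise ValueError("policy must be 'fifo' or 'random'")
        self.max_entries = max_entries
        self.policy = policy
        self.clock = clock
        # name -> (expires_at, value), kept in insertion order
        self._entries = OrderedDict()

    def _purge_expired(self):
        # Drop every record whose TTL has run out.
        now = self.clock()
        expired = [n for n, (exp, _) in self._entries.items() if exp <= now]
        for name in expired:
            del self._entries[name]

    def _evict_one(self):
        # Churn one live record to make room for a new one.
        if self.policy == "fifo":
            self._entries.popitem(last=False)  # oldest insertion goes first
        else:
            victim = random.choice(list(self._entries))
            del self._entries[victim]

    def put(self, name, value, ttl):
        self._purge_expired()
        if name in self._entries:
            del self._entries[name]  # re-insert so it counts as newest
        while len(self._entries) >= self.max_entries:
            self._evict_one()
        self._entries[name] = (self.clock() + ttl, value)

    def get(self, name):
        entry = self._entries.get(name)
        if entry is None:
            return None
        expires_at, value = entry
        if expires_at <= self.clock():
            del self._entries[name]
            return None
        return value

    def __len__(self):
        return len(self._entries)
```

With max_entries=2 and the FIFO policy, inserting a third name pushes
out the first one even though its TTL has not expired, which is the
"shallow caching" penalty in miniature.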

Hi,

> The maximum amount of memory to use for the server's cache, in
> bytes. [...] The default is unlimited, meaning that records are
> purged from the cache only when their TTLs expire.

> That was my first guess too.

> Most DNS servers don't even have this switch.

Actually, I suspect most servers now do, at least in the context of
Internet service provision. I believe BINDv9 + dnscache + CNS (I don't
know about maradns, powerdns, or posadis, but I believe their relative
share isn't significant) outnumber BINDv4 and BINDv8. I don't know
whether Microsoft DNS allows you to limit memory consumption, but I don't
think it is used in an ISP context that frequently (although I might be
wrong).

Rgds,
-drc