China ISPs DNS problems on Jan 22nd - any idea what happened?

This past Tuesday the 22nd I was witness to a widespread DNS poisoning
problem in China, whereby a lot of DNS queries were all returning the same
IP address, 65.49.2.178. Our websites became unavailable for most of our
customers in China, as did many other websites.
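
For anyone wanting to spot this symptom from monitoring data, here is a minimal sketch, not anything we actually ran. It assumes you already collect resolver answers for a set of hostnames that are known to be hosted separately; the function name `suspicious_ips`, the threshold `min_names`, and the example hostnames are all made up for illustration. Names that legitimately share a CDN or hosting provider would produce false positives.

```python
from collections import defaultdict


def suspicious_ips(resolved, min_names=3):
    """Flag IPs returned for many unrelated hostnames.

    resolved:  dict mapping hostname -> list of IP strings that a resolver
               returned for it (collected elsewhere; hypothetical input).
    min_names: assumed threshold; how many distinct hostnames must share
               one IP before we call it suspicious.
    """
    by_ip = defaultdict(set)
    for hostname, ips in resolved.items():
        for ip in ips:
            by_ip[ip].add(hostname)
    # Keep only IPs shared by at least min_names distinct hostnames.
    return {
        ip: sorted(names)
        for ip, names in by_ip.items()
        if len(names) >= min_names
    }


# Illustrative data only: three unrelated names all answering with the
# address seen during the incident, plus one unaffected name.
sample = {
    "example.com": ["65.49.2.178"],
    "example.net": ["65.49.2.178"],
    "example.org": ["65.49.2.178"],
    "example.edu": ["192.0.2.10"],
}
print(suspicious_ips(sample))
```

This only detects the problem; it does nothing to prevent it.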

The major difficulty that we had with one of our websites is that we use
Akamai's CDN, and Akamai's servers within China were unable to reach our
origin server in London, so it initially looked as if Akamai was having
an issue, until unofficial news surfaced that there was a larger problem
going on.

I have two questions for anyone:
1) I've found quite a bit of unofficial news [1] [2] on what happened, but
does anyone know what *actually* happened? The only official news from the
government that I can find says, "It was probably a cyberattack, but
really, we don't know." [3]
2) As a website & network operator who strives to keep their product always
available, is there anything I can actually do to prevent this in the
future?

The most fun question, and one I think everyone would love to have
answered: does anyone from Hurricane Electric have a throughput graph
showing the traffic surge caused by this? It must be epic.

Cheers,
Patrick

1: http://gizmodo.com/most-of-chinas-web-traffic-wound-up-at-a-tiny-wyoming-1506486072
2: http://chinadigitaltimes.net/2014/01/massive-internet-failure-caused-great-firewall/
3: http://news.xinhuanet.com/english/china/2014-01/23/c_133067744.htm

Patrick van Staveren <pvanstaveren@mintel.com> writes:

> This past Tuesday the 22nd I was witness to a widespread DNS poisoning
> problem in China, whereby a lot of DNS queries were all returning the same
> IP address, 65.49.2.178. Our websites became unavailable for most of our
> customers in China, as did many other websites.
>
> ...
>
> I have two questions for anyone:
> 1) I've found quite a bit of unofficial news [1] [2] on what happened, but
> does anyone know what *actually* happened? The only official news from the
> government that I can find says, "It was probably a cyberattack, but
> really, we don't know." [3]
> 2) As a website & network operator who strives to keep their product always
> available, is there anything I can actually do to prevent this in the
> future?

I believe the protocol feature specifically designed to prevent this
kind of thing is DNSSEC.

However, it seems like the common explanation now is an operator error
while administering the Great Firewall. I don't think there's
anything technical you can do about that.

Patrick van Staveren <pvanstaveren@mintel.com> writes:

>> This past Tuesday the 22nd I was witness to a widespread DNS poisoning
>> problem in China, whereby a lot of DNS queries were all returning the same
>> IP address, 65.49.2.178. Our websites became unavailable for most of our
>> customers in China, as did many other websites.
>>
>> ...
>>
>> I have two questions for anyone:
>> 1) I've found quite a bit of unofficial news [1] [2] on what happened, but
>> does anyone know what *actually* happened? The only official news from the
>> government that I can find says, "It was probably a cyberattack, but
>> really, we don't know." [3]
>> 2) As a website & network operator who strives to keep their product always
>> available, is there anything I can actually do to prevent this in the
>> future?
>
> I believe the protocol feature specifically designed to prevent this
> kind of thing is DNSSEC.

DNSSEC would not have helped.

Without DNSSEC: The cache asks for the IP address of the origin based on the hostname. It is given the incorrect IP address, does an HTTP GET to that address, and gets an error. The cache does not serve the content to the user.

With DNSSEC: The cache asks for the IP address of the origin based on the hostname. It is given the incorrect IP address, knows it is incorrect, and never tries to fetch the origin content. The cache does not serve the content to the user.

Either way the user gets nothing, so I'm not sure how DNSSEC solves the problem.

Now, if someone were intentionally trying to impersonate the origin to, for instance, inject malicious content, then DNSSEC would help. But in this case, DNSSEC is not useful other than keeping some poor soul from being DDoS'ed.
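
To make the two cases concrete, here is a toy model of the cache's decision. It is only a sketch of the argument above, not how Akamai or any real resolver works; `DnsAnswer`, `cache_fetch`, the `signature_valid` flag, and the origin address 192.0.2.10 (a documentation address) are all invented for illustration.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class DnsAnswer:
    ip: str
    signature_valid: bool  # hypothetical: would DNSSEC validation pass?


def cache_fetch(answer: DnsAnswer, real_origin_ip: str,
                validate_dnssec: bool) -> Tuple[bool, Optional[str]]:
    """Return (content_served, ip_contacted) for one origin fetch."""
    if validate_dnssec and not answer.signature_valid:
        # Validating cache: the forged answer is rejected, nothing is contacted.
        return (False, None)
    # Otherwise the cache trusts the answer and sends the HTTP GET there.
    contacted = answer.ip
    # Content is only served if the cache actually reached the real origin.
    return (contacted == real_origin_ip, contacted)


ORIGIN = "192.0.2.10"  # made-up origin address
poisoned = DnsAnswer(ip="65.49.2.178", signature_valid=False)

print(cache_fetch(poisoned, ORIGIN, validate_dnssec=False))
print(cache_fetch(poisoned, ORIGIN, validate_dnssec=True))
```

In both calls the first element is False, i.e. no content is served. The only difference is whether the bogus address receives the request.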

> However, it seems like the common explanation now is an operator error
> while administering the Great Firewall. I don't think there's
> anything technical you can do about that.

Not serve traffic from behind the GFW?

Performance will be worse, but that has to be balanced against a sovereign nation unintentionally (or sometimes intentionally) modifying your traffic.