I thought I saw an article on routergod.com from Dance Patrick regarding anycast DNS......
~oliver
No DDoS or Anonymous attack appears to have been involved.
Now it's CNN
/Jason
And this is bad why?
"many of our customers experienced intermittent service outages"
Must be that new definition of the word "intermittent." The one
roughly synonymous with "total."
-Bill
Yeah. Doubleplusungood.
As someone else nicely pointed out "network problems starting when the anon post said they would, and ending when they said they would stop.... ironic?"
What time did the anon first say there would be GoDaddy problems? Earliest
timestamp I can find is 10:45am (pacific) and the problems had started much
earlier, around 10:15. Curious if you found evidence of any anons claiming
an attack or responsibility for the attack within, say, 5 minutes of it
starting.
Also, the time it stopped wasn't exactly tied to anything the anon said,
other than his vague statements like "it can last one hour or one month"
and "soon u guys can acess". And he said that latter statement at 1:59pm
while the outage ended at 3:45pm.
Summary: 30 minutes late on the start time, and off by well over an hour on
the stop time.
Damian
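For anyone who wants to check the arithmetic in that summary, here is a minimal sketch. It only uses the four Pacific times quoted above (converted to 24-hour form); the helper name `minutes_between` is made up for illustration.

```python
from datetime import datetime

def minutes_between(earlier: str, later: str) -> int:
    """Whole minutes from `earlier` to `later`, both given as 24-hour HH:MM on the same day."""
    fmt = "%H:%M"
    delta = datetime.strptime(later, fmt) - datetime.strptime(earlier, fmt)
    return int(delta.total_seconds() // 60)

# Times as reported in this thread (Pacific).
start_gap = minutes_between("10:15", "10:45")  # problems began -> earliest anon post found
stop_gap = minutes_between("13:59", "15:45")   # "soon u guys can acess" -> outage ended

print(f"start gap: {start_gap} min, stop gap: {stop_gap} min")
```

That gives 30 minutes on the start and 106 minutes on the stop, which matches "30 minutes late" and "well over an hour".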
even a broken clock is right 2x/day?
nostradamus was eventually right a few times?
'If you're cold, shoot until you get hot, then keep shooting!' - dick vitale
folk like to look for the most complicated/spooky/crazy reason... most
often it's just a simple reason for failure
so far godaddy seems to agree with the 'it was a simple mistake on our
part' (paraphrased, they probably won't say 'simple')
No large flows reported to the affected NSes, tweets were suspicious at best, other anon-ops denied the attack was them, and GoDaddy admitted internal error.
I'm going to take GoDaddy at their word, and give them major kudos for owning up to the mistake - in public.
That doesn't mean that their description of the internal error fits
what happened. Not to say that there was an attack, just that there
may be more internal failures, including process failures, still to be
accounted for. Whether they will publish a root-cause analysis/Swiss cheese
model/<insert your preferred methodology> or not is up to them, but I
think they still owe tech-savvy stakeholders an explanation.
Rubens
The blog says 99.999% uptime, but I'm guessing this "outage" lasted
more than 5.49300000002 minutes and they probably had other issues
during the year.
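A rough sketch of the five-nines arithmetic, assuming a 365-day year (the function name and the year length are my assumptions, not anything from the GoDaddy blog). The outage length is taken from the times quoted earlier in this thread, roughly 10:15am to 3:45pm Pacific.

```python
def downtime_budget_minutes(availability_pct: float, days: float = 365.0) -> float:
    """Minutes of downtime allowed per `days` at the given availability percentage."""
    total_minutes = days * 24 * 60
    return total_minutes * (1 - availability_pct / 100)

budget = downtime_budget_minutes(99.999)
print(f"five-nines budget: {budget:.3f} minutes/year")

# Outage length per the times reported in this thread: 10:15 to 15:45 Pacific.
outage_minutes = 5 * 60 + 30
implied_availability = 100 * (1 - outage_minutes / (365 * 24 * 60))
print(f"availability if this were the only outage: {implied_availability:.3f}%")
```

Under that assumption the budget comes out near 5.26 minutes a year, a bit below the figure quoted above, and a single 330-minute outage already exceeds it by a wide margin.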
when patrick is referring to "taking their word for it", he's referring to
a post on outages@ by godaddy's network engineering manager that stated
"bgp, and more details to follow".
i tend to align with patrick's thought. i'm also interested to see the
details, which they are really under no obligation to provide.
Anytime I've seen a real RFO, it takes more than 24 hours to collect data. Sometimes you actually don't know what happened. There's a reason for this comic: http://www.dilbert.com/strips/comic/1999-08-04/ (the reboot cleared the problem).
I've seen many odd behaviors of devices that nobody could explain, including the vendors. Sometimes it takes a few years to understand what happened. I recall a case where 2-3 years after a major outage someone made some minor comment about their architecture and a light came on.
I welcome more information about mistakes/errors that we can all learn from. Sharing that information can be hard or uncomfortable at times, but can help others learn and not make the same mistakes again. I took the recommendation of others and have started to read "Normal Accidents". amazon link: http://tinyurl.com/9dc6x98
The whole multiple-failures problem really makes me concerned about cascading system failures when things go wrong.
- Jared
Well, mostly I'm taking GoDaddy at their word that this was not a DoS attack.
I also believe it was related to BGP, and am happy to get more info. But we are discussing Anonymous vs. Self-inflicted wound here.
when patrick is referring to "taking their word for it", he's referring to a
post on outages@ by godaddy's network engineering manager that stated "bgp,
and more details to follow".
"more" is the operating word here.
i tend to align with patrick's thought. i'm also interested to see the
details, which they are really under no obligation to provide.
They could have said unspecified/yet-to-be-investigated internal
technical failure(s). But when they quoted a specific failure, their
communication (the PR, not the outages@ report) was trying to benefit
from being specific, which is known to gain more trust. That comes
with a responsibility to be precise and opens it to scrutiny by the
technical community.
Rubens
Well, mostly I'm taking GoDaddy at their word that this was not a DoS attack.
I also believe it was related to BGP, and am happy to get more info. But we are discussing Anonymous vs. Self-inflicted wound here.
I'm skeptical. BGPlay (http://bgplay.routeviews.org/) doesn't show any withdrawn routes for any of their prefixes over Sep 9-11. In fact, their BGP announcements look fairly normal during that time, from what I can gather.
So, it would be nice to get more info.
- Naveen
a bgp error doesn't HAVE to mean that they withdrew (or even
re-announced!) anything to the outside world, does it?
for instance:

border-router -> internet
  - redistribute your aggregate networks from statics to Null0 on the
    border-router
  - accept full routes so you can send them to the other borders and
    make good decisions at the external edge

border-router -> internal
  - send default or some version of default via a filter to internal
    datacenter routers/aggregation/distribution devices.
  - accept from them (maybe) local subnets that are part of your aggregates

now, accidentally remove the filter content for the sessions between the
border and internal ... oops, your internal devices bounce with
'corrupted tables' (blown tables)... you still send your aggs steadily
to the interwebs, wee!
-chris
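A toy model of the scenario above, purely as a sketch: it is not real BGP, and nothing here is known about GoDaddy's actual setup. The class and function names, the table capacity, the route counts, and the documentation prefixes are all hypothetical. It only shows that the externally announced aggregates can stay put while an internal box falls over once the filter is gone.

```python
from dataclasses import dataclass, field

# Hypothetical aggregates (RFC 5737 documentation prefixes), originated from statics to Null0.
AGGREGATES = ["192.0.2.0/24", "198.51.100.0/24"]
# Placeholder stand-in for a full Internet table learned at the border.
FULL_TABLE = [f"route-{i}" for i in range(400_000)]

@dataclass
class InternalRouter:
    capacity: int                      # hypothetical maximum routes the device can hold
    table: set = field(default_factory=set)
    healthy: bool = True

    def receive(self, routes):
        for route in routes:
            if len(self.table) >= self.capacity:
                self.healthy = False   # table blown: device bounces
                self.table.clear()
                return
            self.table.add(route)

def export_to_internal(filter_enabled: bool):
    """With the filter, internal devices see only a default; without it, everything."""
    return ["0.0.0.0/0"] if filter_enabled else list(FULL_TABLE)

def simulate(filter_enabled: bool):
    internal = InternalRouter(capacity=10_000)
    internal.receive(export_to_internal(filter_enabled))
    announced = list(AGGREGATES)       # aggregates go out regardless of internal state
    return announced, internal.healthy

for enabled in (True, False):
    announced, healthy = simulate(enabled)
    print(f"filter={enabled}: announced={announced}, internal healthy={healthy}")
```

In both runs the announced list is identical, so an outside view of the announcements would see nothing withdrawn even when the internal router has failed.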
+1
Announcing a prefix doesn't mean that the traffic to those IPs found
within shall ever arrive.