We seem to be having some problems with our Tata links: first seen in the EU
about 45 minutes ago, and now we're seeing problems in NA. I'm focused on DNS,
so I'm seeing a lot of timeouts and SERVFAILs, but our networking folks are
talking about links dropping.
Anyone else seeing oddness on the NA Internet right now?
Take a look at the size of the updates.20111107.1400.bz2 file and the 1415 file. Both are abnormally large compared to a normal period. That means a lot of updates were being processed at the time, which is a rough indicator of the level of instability.
If you are not feeding Route Views or similar community projects, please consider doing so. It helps paint the view for those doing analysis.
I'm struggling to do the same. All the various "Internet Health" sites show(ed) some upticks in negative performance, but I don't have any specifics. We are a Gomez customer, and Gomez is showing issues in St. Louis (SAVVIS) and Philly (L3) that specifically impact the availability of our applications, but the underlying reason isn't clear. I'm giving cautious updates to management because, even though it's obvious something is going on, I don't have anything official except random email threads. Looking for more insight before misinforming management.
I think Jared's suggestion was about as close as you're going to get for
right now. Look at the size of the files he mentioned as compared to the
average size of the others.
Hopefully someone will come forth with an authoritative answer later
today.
Richard Golodner
One can do some analysis of the files to determine what prefixes and autonomous
system neighbors were impacted.
I can do some of this as I have some other tools that quickly process this data
if people are interested. Please send those replies/votes off list to me directly.
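For anyone who wants to try this themselves, here is a minimal sketch of that kind of tally in Python. It assumes the update file has already been decompressed and converted to one-line-per-update text in a pipe-delimited layout (type|timestamp|A or W|peer IP|peer AS|prefix|...), roughly what MRT dump tools can emit; the exact field order is an assumption, so check it against your own tool's output. The function name, sample lines, addresses, and AS numbers are all made up for illustration (documentation ranges only).

```python
from collections import Counter

def summarize_updates(lines):
    """Count announcements/withdrawals per peer AS and per prefix.

    Assumes pipe-delimited lines where field 2 is 'A' or 'W',
    field 4 is the peer AS and field 5 is the prefix (an assumption).
    """
    peers = Counter()
    prefixes = Counter()
    for line in lines:
        fields = line.strip().split("|")
        # Skip anything that is not an announce/withdraw record.
        if len(fields) < 6 or fields[2] not in ("A", "W"):
            continue
        peers[fields[4]] += 1
        prefixes[fields[5]] += 1
    return peers, prefixes

# Hypothetical sample records, not taken from any real update file.
sample = [
    "BGP4MP|1320674880|A|192.0.2.1|64500|198.51.100.0/24|64500 64501|IGP",
    "BGP4MP|1320674881|W|192.0.2.1|64500|198.51.100.0/24",
    "BGP4MP|1320674882|A|192.0.2.2|64502|203.0.113.0/24|64502 64501|IGP",
]
peers, prefixes = summarize_updates(sample)
print(peers.most_common(5))     # noisiest peer ASes first
print(prefixes.most_common(5))  # most frequently updated prefixes first
```

Sorting by count puts the noisiest neighbors and flappiest prefixes at the top, which is usually where you start reading.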
So the file size being 30% higher implies that the number of updates is larger, and therefore that there is instability? I see the logic, but if you scroll through that page (the whole month of November) there are tons of >1M files. Trying to see what is different about today...
This is an easy benchmark to gauge overall stability. Large files mean something was unstable. Then you need to actually look at them to see *why*. Also, since the files are compressed, the size alone hides some of what is really in them.
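As a rough illustration of the size benchmark, the sketch below flags files that are well above the median size of their neighbors. The 1.3x factor is only a guess at a threshold, the function name is hypothetical, and the sizes are invented for the example; they are NOT real measurements from the archive. Only the 1400 and 1415 file names come from this thread, the others just follow the same naming pattern.

```python
from statistics import median

def flag_large_files(sizes_by_name, factor=1.3):
    """Return names of files whose size exceeds factor x the median size."""
    baseline = median(sizes_by_name.values())
    return sorted(
        name for name, size in sizes_by_name.items()
        if size > factor * baseline
    )

# Illustrative sizes in bytes (made up for this example).
sizes = {
    "updates.20111107.1330.bz2": 600_000,
    "updates.20111107.1345.bz2": 650_000,
    "updates.20111107.1400.bz2": 1_400_000,
    "updates.20111107.1415.bz2": 1_300_000,
    "updates.20111107.1430.bz2": 700_000,
}
flagged = flag_large_files(sizes)
print(flagged)
```

A flagged file only tells you where to look. You still have to open it to see which prefixes and neighbors were actually churning.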
Thank you. This is somewhat of a learning opportunity for me. I hit all the generic Internet health sites and I understand that there IS an issue. Now I'm getting to learn how you guys attempt to understand WHY we had an issue.
But my point is the same. If this is the case, then the entire month of November reflects "instability", since I see transitions from 600k to 1M between update files. Yet we didn't experience the same negative customer experience for those. So how do you see the difference with today's events? Digging into the files now.
Can anyone point to any authoritative updates about this?
According to my Peakflow, the Level 3 update spike ran from ~1408 UTC to
~1424 UTC.
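If you want to cross-check a window like that against the public update files, one approach is to bucket update timestamps by minute and look for where the counts jump. A minimal Python sketch follows, assuming you have already extracted Unix timestamps from the update records; the function name and the sample timestamps are made up for illustration.

```python
from collections import Counter
from datetime import datetime, timezone

def updates_per_minute(timestamps):
    """Count updates per UTC minute, keyed as 'HH:MM'."""
    counts = Counter()
    for ts in timestamps:
        minute = datetime.fromtimestamp(ts, tz=timezone.utc).strftime("%H:%M")
        counts[minute] += 1
    return counts

# Hypothetical Unix timestamps, not taken from a real update file.
sample_ts = [1320674880, 1320674881, 1320674945]
per_minute = updates_per_minute(sample_ts)
for minute, count in sorted(per_minute.items()):
    print(minute, count)
```

Minutes with counts far above their neighbors mark the start and end of the spike as seen from that collector.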