Can anybody recommend a CDN to work alongside GGC and Akamai? And if you have
a real-life deployment, do you have any figures on how much Internet BW the
recommended CDN saves? (Currently GGC and Akamai together save about 40% of
our Internet BW.)
Frankly, those three are roughly the same size, and the only ones anywhere near that size.
I would think that talking to Netflix about hosting one of their
boxes would be the obvious next step?
isn't that going to wholly depend on your traffic mix/matrix?
Wouldn't it be helpful to look at where your users send/receive
traffic and then figure out the best next addition?
Maybe your best bet isn't another CDN, but better/more/wider peering
with folk 2+ AS hops out from your current next-hop-as set?
I would say that step 1 is to figure out where your traffic is going. Generically saying “CDN” isn’t enough to know what the results are.
Once you’ve determined where the traffic is going/coming from you can start to make educated decisions vs just “CDN” guessing. An enterprise profile looks much different than residential for example.
I recall some companies calling our NOC “under attack” because their software update server went down and the machines failed safe and were all fetching software updates from “the internet” vs the internal caching proxy.
If you have money to spend, there are a few vendors out there from cheap to $$$$ that will help you look at the traffic to make these decisions.
If you don’t have money to spend, look at NFSen/pmacct. You may be able to spin up a low-cost VM at your local cloud provider (e.g., DigitalOcean).
Remember to export both your v6 and v4 (ip classic) flows, as these can differ widely.
Look for common ASNs or IP ranges.
I’m sure there are numerous consultants on the list who would also assist you in this process.
Hope this helps.
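As a rough illustration of the "look for common ASNs or IP ranges" step, here is a minimal Python sketch. It assumes you have already dumped your flows to CSV with `src_ip` and `bytes` columns; that layout, the `top_sources` name and the /24 and /48 grouping are all made up for the example, not something NFSen or pmacct emit by default. It keeps v4 and v6 separate, since the two can look very different.

```python
import csv
import io
import ipaddress
from collections import defaultdict


def top_sources(flow_csv, prefix_len_v4=24, prefix_len_v6=48):
    """Sum bytes per (IP version, source prefix) from CSV rows of src_ip,bytes.

    The column names and prefix lengths are assumptions for this sketch.
    """
    totals = defaultdict(int)
    for row in csv.DictReader(flow_csv):
        addr = ipaddress.ip_address(row["src_ip"])
        plen = prefix_len_v4 if addr.version == 4 else prefix_len_v6
        # strict=False masks off the host bits so we get the covering prefix
        net = ipaddress.ip_network(f"{addr}/{plen}", strict=False)
        totals[(addr.version, str(net))] += int(row["bytes"])
    # Heaviest prefixes first
    return sorted(totals.items(), key=lambda kv: kv[1], reverse=True)


# Documentation-range addresses only; not real traffic.
SAMPLE = """src_ip,bytes
192.0.2.10,5000
192.0.2.77,7000
198.51.100.5,3000
2001:db8:1::1,9000
2001:db8:1:ffff::2,1000
"""

if __name__ == "__main__":
    for (version, prefix), nbytes in top_sources(io.StringIO(SAMPLE)):
        print(f"IPv{version} {prefix:<20} {nbytes}")
```

Once the heavy prefixes stand out, mapping them to origin ASNs (from your BGP table or a whois lookup) is the next step.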
Simple flows wouldn't necessarily tell you if you're pulling a bunch from a Netflix caching box on your upstream somewhere. You'd think you had a huge amount going to your current upstream because technically you do, but a local cache or peer could alter that significantly. As we've been starting up our IX, we're finding that we can send lists of ASNs and prefixes and the various CDNs will tell us how much traffic they see going to our customers. Combine that with what flows tell you and I think you've got a good approach.
What are some good approaches to determining traffic levels to not only ASNs, but also those ASNs' downstream ASNs? You may have ASNs A, B, C, D and E in your flows. Say none of them represents more than 5% of your traffic by itself. If B, C, D and E all purchase transit from A and you can reasonably peer with A, you can actually move up to 25% of your traffic (five ASNs at up to 5% each) over to a peer. Maybe there is no good approach to doing that without a bunch of manual work or paying someone else to do it.
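One way to approximate the downstream question, sketched in Python: if your collector can annotate each flow with the BGP AS path of the matching route (an assumption, as is the `bytes_via_asn` name and the input shape), then crediting a flow's bytes to every ASN on its path gives a per-ASN "traffic reachable through this AS" figure. It only reflects the paths your routers currently select, and inbound traffic may actually arrive over a different path, so treat the numbers as a hint rather than a measurement.

```python
from collections import defaultdict


def bytes_via_asn(flows):
    """flows: iterable of (as_path, nbytes), as_path being a list of ASNs.

    Returns {asn: total bytes of flows whose AS path includes that ASN}.
    """
    totals = defaultdict(int)
    for as_path, nbytes in flows:
        # set() so that AS-path prepending does not count a flow twice
        for asn in set(as_path):
            totals[asn] += nbytes
    return dict(totals)


# Private-use/documentation ASNs standing in for the A..E example:
# 64500 = A, 64501..64504 = B..E, 64499 = our transit, 64510 = everything else.
FLOWS = [
    ([64499, 64500], 5),
    ([64499, 64500, 64501], 5),
    ([64499, 64500, 64502], 5),
    ([64499, 64500, 64503], 5),
    ([64499, 64500, 64500, 64504], 5),  # prepended path
    ([64499, 64510], 75),
]

if __name__ == "__main__":
    totals = bytes_via_asn(FLOWS)
    grand_total = sum(nbytes for _, nbytes in FLOWS)
    for asn, nbytes in sorted(totals.items(), key=lambda kv: -kv[1]):
        print(f"AS{asn}: {100 * nbytes / grand_total:.0f}% of bytes")
```

In the sample data no origin behind A carries more than 5%, yet A's total comes out at 25%, which is the effect described above.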
Looking at some stats from one of our customers that is also going through Equinix Chicago, for their average inbound ~37% of traffic was Netflix, Google was 34% and the next highest was Apple at 5%. Note that Akamai had left Chicago Equinix by this point, so they wouldn't be reflected in those numbers. Those percentages are percent of all traffic they send to Equinix. I believe about two-thirds of their total transit went to Equinix when that got turned up. Their total traffic went up after joining the Equinix IX, presumably because they were now bypassing some congestion somewhere.
probably dns and flow gets you some more traction, right?
meaning: "gosh 126.96.36.199/26 is sending us LOTS of traffic... oh:
nslookup 188.8.131.52 == hosta.networkb.netflix.com, ah-ha!"
where ptr records are generated I suppose like:
$ host 184.108.40.206
220.127.116.11.in-addr.arpa domain name pointer
Also, often just port/protocol are helpful enough... you won't know
without looking (at the OP's traffic I mean), which it sounds like
hasn't really been done yet?
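For what it's worth, the "reverse DNS on the top talkers" idea is easy to script. A minimal Python sketch, assuming you already have a list of (IP, bytes) pairs from your flow tool; the function names and the `resolver` parameter are invented here so the lookup can be swapped out or tested offline. `socket.gethostbyaddr` is the standard-library call doing the PTR lookup.

```python
import socket


def ptr_name(ip, resolver=socket.gethostbyaddr):
    """Return the PTR hostname for ip, or None if the lookup fails."""
    try:
        return resolver(ip)[0]
    except OSError:  # socket.herror and socket.gaierror are both OSError
        return None


def label_top_talkers(talkers, resolver=socket.gethostbyaddr):
    """talkers: iterable of (ip, nbytes). Returns [(ip, nbytes, ptr_or_None)]."""
    return [(ip, nbytes, ptr_name(ip, resolver)) for ip, nbytes in talkers]


if __name__ == "__main__":
    # Documentation addresses; substitute the heavy hitters from your flows.
    for ip, nbytes, name in label_top_talkers([("192.0.2.1", 1000)]):
        print(ip, nbytes, name or "(no PTR)")
```

Plenty of addresses will come back with no PTR or a generic one, so this narrows things down rather than answering the question outright.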
Sure. There are a lot of dynamics to consider. It’s fairly easy to look at TCP speeds and retransmissions to determine the link speed involved. I’ve seen many CDNs quickly identify congested and uncongested paths and engage in some adaptive behaviors.
This being said, there is not a single solution to everything. Chris mentioned using DNS, which is a nice method assuming you see all the queries within your traffic cone.
sorry, I meant that you could just look at the reverse dns for some of
the higher traffic sources/destinations... you can ALSO look at your
recursive dns servers to see what folk are looking up 'often'... which
is a third tool to use. (presuming you see all/most/representative-set
of your customers, yes)
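A sketch of the recursive-resolver angle, in Python: it assumes you can get the queried names out of your resolver's query log as one name per line, which is a made-up input format since every resolver logs differently. Grouping on the last two labels is a crude stand-in for the registrable domain (it gets things like co.uk wrong), and `top_domains` is just a name picked for the example.

```python
from collections import Counter


def top_domains(qnames, n=10):
    """Count queried names by their last two labels and return the top n."""
    counts = Counter()
    for name in qnames:
        labels = name.strip().rstrip(".").lower().split(".")
        if len(labels) >= 2:  # skips blank lines and single-label names
            counts[".".join(labels[-2:])] += 1
    return counts.most_common(n)


if __name__ == "__main__":
    sample = ["a.example.com.", "B.example.com", "www.example.net", ""]
    for domain, hits in top_domains(sample):
        print(f"{hits:6d} {domain}")
```

Query counts are not byte counts, so this is best used to put names on what the flows already show as heavy.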
For hosts with no (or meaningless) reverse DNS, I've found that browsing to the IP in question via HTTPS will often provide an SSL certificate with lots of useful information.
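The certificate check can also be scripted with Python's standard `ssl` module. A minimal sketch with hypothetical helper names (`fetch_cert`, `cert_names`): hostname checking is turned off because we are connecting by IP, but chain validation is left on, so a self-signed certificate raises an error instead of returning details. With no SNI sent, the server hands back whatever its default certificate is.

```python
import socket
import ssl


def cert_names(cert):
    """Pull the subject CN and DNS SANs out of a getpeercert()-style dict."""
    cn = [v for rdn in cert.get("subject", ()) for k, v in rdn if k == "commonName"]
    sans = [v for k, v in cert.get("subjectAltName", ()) if k == "DNS"]
    return {"common_name": cn[0] if cn else None, "dns_names": sans}


def fetch_cert(ip, port=443, timeout=5):
    """Connect to ip:port over TLS and return the peer certificate as a dict."""
    ctx = ssl.create_default_context()
    ctx.check_hostname = False  # connecting by IP, so the name cannot match
    with socket.create_connection((ip, port), timeout=timeout) as sock:
        with ctx.wrap_socket(sock) as tls:
            return tls.getpeercert()


# Usage (needs network access; 192.0.2.1 is a placeholder address):
#   print(cert_names(fetch_cert("192.0.2.1")))
```

The subject and SAN names are usually enough to tell which CDN or service is behind an address.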