Contention/Oversubscription maths

Hi All,

Do any of you have any pointers on how to go about predicting usage for high-speed ethernet access?

I'm running 1GE links into buildings, and hanging many (100-1000) 100M customers off switches in the basement, simple enough.

I'm assuming ~300Kbps average peak usage per customer, but I can't quite work out at what point that number becomes more important than the fact that each user can peak at 100Mbit (and that the backhaul is only access speed * 10).
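Putting rough numbers on that (a back-of-envelope sketch; the 300Kbps per-user figure is my assumption above, everything else is arithmetic):

```python
def headroom(n_users, avg_mbps=0.3, access_mbps=100, backhaul_mbps=1000):
    """Mean aggregate demand, plus how many simultaneous full-rate (100M)
    bursts the 1GE uplink can absorb on top of that mean before saturating."""
    expected = n_users * avg_mbps
    return expected, (backhaul_mbps - expected) / access_mbps

for n in (100, 300, 1000):
    mean, bursts = headroom(n)
    print(f"{n:5d} users: mean {mean:6.1f} Mbps, "
          f"headroom for {bursts:.1f} concurrent 100M bursts")
```

At 1000 users the mean only fills ~30% of the uplink, so the real question becomes how often more than ~7 users burst to line rate at the same instant.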

I'm well versed in the economies of scale of fitting 10,000 8Mbit customers into 1GE, but this seems an altogether different beast to predict. If any of you have similar scenarios I'd be very interested to hear, on- or off-list.

Finally, what do people think of selling a 1G service with 1G backhaul (and potentially 10s or 100s of customers buying this service alongside n*100s of customers with 100M service)?

Thanks,

Depends what weasel words you put in your SLA, I suppose. Can they
claim SLA credits for it being down/unusable if they're only getting 1% of
the throughput they paid for? And are they a captive audience or not?

What do you do on Patch Tuesday?

From the perspective of an enterprise customer, if we're talking

strictly Internet circuits, your over-subscription estimates seem
very conservative to me. On our 100Mb/s Internet circuits, our
average utilization is about 40Mb/s down and 15-20Mb/s up on any given
day.

David.

For that matter, what do you do when the latest 'cool' YouTube video goes viral, or Amazon offers the next Lady GaGa album on sale for $0.99, or people with iDevices download the latest 300MB+ FPS for their devices (there are several of these available now, and they're quite popular)?

This .pdf preso may be relevant:

I'm talking of 1000 users on the end of a 1GE, not 50,000. I don't think either of these scenarios is worrying.

300MB takes <3 seconds on 1GE or 30 seconds on 100M. I don't think those kinds of events will have an appreciable effect on the platform here. An album is what, 100MB? A 5min HD YouTube video is going to be a similar size. Also too small to care about. These kinds of things don't get worse at high speed, they get easier.
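For the record, the transfer-time arithmetic (idealised, ignoring protocol overhead):

```python
def transfer_seconds(size_mb, link_mbps):
    """Idealised transfer time: megabytes * 8 bits/byte over the link rate."""
    return size_mb * 8 / link_mbps

print(transfer_seconds(300, 1000))  # 2.4  -- the '<3 seconds' on 1GE
print(transfer_seconds(300, 100))   # 24.0 -- roughly the '30 seconds' on 100M
```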

Same with Patch Tuesday; I think even a Win7 SP1-like release wouldn't cause major headaches, as streaming SD TV for an hour is more bandwidth than SP1 (never mind HD).

I'm more interested in the levels of traffic that we will see consistently.

I'm talking of 1000 users on the end of a 1GE, not 50,000. I don't think
either of these scenarios is worrying.

300MB takes <3 seconds on 1GE or 30 seconds on 100M.

The point is that it takes a lot longer than 3 seconds if that uplink is already
90% full - it just jumped to 30 seconds. How long does it take for all 1000 users
to do it, assuming the pipe is 95% full already?
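To make that concrete (a toy fluid model; it ignores TCP backoff and assumes the downloads share the spare 5% of the uplink fairly):

```python
def drain_hours(n_users, size_mb, backhaul_mbps, utilisation):
    """Hours to push n_users simultaneous size_mb downloads through the
    spare capacity left on the uplink (fluid model, no TCP dynamics)."""
    spare_mbps = backhaul_mbps * (1 - utilisation)
    return n_users * size_mb * 8 / spare_mbps / 3600

# 1000 users each grabbing 300MB through the 5% of a 1GE that's still free:
print(round(drain_hours(1000, 300, 1000, 0.95), 1))  # ~13.3 hours
```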

A 5min HD YouTube video is going to be a similar size.

What's the sound of 1000 Netflix users streaming 1000 different 90 minute
long HD movies? Why should they limit themselves to 5 minute videos? After
all, you sold them *gigabit*, which we know is 125 times as fast as the 8Mbit
you're likely to get from that cable company (and that's when you're going to
need those weasel words in your SLA, when you can't deliver gigabit).

(You never said if this is a commercial or residential buildout - the traffic patterns,
concerns, and complaints will differ for the two. Good luck in any case)

Finally, what do people think of selling a 1G service with 1G backhaul
(and potentially 10s or 100s of customers buying this service alongside
n*100s of customers with 100M service)?

Depends what weasel words you put in your SLA, I suppose. Can they
claim SLA credits for it being down/unusable if they're only getting 1% of
the throughput they paid for? And are they a captive audience or not?

No SLA, residential customers.

1% is quite unlikely to happen; it would require every customer in a very large deployment to be trying to download >1Mbit/sec at the same time.

Peak average usage for most UK broadband installations seems to be between 20 and 100Kb/sec. I'm working on the basis of 300Kb/sec because our access speeds are much higher.

(this would suggest 300Mbit/sec peak for 1000 users, but my question is whether we'll see that level of aggregation at 1000 * 100Mb on 1GE, or whether it's going to be very 'peaky', and we'll only see that smoothness when we aggregate 10-15 buildings onto 10GE)
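One way I've been thinking about the 'peakiness' question is a toy on/off model: each user is either idle or bursting at full access rate, with the activity probability picked so the mean comes out at ~300Kbps per user. The probability and burst rate below are illustrative assumptions, not measurements:

```python
import random

def peak_mbps(n_users, p_active=0.003, burst_mbps=100, trials=2000, seed=1):
    """Monte Carlo of instantaneous aggregate demand: each user is idle, or
    bursting at full access rate with probability p_active (0.003 * 100Mbit
    gives the ~300Kbps per-user mean assumed above). Returns the median and
    99th-percentile aggregate demand in Mbps."""
    rng = random.Random(seed)
    samples = sorted(
        sum(rng.random() < p_active for _ in range(n_users)) * burst_mbps
        for _ in range(trials)
    )
    return samples[trials // 2], samples[int(trials * 0.99)]

median, p99 = peak_mbps(1000)
print(median, p99)  # mean ~300 Mbps, but the top percentile is far higher
```

With only ~3 users active on average, the instantaneous load jumps around in 100Mbit steps - which is exactly the 'peaky' behaviour I'd expect to smooth out once 10-15 buildings are aggregated onto 10GE.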

What do you do on Patch Tuesday?

Update windows.

I would watch out for the 'abusers' in this case, and have the capability to rate-limit the ports if necessary. Some hardware doesn't deal well with 'small' buckets of rate-limiting, eg: taking a 1G port to 1M.

I'm interested in your operational results, as I have a few general ASSumptions I've offered in this space:

- Most people are going to be limited by their wireless gear (few people care to run wired very far)
- Most servers on the far-end aren't going to be fast enough to cope
- Most people aren't going to spend time debugging network problems
- Some percentage of users are going to run a torrent or something else and flatline their port. You need to be able to police them at a reasonable bandwidth cap, eg: 10M if they have 1G, but that's 1% and many dumber switches won't go under 10%.

The other solution is to just throw more bandwidth at the problem. Make sure your switches can easily take a 10G or n*10G uplink.
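For illustration, the mechanism those rate-limiters implement is a token bucket; a minimal software sketch (the 10M rate and 15kB bucket depth are example figures, not a recommendation):

```python
def make_policer(rate_bps, bucket_bytes):
    """Minimal token-bucket policer: returns a per-packet decision function.
    Illustrates the mechanism; real switches do this in hardware, and some
    can't refill accurately at rates far below the port speed."""
    state = {"tokens": float(bucket_bytes), "last": 0.0}

    def police(pkt_bytes, now):
        # Refill tokens for the elapsed time, capped at the bucket depth.
        state["tokens"] = min(float(bucket_bytes),
                              state["tokens"] + (now - state["last"]) * rate_bps / 8)
        state["last"] = now
        if state["tokens"] >= pkt_bytes:
            state["tokens"] -= pkt_bytes
            return True    # conforming: forward
        return False       # exceeding: drop

    return police

# Police a port down to 10Mbit/s with a 15kB bucket (~10 full-size frames):
police = make_policer(10_000_000, 15_000)
```

The 'small bucket' problem above is visible here: at 1M on a 1G port the bucket drains 1000x faster than it refills, so hardware with coarse refill granularity ends up dropping nearly everything.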

I'd expect that the single bursty user is going to drown out the rest of them into the noise, as they will likely plug in directly and have no wireless hops that limit their speeds.

One of my friends operates a WISP in this area and has periodic issues with the heavy usage users and wonders what they're doing with that 150GB of data they "download" in a month.

- Jared

Most useful response so far, thanks very much

No SLA, residential customers.

I would watch out for the 'abusers' in this case, and have the capability to rate-limit the ports if necessary. Some hardware doesn't deal well with 'small' buckets of rate-limiting, eg: taking a 1G port to 1M.

I'm interested in your operational results, as I have a few general ASSumptions I've offered in this space:

- Most people are going to be limited by their wireless gear (few people care to run wired very far)

This results in people complaining *about* us, as the crappy performance of 802.11g made a lot of people complain about ADSL2+ ISPs (because G can't really do 24Mbit!)

- Most servers on the far-end aren't going to be fast enough to cope

This holds true until you look at services delivered by farms, such as Akamai, Usenet or P2P. I'd be interested to see what happens when 1000 people with 1000Mbit services start torrenting the next major HD movie release.

- Most people aren't going to spend time debugging network problems

- Some percentage of users are going to run a torrent or something else and flatline their port. You need to be able to police them at a reasonable bandwidth cap, eg: 10M if they have 1G, but that's 1% and many dumber switches won't go under 10%.

Policing isn't something we want to do (other than policing 100Mbit ports to 20Mbit for those who buy the 20Mbit product); it's against the ethos of the company.

The other solution is to just throw more bandwidth at the problem. Make sure your switches can easily take a 10G or n*10G uplink.

I'd expect that the single bursty user is going to drown out the rest of them into the noise, as they will likely plug in directly and have no wireless hops that limit their speeds.

This is what I see elsewhere in buildings with similar mixes of customers to what we're expecting. For example, in a building with ~1000 2-100Mbit customers, a single 100Mbit customer can pretty much double the peak traffic usage of the building if they decide to download a lot.

One of my friends operates a WISP in this area and has periodic issues with the heavy usage users and wonders what they're doing with that 150GB of data they "download" in a month.

It's people like *me* that I'm worried about. Our deployment model means that individual customers can't choose us; we choose which buildings we connect and then are likely to get most people within that building, so I'm hoping our hoggers will be fairly spread out. Though I'm sure this won't entirely be the case, as certain areas of the city are likely to have more of each specific kind of network user.

Our upgrade plans include 10GE to buildings when the building has a large number of occupants (>1000), but before then it's cost prohibitive.

You're planning to engage in Statistical Multiplexing, or what I've always
termed "bandwidth surfing": how hard can I oversubscribe my uplink without
pissing off the paying customers?

As others have suggested, it depends on quite a number of things, primarily:
whether you're offering an SLA to the customers or not. Whether you have
a caching proxy or not and which CDNs you solicit to provision your node
are also big factors, of course.

In the final analysis, though, it depends on the customer class.

Residential customers will tolerate a lot more oversubscription than business,
enterprise, and server customers - tolerance falls as you go down that
list, but "how much can I charge" happily goes *up* it. Remember that QoS and
load shaping don't work all that well across the internet at large,
but they work pretty decently inside a single switch; you can prioritize
customers who are willing to pay extra for it. The problem is similar
to airline bookings; it is possible, in the immortal words of Dave Barry,
to envision a situation -- this will happen in your lifetime -- where
no two customers pay exactly the same price.

Cheers,
-- jra

We're hoping we don't have to do any QoS or shaping, as we don't want any links to be saturated. From the data I've gotten, it seems that is possible with many hundreds of 100Mbit customers on a 1000Mbit backhaul.

The 1G product is the exception to that, and it's a little unclear where that sits, probably in a box labelled "here be failure".

I think you'd better have a PPPoE concentrator somewhere or you're going
to deal with a flood of broadcast, virus and other trash traffic. I
also think expecting fewer than 1% of your residential customers to
P2P at once is borrowing trouble.

I've been out of the ISP game too long to give you hard numbers at
today's network speeds, but back when I was in the game, a 100:1
oversubscription ratio for residential DSL was around the boundary of
what customers described as poor quality and slow. That number was
steadily trending downward, not up, though the official villain then
was BitTorrent instead of Netflix.
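For comparison, the ratio being discussed upthread (sold access capacity over backhaul; the customer counts are illustrative):

```python
def oversub_ratio(n_users, access_mbps, backhaul_mbps):
    """Sold access capacity divided by backhaul capacity."""
    return n_users * access_mbps / backhaul_mbps

print(oversub_ratio(1000, 100, 1000))  # 100.0 -- the old 100:1 DSL pain point
print(oversub_ratio(100, 1000, 1000))  # 100.0 -- a hundred 1G customers on 1GE
```

Note that just one hundred 1G customers on a 1GE uplink already hits the same 100:1 ratio as a thousand 100M customers.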

On the other hand, it might be an interesting experiment to take a
utility company approach... Sure, we'll give you a 200 amp service with
only 1000 amps in the neighborhood, but you don't pay for the 200 amp
service; you pay for the kilowatt-hours you consume. Then again, heat
pumps don't get hacked and start drawing the full 200 amps.

Regards,
Bill Herrin

Wow, that works out to be a per-connect max of 785 kbps.

Frank