Estimating VoIP data traffic size from VoIP signaling traffic size?

Hi,

Are there any statistics on aggregated VoIP signaling
bandwidth versus aggregated VoIP data bandwidth? E.g., if
we observe 2 Mbps (average) of traffic on VoIP
signaling protocol ports (including SIP, H.323,
MGCP), how could we estimate the average VoIP data
bandwidth?

Joe

    > Are there any statistics on aggregated VoIP signaling
    > bandwidth versus aggregated VoIP data bandwidth? E.g., if
    > we observe 2 Mbps (average) of traffic on VoIP
    > signaling protocol ports (including SIP, H.323,
    > MGCP), how could we estimate the average VoIP data
    > bandwidth?

For any given user population, you could calculate the ratio, but that's
an observation, not a predictive rule. And it would be different for
every population.

On the other hand, if you have access to the signaling traffic, it
_contains_ the statistics about the data traffic: byte counts, times,
loss, latency, jitter, and out-of-order delivery. It's all right there in
each SIP BYE message. You aren't guaranteed to see every one of those, of
course, but there again, you can extrapolate from knowing what portion you
are seeing.

                                -Bill

Your signaling traffic will be incredibly low compared to your RTP streams (especially for G.711u). For G.711u, if 2 Mbps is your peak, think somewhere in the range of 10 kbps or so (complete SWAG, but I hope you get the picture).

Use about 100k per call for G.711u and it will help make the numbers nice and round. If you are trying to calculate busy hour, search for Erlang on your nearest web confabulator. There are also innumerable spots on the web where you can find typical numbers for various codecs.
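
For readers who would rather compute the busy-hour figure than look it up, here is a minimal sketch of the standard Erlang B recursion. The function names and the traffic numbers (600 calls of 180 seconds in the busy hour, 1% blocking) are made up for illustration, and the 100 kbps per call is simply the round figure suggested above.

```python
def erlang_b(offered_erlangs: float, channels: int) -> float:
    """Blocking probability for `offered_erlangs` of traffic on `channels` lines."""
    blocking = 1.0  # B(0) = 1: with zero channels every call is blocked
    for n in range(1, channels + 1):
        # Standard recursion: B(n) = A*B(n-1) / (n + A*B(n-1))
        blocking = offered_erlangs * blocking / (n + offered_erlangs * blocking)
    return blocking


def channels_needed(offered_erlangs: float, max_blocking: float) -> int:
    """Smallest channel count whose blocking probability is <= max_blocking."""
    if max_blocking <= 0:
        raise ValueError("max_blocking must be greater than zero")
    channels = 0
    while erlang_b(offered_erlangs, channels) > max_blocking:
        channels += 1
    return channels


# Hypothetical busy hour: 600 calls averaging 180 s each = 30 erlangs.
busy_hour_erlangs = 600 * 180 / 3600
lines = channels_needed(busy_hour_erlangs, 0.01)
# Size the link with the round 100 kbps-per-call figure (an approximation).
print(lines, "channels ->", lines * 100, "kbps")
```

Erlang B assumes blocked calls are dropped rather than queued, which is the usual assumption for trunk sizing but may not match every deployment.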

    > Your signaling traffic will be incredibly low compared to your RTP
    > streams (especially for G.711u). For G.711u, if 2 Mbps is your peak,
    > think somewhere in the range of 10 kbps or so (complete SWAG, but I
    > hope you get the picture).

    > Use about 100k per call for G.711u and it will help make the numbers

  Actually around 88.2k.

    > nice and round. If you are trying to calculate busy hour, search for
    > Erlang on your nearest web confabulator. There are also innumerable
    > spots on the web where you can find typical numbers for various codecs.

  I'd like to suggest to people looking at these things to
ensure that your stats toolset includes both bps and pps in the
polls. SYNs are low bps rate, but a high pps rate of them can mean
something bad. It's useful to know how fast these counters are
incrementing.

  - jared
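
As a rough illustration of polling for both bps and pps, the sketch below turns two successive counter readings into rates. `CounterSample`, `counter_delta` and `rates` are hypothetical names, not part of any particular monitoring tool, and the sample values are invented: 40-byte packets arriving at 50,000 pps, which is only 16 Mbps.

```python
from dataclasses import dataclass


@dataclass
class CounterSample:
    timestamp: float  # seconds since some fixed epoch
    octets: int       # cumulative byte counter
    packets: int      # cumulative packet counter


def counter_delta(old: int, new: int, bits: int = 64) -> int:
    """Difference between two readings of a wrapping `bits`-wide counter."""
    return (new - old) % (1 << bits)


def rates(prev: CounterSample, curr: CounterSample, bits: int = 64):
    """Return (bits per second, packets per second) between two polls."""
    interval = curr.timestamp - prev.timestamp
    if interval <= 0:
        raise ValueError("samples must be in increasing time order")
    bps = counter_delta(prev.octets, curr.octets, bits) * 8 / interval
    pps = counter_delta(prev.packets, curr.packets, bits) / interval
    return bps, pps


# Invented example: one minute of small packets at a high packet rate.
before = CounterSample(timestamp=0.0, octets=0, packets=0)
after = CounterSample(timestamp=60.0, octets=120_000_000, packets=3_000_000)
bps, pps = rates(before, after)
print(f"{bps / 1e6:.1f} Mbps, {pps:.0f} pps")
```

The modulo in `counter_delta` handles a single counter wrap between polls; with 32-bit counters on fast links, the poll interval has to be short enough that the counter cannot wrap twice.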

[...]

    > Use about 100k per call for G.711u and it will help make the numbers
    > nice and round.

It's roughly 80 kb/s (each way) for a single G.711 call
over IAX2. Extra channels will take 80 kb/s each if you're not using
trunking, or 64 kb/s each if you are. It's probably best not to assume
trunking is in use when doing your calculations.

As mentioned in prior responses to this thread: there are several ways to guess, but mostly the answer is "No, not easily." The good news is that excepting proprietary protocols like Skype and efficient trunking protocols like IAX2, RTP is standardized. This means one VoIP protocol is pretty similar to the other as far as RTP size goes, so at least that part of the equation isn't open-ended. (I'll assume you're looking for end-user statistics, and not inter-nodal statistics where some type of aggregated IP header compression or trunking might make flows more IP-header-friendly.)

Looking just at protocols that use RTP, it's still not quite possible to map RTP volume simply from signalling volume without opening up the signalling to see what codec is being used. If you have a mix of codecs, then your bitstreams for the RTP can range from (typically) ~24kbps for G.729 up to ~80kbps for G.711 (1). Each call can be different, depending on the ability of the originating and terminating gateway/useragent to accept or prefer each codec during the call set-up. You'd need a clear understanding of what codecs your user community was utilizing in order to build an assumption table on number of streams using each codec and/or protocol.
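
The ~24 kbps and ~80 kbps figures above can be reproduced with a little arithmetic. This is a sketch under stated assumptions: 20 ms packetization, IPv4 without options, and no header compression. The function name `rtp_stream_bps` and the 18-byte Ethernet overhead (14-byte header plus 4-byte FCS) are illustrative choices, not taken from the thread.

```python
IP_UDP_RTP_HEADER_BYTES = 20 + 8 + 12  # IPv4 + UDP + RTP headers


def rtp_stream_bps(codec_bps: float, packet_ms: float = 20,
                   l2_overhead_bytes: int = 0) -> float:
    """One-way bandwidth of a single RTP stream, in bits per second."""
    packets_per_second = 1000 / packet_ms
    payload_bytes = codec_bps / 8 * packet_ms / 1000
    frame_bytes = payload_bytes + IP_UDP_RTP_HEADER_BYTES + l2_overhead_bytes
    return frame_bytes * 8 * packets_per_second


for name, codec_bps in [("G.711", 64_000), ("G.729", 8_000)]:
    ip_layer = rtp_stream_bps(codec_bps)
    on_ethernet = rtp_stream_bps(codec_bps, l2_overhead_bytes=18)
    print(f"{name}: {ip_layer / 1000:.1f} kbps at IP layer, "
          f"{on_ethernet / 1000:.1f} kbps on Ethernet")
```

With those assumptions G.711 comes to 80 kbps at the IP layer and 87.2 kbps once Ethernet framing is counted, which is in the neighborhood of the "just shy of 90 kbps" figures mentioned in this thread. Different packetization intervals or link layers will shift the numbers.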

The media stats in SIP BYE signalling Bill Woodcock mentioned in his message (jitter, packets, loss, latency, etc.) are only available in a few end devices at the moment, notably Cisco. The RTCP XR (RFC3611) standards might be visible in signalling soon via SIP NOTIFY messages (2), but I don't know of any equipment that supports this right now.

I think the best way to do this would be to graph the signalling volumes and the media volumes over a week or two, and then build assumption charts for future use. It may not be a big win if the effort to measure signalling is the same as the effort to measure the media, since you have to sample at a point(s) where all this data crosses your measurement instrumentation. If you're really a masochist, or you can't see the media for architecture reasons, you could write an extension to tethereal or ettercap or a similar network monitoring and packet analysis tool which unfolded each signalling message, extracted the codec descriptors, and calculated flows. You'd then have to keep state on each call, etc. etc. etc. - not simple, but not impossible. Lastly, I'm betting there are some signalling analysis tools on the market that already do this, but I would expect that they will not be cheap.

If you're looking at traffic generated by Skype or another closed-protocol system, you're really hanging out in the wind, but I'm sure that can be averaged and extrapolated if you have access to a number of media streams from your user population to examine. (Does Skype vary its bitrate extensively depending on endpoint capabilities?)

JT

(1) VoIP bandwidth calculator - Free VoIP tools from Westbay Engineers
     voxgratia.org
     (Note: RTP for SIP and H.323 is identical.)
     Take all media flow estimates with a grain of salt; typically
     numbers are higher than reported, like G.711 being just shy of
     90kbps instead of 80kbps as noted in most charts.

(2) http://www.ietf.org/internet-drafts/draft-johnston-sipping-rtcp-summary-08.txt

    > The media stats in SIP BYE signalling Bill Woodcock mentioned in his
    > message (jitter, packets, loss, latency, etc.) are only available in a few
    > end devices at the moment, notably Cisco.

Ah, that's right, I'd forgotten that, sorry. On INOC-DBA we have a
preponderance of Cisco phones, so one end or the other (all that's
necessary) of most calls is a Cisco, and I tend to not worry too much
about whether the remainder are a statistically similar subset of the
total.

    > I think the best way to do this would be to graph the signalling volumes
    > and the media volumes over a week or two, and then build assumption charts
    > for future use.

Agreed.

    > If you're really a masochist, or you can't see the
    > media for architecture reasons, you could write an extension to ethereal
    > or ettercap or a similar network monitoring and packet analysis tool which
    > unfolded each signalling message, extracted the codec descriptors, and
    > calculated flows. You'd then have to keep state on each call, etc.

It's not quite that bad... You don't need to keep state, you just need to
know how much signalling is associated with each call, on average. If you
know the average amount of signalling per call for your traffic mix (which
can be calculated from a baseline analysis of the signalling alone), the
total amount of signalling, and the ratio of codecs in use (which can be
a sample rather than a full count), you should be able to get a pretty
accurate estimate, without ever tracking on a per-call basis.

                                -Bill
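
One possible way to turn the stateless approach described above into numbers is sketched below. All names and sample values are hypothetical. Note one extra input that is not in the list above: the average call duration, which is needed to convert a call arrival rate into a count of simultaneous calls (arrival rate times average duration).

```python
def estimate_media_bps(signaling_bps: float,
                       signaling_bytes_per_call: float,
                       avg_call_seconds: float,
                       codec_mix: list[tuple[float, float]]) -> float:
    """Estimate aggregate media bandwidth from aggregate signaling bandwidth.

    codec_mix is a list of (share_of_calls, media_bps_per_call) pairs
    whose shares sum to 1; media_bps_per_call covers both directions.
    """
    total_share = sum(share for share, _ in codec_mix)
    if abs(total_share - 1.0) > 1e-9:
        raise ValueError("codec shares must sum to 1")
    calls_per_second = signaling_bps / (signaling_bytes_per_call * 8)
    concurrent_calls = calls_per_second * avg_call_seconds
    avg_media_bps_per_call = sum(share * bps for share, bps in codec_mix)
    return concurrent_calls * avg_media_bps_per_call


# Hypothetical inputs: 2 Mbps of signaling, 5,000 bytes of signaling per
# call, 3-minute average calls, half G.711 (2 x 80 kbps) and half G.729
# (2 x 24 kbps).
estimate = estimate_media_bps(
    signaling_bps=2_000_000,
    signaling_bytes_per_call=5_000,
    avg_call_seconds=180,
    codec_mix=[(0.5, 160_000), (0.5, 48_000)],
)
print(f"{estimate / 1e6:.0f} Mbps of media")  # 936 Mbps with these inputs
```

The result is only as good as the baseline: the signaling bytes per call, the codec shares, and the average duration all have to be measured or sampled for the specific user population.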

    > > Use about 100k per call for G.711u and it will help make the numbers
    > > nice and round.

    > Actually around 88.2k.

I did say round, you know! Lol... Things like VAD would drop the numbers even lower for G.711.

    > > If you are trying to calculate busy hour, search for
    > > Erlang on your nearest web confabulator. There are also innumerable
    > > spots on the web where you can find typical numbers for various codecs.

    > I'd like to suggest to people looking at these things to
    > ensure that your stats toolset includes both bps and pps in the
    > polls. SYNs are low bps rate, but a high pps rate of them can mean
    > something bad. It's useful to know how fast these counters are
    > incrementing.

Agree with you on that one... PPS is commonly overlooked. Folks are so interested in those bits per second they sometimes miss the obvious. From the VoIP perspective it would also be helpful to keep track of your signaling to RTP traffic ratio. If you find that your signaling demands have increased dramatically compared to your RTP traffic you may have "other" problems to worry about besides maintaining link capacity. If you find that your RTP traffic has suddenly increased it is certainly a potential worry factor as well.

I just finished dealing with a mysterious sudden growth in RTP traffic. The bad part was that it caused initial happiness (the money dance) followed by "oh crap we have a bug" a couple days later. Incomplete signaling can bite you pretty hard when DSPs don't know they are supposed to hang up.

Media traffic volumes are generally not visible, because they're from
endpoint to endpoint, so unless you've got really detailed monitoring
(which the original poster said they didn't), you're not going to see
traffic between two phones in the same building, or traffic between
buildings that don't have the call manager in them. Obviously the
measurement problems are different for ISPs, enterprises with IP-PBXs,
and VoIP companies.

Also, the amount of media volume not only depends on the codec, but
also on the length of the call, while the signalling volume mainly
depends on the number of calls. So if your customers are averaging
3-minute calls, that's a much different ratio than if they're doing
10-second credit card validation calls or one-minute voicemail pickups
or 60-minute teleconferences. If you had enough measurement
capability to estimate this, you could use that directly instead of
guessing from signalling traffic, but otherwise you're just guessing.
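
To put rough numbers on how strongly call length moves the ratio, here is a small sketch. The 5,000 bytes of signaling per call and the 160 kbps of media per call (G.711, both directions, IP layer) are assumed values for illustration only.

```python
def media_to_signaling_ratio(call_seconds: float,
                             media_bps: float = 160_000,
                             signaling_bytes_per_call: float = 5_000) -> float:
    """Ratio of media bytes to signaling bytes for one call."""
    media_bytes = media_bps / 8 * call_seconds
    return media_bytes / signaling_bytes_per_call


examples = [
    ("credit card validation", 10),
    ("voicemail pickup", 60),
    ("typical call", 180),
    ("teleconference", 3600),
]
for label, seconds in examples:
    ratio = media_to_signaling_ratio(seconds)
    print(f"{label:>24}: {ratio:>8.0f} : 1")
```

Under these assumptions the ratio runs from 40:1 for a 10-second call to 14,400:1 for an hour-long teleconference, so the same signaling volume can correspond to very different media volumes.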