Calculating Jitter

I'm hoping that this post isn't out of line with the scope of the NANOG list, of which I've been a long-time lurker. If so, please just ignore me.

We're trying to calculate Jitter of a variable-size (unbounded) data set. One Jitter formula that we see cited occasionally (it is in RFC 1889, and I believe iPerf uses this formula for its Jitter numbers) looks something like this:

J = J+(|D(i-1,i)|-J)/16

The problem with this formula is that it works best on small sample sets, and it also favors more recent samples. As the sample size grows, the jitter of early samples seems to get factored down to basic "noise", and then isn't really well represented in the overall Jitter number.
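To make the weighting concrete, here is a minimal Python sketch of the estimator above. The function names rfc1889_jitter and sample_weight are made up for illustration, and the input is assumed to be a list of the differences D(i-1,i), with J starting at 0.

```python
def rfc1889_jitter(transit_diffs, gain=1.0 / 16.0):
    """Return the running estimate J after each D(i-1,i) sample.

    Implements J = J + (|D(i-1,i)| - J) / 16, starting from J = 0.
    """
    j = 0.0
    history = []
    for d in transit_diffs:
        j += (abs(d) - j) * gain
        history.append(j)
    return history


def sample_weight(age, gain=1.0 / 16.0):
    """Weight that a sample `age` steps in the past carries in the current J.

    Unrolling the recurrence gives gain * (1 - gain) ** age.
    """
    return gain * (1.0 - gain) ** age
```

The newest sample carries a weight of 1/16, and every older sample is scaled down by a further 15/16 per step, so a sample 100 steps back contributes well under one part in a thousand.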

We're trying to find a viable formula for showing a general Jitter "average" over a period of time. One possibility here is just to iterate all samples like this:

Jsum = Jsum+|D(i-1,i)|

and then calculating the jitter like this:

J = Jsum / (sample count - 1)

The sample count could be anywhere from 2 to 1 million (or more). This formula does seem to represent early samples in the "Jitter" number just as strongly as later samples, but seems like it might be a bit simplistic.
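As a rough sketch of the two formulas above in Python: mean_jitter is a hypothetical name, and the input is assumed to be an iterable of per-packet delay (transit time) measurements, so that D(i-1,i) is the difference between consecutive delays.

```python
def mean_jitter(transit_times):
    """Average of |D(i-1,i)| over all consecutive delay measurements."""
    jsum = 0.0
    count = 0
    prev = None
    for t in transit_times:
        if prev is not None:
            jsum += abs(t - prev)   # Jsum = Jsum + |D(i-1,i)|
        prev = t
        count += 1
    if count < 2:
        raise ValueError("need at least two delay measurements")
    return jsum / (count - 1)       # J = Jsum / (sample count - 1)
```

Because it only keeps the previous delay and two running totals, this works the same for 2 samples or a million.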

Does anyone have any feedback on this alternate way of calculating Jitter, or any better ways to do this?

Thanks in advance for any input.

Jeff Murri
Nessoft, LLC
jeff@nessoft.com
www.nessoft.com

Hello Jeff;

These are both moving averages; the question is the memory of the moving average.
The RTP version has a specific finite memory, the one you describe has an infinite memory.

The statistical trouble with infinite memory moving average estimates is that they
eventually converge to a fixed value (after one million samples, say, even a
large change in jitter will take a long time to produce a small change in the average), and,
if the underlying process is not stationary, then they need not converge to anything like
the correct current value. The RTCP protocol has a finite memory estimator that has
enough memory to smooth out statistical fluctuations somewhat, but which
responds to real changes in the underlying jitter fairly rapidly, and which is comparable
across implementations. Yes, this means that older data is ignored (that's the finite memory part),
but its intended use is to try and estimate what's happening in the network now, not last week. (I
have had RTCP sessions up for months; even a solid day of high jitter would hardly budge a
total average over months.)
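A small Python sketch of the difference in memory, using made-up data: compare_estimators is a hypothetical helper that feeds the same |D| samples to the 1/16 estimator and to a plain cumulative average.

```python
def compare_estimators(diffs, gain=1.0 / 16.0):
    """Return (finite-memory estimate, cumulative average) after all samples."""
    ewma = 0.0
    total = 0.0
    n = 0
    for d in diffs:
        ewma += (abs(d) - ewma) * gain   # finite memory (RTP-style)
        total += abs(d)                  # infinite memory
        n += 1
    if n == 0:
        raise ValueError("need at least one sample")
    return ewma, total / n


# Hypothetical scenario: 10000 quiet samples, then 100 samples of high jitter.
ewma, average = compare_estimators([1.0] * 10000 + [10.0] * 100)
```

With this input the finite-memory estimate ends up close to 10, while the cumulative average stays below 1.1.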

If what you want is the jitter averaged over some long period of time (say so you can say that
the average jitter on your network was X msec in 2005), then what you want is indeed

Jsum = Jsum+|D(i-1,i)|
J = Jsum / (sample count - 1)

(assuming that the sample count is the number of delay measurements, not the number of delay
differences). Note that that is the same as

J[i] = J[i-1] * ((i-1)/i) + |D(i-1,i)| * (1/i)

assuming that the first sample is i = 0 and J[0] is finite; this shows clearly
how new data gets down-weighted as time goes on and i increases.
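The recurrence can be checked numerically with a short Python sketch. running_mean_jitter is a hypothetical name, and the input is assumed to be the list of |D(i-1,i)| values for i = 1, 2, ...

```python
def running_mean_jitter(abs_diffs, j0=0.0):
    """Apply J[i] = J[i-1] * (i-1)/i + |D(i-1,i)| / i and return every J[i]."""
    j = j0
    out = []
    for i, d in enumerate(abs_diffs, start=1):
        # At i = 1 the old value gets weight 0, so any finite J[0] will do.
        j = j * (i - 1) / i + d / i
        out.append(j)
    return out
```

Each J[i] matches Jsum / i for the first i differences, and the weight 1/i on the newest sample shrinks as i grows.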

Regards
Marshall

You saw Marshall's comment. If you're interested in a moving average, he's pretty close.

If I understood your question, though, you simply wanted to quantify the jitter in a set of samples. I should think there are two obvious definitions there.

A statistician would look, I should think, at the variance of the set. Reaching for my CRC book of standard math formulae and tables, it defines the variance as the square of the standard deviation of the set, which is to say

  sum of ((x(i) - xmean)^2)

That is one thing I have never understood. If you can pretty much just look at a standard deviation and see that it is high, which means your numbers are flopping all over the place, then what good is the square of it? Does it just make graphing better in some way?

Thanks,

Eric

Hello Eric;

<statistics details>

Because (under some broad assumptions, primarily that the underlying process is stationary)
estimates of the variance are distributed as a CHI**2 distribution. More
exactly,

summation( (x[i] - mean(x))^2) / true_variance is distributed as CHI**2(N),
which means that as N increases, then

summation( (x[i] - mean(x))^2) / true_variance
is approximately distributed as a normal distribution with a mean of N and a
variance (of the variance estimate) of 2N, so that

V = summation( (x[i] - mean(x))^2) / N is an efficient estimate of the true variance, with
a sigma (of the variance estimate) of sqrt (2 / N) * V

(Since you have to estimate the mean from the same data, you can show that the
estimator is less biased if you use 1 / (N - 1) rather than 1 / N in actual calculations.)
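For anyone who wants to try this, a minimal Python sketch using the standard library's statistics module. variance_with_uncertainty is a hypothetical name, and the sqrt(2 / N) * V figure only means something under the assumptions above.

```python
import math
import statistics


def variance_with_uncertainty(samples):
    """Return (V with 1/N, V with 1/(N-1), sigma of the variance estimate)."""
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples")
    v_n = statistics.pvariance(samples)        # divides by N
    v_n_minus_1 = statistics.variance(samples)  # divides by N - 1
    sigma_of_v = math.sqrt(2.0 / n) * v_n       # sqrt(2 / N) * V
    return v_n, v_n_minus_1, sigma_of_v
```

For large N the two variance estimates are nearly identical; the difference only matters for small sample sets.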

Basically, if you want to perform the true rites in the Church of Linear Statistics, you
worry about variances and CHI**2 distributions. If (like most of us in the real world) you
are dealing with non-stationary processes and unknown distributions, you can ignore
this, just calculate the standard deviation, see whether or not things differ
by more than 3 standard deviations, and be done with it.

</statistics details>

Note (from the days when spacecraft had 4 kilobytes of memory) that if you estimate

s[i] = s[i-1] + x[i]
v[i] = v[i-1] + x[i]^2

then the mean estimate at any time is

m[i] = s[i] / i

the mean square (the total variance about zero, not about the mean) is

V[i] = v[i] / i

and the standard deviation for i > 1 is

sigma[i] = sqrt[(v[i] - (i * m[i]^2))/(i-1)]

so, you can do this on the fly without storing all of the data.
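A minimal Python sketch of the running sums above. The class name RunningStats is made up for illustration, and the sample index is assumed to start at 1 with s[0] = v[0] = 0.

```python
import math


class RunningStats:
    """Mean and standard deviation from two running sums, no stored samples."""

    def __init__(self):
        self.n = 0      # number of samples so far (i)
        self.s = 0.0    # s[i] = s[i-1] + x[i]
        self.v = 0.0    # v[i] = v[i-1] + x[i]^2

    def add(self, x):
        self.n += 1
        self.s += x
        self.v += x * x

    def mean(self):
        return self.s / self.n

    def stddev(self):
        if self.n < 2:
            raise ValueError("need at least two samples")
        m = self.mean()
        # Clamp at zero: rounding can push the difference slightly negative.
        return math.sqrt(max(0.0, (self.v - self.n * m * m) / (self.n - 1)))
```

One caution: when the mean is large compared to the spread, subtracting two nearly equal sums loses precision in floating point. Welford's one-pass method is a common alternative in that case.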

Regards
Marshall

Pedantic mode on.

Jitter != Variance

Variance is how spread out a set is (hence the squaring). [it has further
subclasses as sample variance and population variance]

To try to explain it a little more simply than CHI-square distributions
and formulas....
(Mark Twain said, why use a 25 cent word when a nickel one will do... :-) )

The squaring lets you get an indication of the range with added bias
to wildly differing samples so they won't be drowned out by the large
number of samples close to the mean.

Sample variance is the sum of the squares of the differences from the mean
divided by the number of samples (or by one less than that, for the unbiased estimate). It is a numeric value.

But...

Jitter is the measure of deviation from a predetermined constant clock rate.
It can also be the phase deviation of a signal.

You can think of it as clock skew or drift or the uncertainty level in timing.
Jitter is normally expressed as a plus/minus range in either peak, mean, or
rms.
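As a rough illustration of those three figures, here is a Python sketch. timing_jitter is a hypothetical helper, and it assumes the ideal clock ticks every `period` starting at the first timestamp, which is only one possible choice of reference.

```python
import math


def timing_jitter(timestamps, period):
    """Return (peak, mean, rms) deviation from an ideal clock of fixed period."""
    if not timestamps:
        raise ValueError("need at least one timestamp")
    t0 = timestamps[0]
    # Deviation of each event from where the ideal clock says it should be.
    dev = [t - (t0 + i * period) for i, t in enumerate(timestamps)]
    peak = max(abs(d) for d in dev)
    mean_abs = sum(abs(d) for d in dev) / len(dev)
    rms = math.sqrt(sum(d * d for d in dev) / len(dev))
    return peak, mean_abs, rms
```

Note that this measures deviation from a fixed schedule, which is a different quantity from the spread of the samples around their own mean.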

They are similar but not the same thing.

Pedantic mode off...

cheers,
--dr