representativeness of flow data based on samples

A few vendors now provide traffic export from high-speed interfaces by
sampling packets on those interfaces at a particular rate and using the
sampled packets to populate the per-flow counters, rather than looking
at every packet.
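
For readers unfamiliar with the mechanism, here is a minimal sketch of what such sampling amounts to. It assumes random 1-in-N selection and a simple scale-up by N. The function name, the (flow_key, size) packet representation, and the seed parameter are made up for illustration and do not describe any particular vendor's implementation.

```python
import random
from collections import Counter


def sampled_flow_counters(packets, sampling_rate, seed=0):
    """Estimate per-flow packet and byte counts from 1-in-N random sampling.

    packets:       iterable of (flow_key, size_in_bytes) tuples (hypothetical format)
    sampling_rate: N, meaning each packet is kept with probability 1/N
    """
    rng = random.Random(seed)
    packet_estimate = Counter()
    byte_estimate = Counter()
    for flow_key, size in packets:
        # Keep roughly one packet in every `sampling_rate` packets.
        if rng.random() < 1.0 / sampling_rate:
            # Each sampled packet stands in for N packets of the same flow.
            packet_estimate[flow_key] += sampling_rate
            byte_estimate[flow_key] += size * sampling_rate
    return packet_estimate, byte_estimate
```

With sampling_rate=1 every packet is counted and the counters are exact. With larger values, small flows may be missed entirely while large flows are estimated with some error, which is the representativeness question asked here.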

Does anybody here know of recent research on real Internet traffic
that compares different sampling rates with respect to the
representativeness of the resulting flow data?

On Wed Jan 30 23:50:11 2002, Fred True replied:

You might find this related talk useful:

While the Duffield talk mentions packet sampling, it is primarily concerned
with sampling flow records in order to reduce the post-processing overhead
(i.e., it addresses the accuracy of sampling exported NetFlow records, rather
than the accuracy of NetFlow records generated using packet sampling).

Here are a few references that address the issue of packet sampling:

I don't know of any other published studies. However, I have been involved
in a number of unpublished tests in which sampling was demonstrated to
produce valid results with sufficient accuracy (provided that suitable
sampling rates and aggregation periods are selected).
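
As a rough illustration of why the sampling rate and aggregation period matter: under the usual binomial (approximately Poisson) model of random packet sampling, the relative error of a scaled-up count depends mainly on how many samples fall into the traffic class of interest, and both a finer sampling rate and a longer aggregation period raise that number. The sketch below uses the standard 95% approximation of 1.96 / sqrt(c) for c samples. It is a textbook approximation, not a result taken from the tests mentioned above, and the function names are made up for this example.

```python
import math


def relative_error_95(samples_in_class):
    """Approximate 95% relative error of a count estimated from c samples."""
    if samples_in_class <= 0:
        raise ValueError("need at least one sample in the class")
    return 1.96 / math.sqrt(samples_in_class)


def samples_needed(target_relative_error):
    """Smallest number of samples giving at most the target 95% relative error."""
    if target_relative_error <= 0:
        raise ValueError("target error must be positive")
    return math.ceil((1.96 / target_relative_error) ** 2)


# About 2% error with 10,000 samples in the class.
print(round(relative_error_95(10000), 4))
# About 1,537 samples are needed for 5% error.
print(samples_needed(0.05))
```

So a class that yields only a handful of samples per aggregation period will be estimated poorly whatever the link speed, while one that yields thousands of samples can be estimated to within a few percent.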