Reducing Usenet Bandwidth

Hi Stephen,

As we all know, Usenet traffic is always increasing; a large number of
people take full feeds, which on my servers is about 35 Mbps of continuous
bandwidth in/out. That produces about 300 GB per day, of which only a small
fraction ever gets downloaded.
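As a quick sanity check on those figures (just a sketch -- the 35 Mbps
rate is the one quoted above):

```python
# Back-of-the-envelope: a continuous 35 Mbps feed, held for a full day.
MBPS = 35                       # continuous feed rate from the figure above
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400 seconds

bits_per_day = MBPS * 1_000_000 * SECONDS_PER_DAY
gb_per_day = bits_per_day / 8 / 1_000_000_000  # decimal gigabytes

print(f"{gb_per_day:.0f} GB/day")  # -> 378 GB/day at full saturation
```

A fully saturated 35 Mbps works out to nearly 380 GB/day, so the
~300 GB/day figure is consistent once you allow for hours when the feed
runs below peak.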

You should be aware that Usenet traffic loads are extremely sensitive to
a number of seemingly inconsequential factors. To mention just two of many:

(1) The presence or absence of individual groups (and by absence, I mean
"poisoning" an unwanted group, e.g., dropping any article posted to that
group, *AND* any article that's been *crossposted to* that group). For
example, consider the following sample of the top 20 groups (commas added
for readability):

!Binary Newsgroups Bytes % Total
! 1 alt.chello.binaries 13,757,677,778 4.538
! 2 alt.binaries.vcd 10,087,744,846 3.327
! 3 alt.binaries.vcd.repost 10,068,663,162 3.321
! 4 alt.binaries.multimedia 9,901,822,387 3.266
! 5 alt.binaries.sounds.mp3 9,159,994,565 3.021
! 6 7,865,347,722 2.594
! 7 alt.binaries.erotica.vcd 7,080,622,563 2.336
! 8 5,381,405,545 1.775
! 9 alt.binaries.movies.divx 5,004,619,468 1.651
!10 4,935,170,128 1.628
!11 alt.binaries.movies.divx.french 4,919,381,694 1.623
!12 alt.binaries.anime 4,672,847,011 1.541
!13 alt.binaries.sounds.mp3.complete_cd 4,448,118,991 1.467
!14 4,410,750,072 1.455
!15 alt.binaries.multimedia.cartoons 3,898,196,934 1.286
!16 alt.binaries.images 3,768,957,616 1.243
!17 3,711,531,880 1.224
!18 alt.binaries.movies 3,547,393,708 1.170
!19 3,219,286,966 1.062
!20 alt.binaries.movies.divx.german 3,194,581,083 1.054

When carriage of a single group can contribute nearly 14 GB worth of traffic
to a feed, obviously you should pay attention to what you're carrying. You
can say, "We carry and feed everything" if you like, but remember that the
presence or absence of a -single- group (out of tens or hundreds of
thousands, depending on what you consider to be a valid newsgroup) can
change your feed traffic by 14 GB (nearly 5%) a day.
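To make the "poisoning" idea concrete, here's a minimal sketch of the
check involved; the group name and header values are hypothetical, and a
real server would do this inside its feed-filtering machinery:

```python
# Sketch of "poisoning" an unwanted group: drop any article posted to it,
# *and* any article crossposted to it. The Newsgroups header is a
# comma-separated list of group names.
POISONED = {"alt.chello.binaries"}  # hypothetical local policy choice

def accept_article(newsgroups_header: str) -> bool:
    """Return False if the article touches any poisoned group at all."""
    groups = {g.strip() for g in newsgroups_header.split(",")}
    return not (groups & POISONED)

print(accept_article("alt.binaries.vcd"))                      # True
print(accept_article("alt.chello.binaries,alt.binaries.vcd"))  # False
```

The crosspost case is the important one: an article needs to touch only
one poisoned group to be dropped, no matter how many other groups it
names.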

(2) Your choice of maximum article size in octets.

The 80/20 rule holds: you can get 80+% of all articles at a cost of carrying
only about 20% of all octets. To see this, look at slide 33 of {ppt,pdf},
quoting a graph by the folks at

A couple of "magic values" that you may want to empirically evaluate for your
local server are in the range of 40-50 KB/article (if you run a "text only"-
oriented server), or 250-300 KB/article (an image-oriented binaries-plus-text
server). If you are planning on carrying "everything," be sure you don't
inadvertently cap articles at 1 MB/article or even 4 MB/article -- you'd
still be missing articles if you chose that low a limit.
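The effect of a size cap is easy to explore against your own spool; here's
a toy sketch (the size distribution below is made up purely for
illustration -- substitute sizes from your own logs):

```python
# Given a sample of article sizes, see what fraction of articles -- and of
# octets -- survives a given per-article cap.
def cap_stats(sizes, cap):
    kept = [s for s in sizes if s <= cap]
    return len(kept) / len(sizes), sum(kept) / sum(sizes)

# toy distribution: many small text articles, a few huge binaries
sample = [2_000] * 80 + [5_000_000] * 20

art_frac, octet_frac = cap_stats(sample, cap=50_000)  # 50 KB/article cap
print(f"articles kept: {art_frac:.0%}, octets kept: {octet_frac:.2%}")
```

With a distribution that skewed, a 50 KB cap keeps 80% of the articles
while carrying well under 1% of the octets -- an exaggerated version of
the 80/20 behavior described above.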

a) Internally to a network
If I site multiple peer servers at exchange and peering points, then they
all exchange traffic, and all inter- and intra-site circuits are filled to
the above 35 Mbps level.

Locally, news servers should probably be run on gigabit links; average traffic
may run 35Mbps for a full feed, but you would need additional capacity for
peaking and recovering from outages, to say nothing of loads associated with
feeds you may be fanning out, or local reader traffic loads.

If you buy the argument that news servers should be gigabit connected, then
35 Mbps worth of traffic in the local area really isn't much worth worrying
about.

b) Externally, such as at public peering exchange points
If there are 100 networks at an exchange point and half exchange a full feed,
that's 35 x 50 x 2 = 3,500 Mbps of traffic flowing across the exchange
peering LAN.
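Spelled out, using the same figures as above:

```python
# 100 networks at the exchange, half of them exchanging a full feed,
# each feed flowing both in and out across the peering LAN.
networks_feeding = 100 // 2   # 50 networks
full_feed_mbps = 35           # per direction, from the figure above
directions = 2                # each feed flows in and out

total_mbps = full_feed_mbps * networks_feeding * directions
print(total_mbps, "Mbps")  # -> 3500 Mbps
```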

Usenet's pre-arranged and predictable server-to-server flows make an excellent
"foundation load" when it comes to justifying a decision to participate at an
exchange point, and Usenet has always been an important component of exchange
point traffic. Consider, for example, the SIX in Seattle -- it is not a
coincidence that the SIX is affiliated with Altopia, a Usenet specialty
service provider.

For the peering point question I'm thinking of some kind of multicast
scheme; internally, I've no suggestions other than perhaps only exchanging
message IDs between peer servers, hence giving back a partial feed to the
local box's external peers.
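The message-ID idea amounts to simple set arithmetic between peers; real
NNTP does this with the CHECK/TAKETHIS streaming commands, and the IDs
below are hypothetical:

```python
# Before sending article bodies, compare message-IDs and transfer only
# what the peer lacks. This is just the set logic, not a wire protocol.
def ids_to_send(local_ids, peer_has):
    """Message-IDs we hold that the peer does not."""
    return local_ids - peer_has

local = {"<a1@example>", "<b2@example>", "<c3@example>"}
peer = {"<b2@example>"}

print(sorted(ids_to_send(local, peer)))  # ['<a1@example>', '<c3@example>']
```

Exchanging IDs first costs a round trip per batch, but can save
retransmitting multi-megabyte binaries the peer already holds.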

In the higher education community, deployment of lightly loaded
high-bandwidth networks such as Internet2's Abilene network (see ) has
largely eliminated concerns about accommodating Usenet traffic volumes, at
least for Usenet traffic between Internet2-connected institutions... And of
course, many I2 schools peer not only with each other, but also with local
non-I2 Usenet peers, typically via a local exchange point, thereby "sharing
the wealth," assuming you can accept one server's worth of intermediation.

For those who want to dig in and see for themselves, check out:

Note that for 02/02/02, NNTP traffic (port=119) was the hottest application
on a per-destination-port basis for the aggregation of all I2 network nodes,
running 12.4% of all octets. Of course, if you change to a per-source-port
view, Kazaa/Morpheus/FastTrack traffic (port=1214) was running fully twice
that hot, at 25+% of all octets for all network nodes. :-)