Reducing Usenet Bandwidth

Hi all,
as we all know, Usenet traffic is always increasing; a large number of
people take full feeds, which on my servers is about 35Mb/s of continuous
bandwidth in/out. That produces about 300Gb per day, of which only a small
fraction ever gets downloaded.

The question is, and apologies if I am behind the times, I'm not an expert
on news... how is it possible to reduce the bandwidth used by news:

a) Internally to a network
If I site multiple peer servers at exchange and peering points, then they
all exchange traffic and all inter- and intra-site circuits are filled to
the 35Mb/s level above.

b) Externally such as at public peering exchange points
If there are 100 networks at an exchange point and half exchange a full feed,
that's 35 x 50 x 2 = 3500Mb/s of traffic flowing across the exchange peering LAN.

For the peering point question I'm thinking of some kind of multicast scheme;
internally I've no suggestions other than perhaps only exchanging message
IDs between peer servers, hence giving back a partial feed to the local
box's external peers.
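
For illustration, exchanging only the message IDs and sending a body only on
request is roughly what NNTP streaming mode (the CHECK/TAKETHIS commands)
already does between peers. A minimal sketch in Python; the peer name and the
read_article() helper are hypothetical, and read_article() is assumed to return
CRLF-terminated, dot-stuffed article bytes:

    # Offer articles to a peer by message ID; send the body only if asked for.
    import socket

    def offer_articles(peer, msgids, read_article):
        sock = socket.create_connection((peer, 119))   # e.g. "peer.example.net"
        f = sock.makefile("rwb")
        f.readline()                                   # 200/201 greeting
        f.write(b"MODE STREAM\r\n"); f.flush()
        f.readline()                                   # 203 streaming permitted
        for msgid in msgids:
            f.write(b"CHECK %s\r\n" % msgid.encode()); f.flush()
            if not f.readline().startswith(b"238"):    # 438 = peer already has it
                continue
            f.write(b"TAKETHIS %s\r\n" % msgid.encode())
            f.write(read_article(msgid) + b".\r\n"); f.flush()
            f.readline()                               # 239 accepted / 439 rejected
        f.write(b"QUIT\r\n"); f.flush()
        sock.close()

In practice the CHECKs and their responses are pipelined rather than handled
one at a time, but the shape of the exchange is the same.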

Any thoughts?

TIA

Steve

There were people that did multicast injection of Usenet; the receiving
end then de-encapsulated the news and fed it to rnews. What happened
is that a number of people migrated to Cyclone/Typhoon and other
news transport software that did not allow this (easily).

  (most) major providers have multicast available to customers and
internally. The people running the news servers just need to create
a delivery method that allows the articles to be passed around that way
and all will be taken care of.
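
One possible shape for such a delivery method, as a hedged sketch only: join a
multicast group, treat each (small) datagram as one article, and hand it to
rnews. The group address and port below are placeholders, and a real feed would
need reassembly of multi-packet articles, ordering and loss repair, which is
where all the actual work is.

    import socket, struct, subprocess

    GROUP, PORT = "239.1.1.1", 5000                 # placeholder group/port

    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind(("", PORT))
    mreq = struct.pack("4s4s", socket.inet_aton(GROUP), socket.inet_aton("0.0.0.0"))
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_ADD_MEMBERSHIP, mreq)

    while True:
        article, _ = sock.recvfrom(65535)
        # rnews reads the article from stdin; a real receiver would batch
        # articles rather than spawn one rnews process per article.
        subprocess.run(["rnews"], input=article, check=False)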

  The disadvantage is that it would potentially allow spammers
to inject massive amounts of articles, and servers would
have to reject them based on some filtering criteria or simply
have multicast access removed for such a customer. I actually don't
see them being that bright so I wouldn't worry too much about that.

  - Jared

Abolish all binary posts / binaries groups. IMO, posting binaries to
usenet is a throwback to the days of low-speed networks and few people
having the capability to put up their own FTP/HTTP accessible files.

These days, it's just abused as a free warez network and free porn
hosting/advertising. How else can you send MS Office or 100MB of porn
jpegs or mpegs to thousands of people for basically free[1]?

[1] at least it's free to the poster....not to the networks that keep
having to build ridiculously larger news servers. The first news server I
ran for an ISP had 4GB...and that was for the OS and articles. The most
recent has 18GB for OS, 288GB for articles...and it's obviously obsolete
if others are accepting 300GB/day. When are the operators going to draw
the line and say "no more" to binaries?

Perhaps the same time the end-users (whose fees go towards paying for the
hardware, presumably) say "we don't want binaries anymore."

-- Alex Rubenstein, AR97, K2AHR, alex@nac.net, latency, Al Reuben --
-- Net Access Corporation, 800-NET-ME-36, http://www.nac.net --

Have you considered the Cidera solution?
See <http://www.cidera.com/services/usenet_news/index.shtml>
(Basically the idea is satellite broadcast direct to the edge networks.)

It would be interesting to survey the customers and see:

A) how many would care if we didn't provide usenet
B) how many would care if we didn't carry any binaries
C) how many have never heard of usenet

I suspect A and B would be small, and C large.

We are a medium-sized ISP with about 15,000 customers using connection types
from dialup to T3. We have found that at any given time we have 5-10 users
connected to our news server. Assuming the average session is at least 1-2
hours for most binaries users, and that the same people probably use news each
day, well under 1% of our users actually use Usenet. We have outsourced news
to Giganews for about 2 years and we have found it to be a far more effective
solution for our needs anyway.

-Robert

Tellurian Networks - The Ultimate Internet Connection
http://www.tellurian.com | 888-TELLURIAN | 973-300-9211
"Good will, like a good name, is got by many actions, and lost by one." - Francis Jeffrey

Actually, C would be huge, but the remainder would *all* want binaries.

The general solution seems to be to keep porn no matter what, keep all text
(a mere drip in the pool anyway), kill "monitor" and other mp3 groups, and
take down wanadoo / microsoft / other huge but useless hierarchies.

*Every* news site I am intimately familiar with needs porn to keep the users
quiet...

Our solution for a long time was Cidera in, with hole-filling for text only
(binaries that were missed were just too bad), but this was a lot of expense
that just didn't pay. Now I outsource the porn :wink:

Quoting Stephen J. Wilcox (steve@opaltelecom.co.uk):

For the peering point question I'm thinking of some kind of multicast scheme;
internally I've no suggestions other than perhaps only exchanging message
IDs between peer servers, hence giving back a partial feed to the local
box's external peers.

Any thoughts?

Ask google about "drinking from the firehose USENET multicast news",
or have a look in ftp.uu.net:/networking/news/muse.

I've no idea what became of this project. I first read the paper
when it came out and I was running/hacking on news servers. I clearly
thought it interesting enough to keep a copy of it.

James

It would not, because in the proposed protocol articles were signed
by the sender site.

The major drawback of that protocol is that it limits article size to
64KB, so it does not reduce binary traffic, which is the largest part
of a newsfeed.

Anyway, the multicasting thing doesn't really solve the problem: you still
have to transport 30Mb/s (plus) from outside your network to inside it, put
it onto a (relatively expensive) server and give it to the customers.

The economics of the whole exercise are very interesting once you get past
discussion groups (worth doing for anybody) and picture groups (worth
doing for all but the smallest ISP). Committing to a good supply for
multi-part binary groups probably should involve sitting down with one
of the company accountants (if you only have one accountant then your
company is probably too small).

Simon Lyall <simon.lyall@ihug.co.nz> writes:

> The major drawback of that protocol is that it limits article size to
> 64KB, so it does not reduce binary traffic, which is the largest part
> of a newsfeed.

In which case you just modify the protocol (or roll your own) to have
articles spread across multiple packets. It's not that hard and our guys
did this and I expect it's been done (at least) half a dozen times by
other people.

It was trivial when I did it as a testbed for my previous boss (sorry,
not open-source).

If you compress the article before sending (we used zlib, and at the
time a 1 GHz PIII (the fastest machine we could get) kept up with the
28Mbps full feed just fine), a truly startling percentage of articles
(I want to say 78%) fit in a single packet on the ethernet... 1500
bytes minus UDP encap and non-compressed header data, which includes
stuff like local article sequence number, part number for multipacket
articles, MD5 checksum of article, and message-id. Of course, those
little articles are not the ones that are eating your bandwidth, but
even so I found that to be quite interesting.
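
A rough sketch of that kind of packing, with a header layout made up for
illustration rather than the actual wire format: zlib-compress the article,
prepend an uncompressed header (sequence number, part count/number, MD5,
message-id), and split whatever doesn't fit across as many ~1472-byte UDP
payloads as it takes (1500-byte frame minus 20 bytes IP and 8 bytes UDP).

    # Illustrative only: the header layout below is a guess, not the real format.
    import hashlib, struct, zlib

    MTU_PAYLOAD = 1472          # 1500 ethernet payload - IP(20) - UDP(8)

    def packetize(seq, msgid, article):
        digest = hashlib.md5(article).digest()
        blob = zlib.compress(article)
        mid = msgid.encode()
        # uncompressed header: seq(4) parts(2) part(2) md5(16) midlen(2) + msgid
        room = MTU_PAYLOAD - struct.calcsize("!IHH16sH") - len(mid)
        pieces = [blob[i:i + room] for i in range(0, len(blob), room)] or [b""]
        packets = []
        for n, piece in enumerate(pieces):
            hdr = struct.pack("!IHH16sH", seq, len(pieces), n, digest, len(mid))
            packets.append(hdr + mid + piece)
        return packets

The receiver reassembles on (message-id, part number), verifies the MD5 over
the decompressed article, and only then hands it to the news server.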

Our software was nowhere near as complex as the UUnet software (just
point-to-multipoint), and probably quite similar to Cidera's software.
Worked pretty well over satellite too.

Of course, the dirty little not-so-secret is that for all the hacks
that people have done over the years, inter-AS multicast that doesn't
get hosed at the drop of a hat remains an elusive goal(1). Thinking
about doing multicast Usenet feeds as a way to cut down the bandwidth
in and out of your ISP overlooks the fact that reliable transport for
such things doesn't exist. I'm sorry if I've offended anyone with
this (uncharitable) assessment.

                                        ---Rob

(1): http://beaconserver.accessgrid.org:9999/

Thus spake "Robert E. Seastrom" <rs@seastrom.com>

Of course, the dirty little not-so-secret is that for all the hacks
that people have done over the years, inter-AS multicast that doesn't
get hosed at the drop of a hat remains an elusive goal(1).

But if this service works for most of the people most of the time, it's
achieved enough to justify deployment, no?

Thinking about doing multicast Usenet feeds as a way to cut down the
bandwidth in and out of your ISP overlooks the fact that reliable
transport for such things doesn't exist.

Ah, but the point of Muse was not to provide reliable news service, but to
reduce the average propagation time and provide "good enough" delivery which
could be supplemented by a traditional reliable feed.

S

Of course, the dirty little not-so-secret is that for all the hacks
that people have done over the years, inter-AS multicast that doesn't
get hosed at the drop of a hat remains an elusive goal(1). Thinking
about doing multicast Usenet feeds as a way to cut down the bandwidth
in and out of your ISP overlooks the fact that reliable transport for
such things doesn't exist. I'm sorry if I've offended anyone with
this (uncharitable) assessment.

  I don't take any immediate offense but the number of
multicast connected sites is (slowly) going up. Sprint customers
can toggle their (bgp) session from nlri unicast -> unicast multicast
in order to get mbgp routes and enable pim to do forwarding. Obviously
one needs to chat with someone to get msdp going.

  The routers don't tend to have very many multicast-related bugs anymore;
it's all people who can't configure their routers.

  Of the 100+ sessions that are in sdr/sap these days I can typically
reach (at least) 65%+ of them without any problems.

  The problem is that the "tier 2" and whatnot providers have [mostly]
missed the boat on it and don't have people that are able to configure
their routers for mbgp. (this is not to say all but most).

  There are a lot of resources for getting help in fixing and
deploying multicast these days. It'd be nice to see more people turning it
on.

  - Jared

"Stephen Sprunk" <ssprunk@cisco.com> writes:

> Thinking about doing multicast Usenet feeds as a way to cut down the
> bandwidth in and out of your ISP overlooks the fact that reliable
> transport for such things doesn't exist.

Ah, but the point of Muse was not to provide reliable news service, but to
reduce the average propagation time and provide "good enough" delivery which
could be supplemented by a traditional reliable feed.

I probably shouldn't have used the term "reliable transport" because
I'm talking about propensity for one's network to break (ie, whether
you can count on the transport being available), not reliable data
streams a la TCP. "High maintenance" in the vein of Sally Albright in
"When Harry Met Sally" is probably more appropriate.

                                        ---Rob

Jared Mauch <jared@puck.nether.net> writes:

  The routers don't tend to have very many multicast-related bugs anymore;
it's all people who can't configure their routers.

  Of the 100+ sessions that are in sdr/sap these days I can typically
reach (at least) 65%+ of them without any problems.

I'll buy that. Better than it was a year and a half or so ago, that's
for sure. But nowhere near acceptable IMHO.

                                        ---Rob

"Stephen J. Wilcox" <steve@opaltelecom.co.uk> writes:

[...]

The question is, and apologies if I am behind the times, I'm not an expert
on news... how is it possible to reduce the bandwidth used by news:

We had pretty good luck (modulo some crashing software) with caching
news servers, instead of traditional news feeds. We had a master
cache in the center of our network, and satellite caches on the edge
which connected back to the master.

We found that the vast majority of groups never got read, and the ones
that did were read consistently, so it was possible to prefetch those
groups during off-hours.
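
The prefetch itself can be tiny. A sketch of the idea, with the cache hostname,
group list and depth all placeholders: overnight, ask the local caching server
for the most recent articles in the consistently-read groups, so the cache
faults them in from the upstream feed before anyone is awake to wait for it.

    import nntplib                               # Python's bundled NNTP client

    CACHE_HOST = "newscache.example.net"         # hypothetical local cache
    POPULAR = ["comp.protocols.tcp-ip", "news.software.nntp"]
    DEPTH = 200                                  # recent articles per group

    srv = nntplib.NNTP(CACHE_HOST)
    for group in POPULAR:
        _, count, first, last, _ = srv.group(group)
        start = max(first, last - DEPTH + 1)
        for num in range(start, last + 1):
            try:
                srv.article(str(num))            # fetch and discard; cache is warm
            except nntplib.NNTPError:
                pass                             # expired or cancelled article
    srv.quit()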

We convinced a commercial newsfeed to charge us by bandwidth instead
of simultaneous readers, had pretty good service (more points of
failure, so more downtime, but still pretty good...), and saved a lot
of bandwidth, disk space, engineer time, and money.

If we'd had time to get the bugs worked out (the software crashed
multiple times per day, and then our ISP was purchased), it would have
been a perfect system.

-------ScottG.

steve@opaltelecom.co.uk ("Stephen J. Wilcox") writes:

as we all know, Usenet traffic is always increasing; a large number of
people take full feeds, which on my servers is about 35Mb/s of continuous
bandwidth in/out. That produces about 300Gb per day, of which only a small
fraction ever gets downloaded.

The question is, and apologies if I am behind the times, I'm not an expert
on news... how is it possible to reduce the bandwidth used by news:

Pull it, rather than pushing it. nntpcache is a localized example of how
to only transfer the groups and articles that somebody on your end of a
link actually wants to read. A more systemic example ought to be developed
whereby every group has a well-mirrored home and an nntpcache hierarchy
similar to what Squid proposed for web data, and every news reader pulls
only what it needs. Posting an article should mean getting it into the
well-mirrored home of that group. Removing spam should mean deleting
articles from the well-mirrored home of that group.
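
A fetch-on-miss cache in that spirit is only a few lines. In the sketch below
the home-server mapping, hostnames and spool layout are all invented for
illustration: serve from the local spool when the article is already there,
otherwise pull it once from the group's well-mirrored home and keep a copy.

    import hashlib, os
    import nntplib

    SPOOL = "/var/spool/newscache"                       # hypothetical spool
    HOMES = {"comp": "news-home-comp.example.org",       # hypothetical "homes"
             "rec": "news-home-rec.example.org"}

    def spool_path(msgid):
        return os.path.join(SPOOL, hashlib.sha1(msgid.encode()).hexdigest())

    def get_article(group, msgid):
        path = spool_path(msgid)
        if os.path.exists(path):                         # hit: no upstream traffic
            with open(path, "rb") as f:
                return f.read()
        home = HOMES.get(group.split(".")[0], "news-home-misc.example.org")
        srv = nntplib.NNTP(home)
        _, info = srv.article(msgid)                     # miss: pull once, keep it
        srv.quit()
        data = b"\r\n".join(info.lines) + b"\r\n"
        with open(path, "wb") as f:
            f.write(data)
        return data

Groups nobody asks for never cross the link at all, which is the whole point.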

Pushing netnews, with or without multicast, with or without binaries, is
just unthinkable at today's volumes but we do it anyway. The effects of
increased volume have decreased the utilization of netnews as a medium
amongst my various friends. Pushing netnews after another three or four
doublings is so far beyond the sane/insane boundary that I just know it
won't happen, Moore or not. It's well and truly past time to pull it
rather than push it.

Once upon a time, Jared Mauch <jared@puck.Nether.net> said:

  The problem is that the "tier 2" and whatnot providers have [mostly]
missed the boat on it and don't have people that are able to configure
their routers for mbgp. (this is not to say all but most).

  There are a lot of resources for getting help in fixing and
deploying multicast these days. It'd be nice to see more people turning it
on.

Okay, I'm with a (decent sized, although probably small by NANOG
measures) ISP. Why do I want to turn on multicast (what does it get me
that I don't have today)? And if I want to turn it on, where can I find
the resources you mention?

These are meant as honest questions (not intended as rhetorical or
argumentative questions). I want to understand why I should do this.

Pull it, rather than pushing it. nntpcache is a localized example of how

[...]

Proposed by someone every couple of months for the last 10 years (at
least). The current software (diablo especially) even supports it to a
good extent; however, nobody is doing it for some reason.

Pushing netnews, with or without multicast, with or without binaries, is
just unthinkable at today's volumes but we do it anyway. The effects of
increased volume have decreased the utilization of netnews as a medium
amongst my various friends.

Totally wrong on the non-binaries feed bit. A non-binaries feed is around
1-2GB per day, or 100-200kb/s, which is below the noise level for anyone on
this list. Even on the semi-third-world wages I make I could afford a
non-binaries feed to my house and archive it for less than I spend on
lunches.

Binaries, on the other hand, are completely different; most people can't
afford them and we are moving to a centralized model with the Supernews-type
companies being the only ones with full feeds out there.

I am really surprised that the RIAA and similar groups haven't "gone after"
usenet to any great degree yet. I can't really see how binaries newsgroups
differ to any great extent (from the copyright angle) from your random
p2p network.

Once a few lawsuits are issued (does the ISC count as a distributor?)
against the dozen or so top news providers, things could be quite
interesting.