RBL for bots?

Has anyone created an RBL, much like (possibly) the BOGON list which includes the IP addresses of hosts which seem to be “infected” and are attempting to brute-force SSH/HTTP, etc?

It would be fairly easy to setup a dozen or more honeypots and examine the logs in order to create an initial list.
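(A minimal sketch of the kind of log mining that could seed such a list, assuming
OpenSSH-style "Failed password ... from <ip>" lines in a honeypot's auth.log; the
file path, pattern, and 10-attempt threshold are illustrative assumptions, not a
tested recipe:)

```python
#!/usr/bin/env python3
"""Aggregate brute-force sources from a honeypot's SSH auth log (sketch)."""
import re
import sys
from collections import Counter

# Matches standard OpenSSH failure lines, e.g.
# "Failed password for invalid user admin from 192.0.2.10 port 54321 ssh2"
FAILED = re.compile(r"Failed password for .* from (\d{1,3}(?:\.\d{1,3}){3})")

def offenders(log_path, threshold=10):
    """Return IPs with at least `threshold` failed login attempts."""
    counts = Counter()
    with open(log_path) as fh:
        for line in fh:
            match = FAILED.search(line)
            if match:
                counts[match.group(1)] += 1
    return sorted(ip for ip, hits in counts.items() if hits >= threshold)

if __name__ == "__main__":
    path = sys.argv[1] if len(sys.argv) > 1 else "/var/log/auth.log"
    for ip in offenders(path):
        print(ip)
```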

Anyone know of anything like this?

-Drew

A large percentage of those bots are in DHCP'ed cable/dsl blocks. As such,
there's 2 questions:

1) How important is it that you not false-positive an IP that's listed because
some *previous* owner of the address was pwned?

2) How important is it that you even accept connections from *anywhere* in
that DHCP block?

(Note that there *are* fairly good RBL's of DHCP/dsl/cable blocks out there.
So it really *is* a question of why those aren't suitable for use in your
application...)

Bots are rarely single-purpose engines. If they have been detected doing bad things, they will probably appear in multiple RBLs for multiple
reasons. If something is in multiple RBLs, even if it hasn't done the particular badness you are looking for, it's probably just a matter of time.
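(As a rough sketch of that heuristic: the usual way to test an address against
several DNSBLs is to reverse its octets, append each zone, and see which lookups
resolve. The zones below are examples only; substitute whatever lists fit your
own policy.)

```python
#!/usr/bin/env python3
"""Count how many DNSBLs list a given IPv4 address (sketch)."""
import socket

# Example zones only; pick lists appropriate to your own policy.
ZONES = ["zen.spamhaus.org", "bl.spamcop.net", "dnsbl.sorbs.net"]

def listings(ip):
    """Return the zones listing `ip` (an A record means listed, NXDOMAIN means not)."""
    query_prefix = ".".join(reversed(ip.split(".")))
    hits = []
    for zone in ZONES:
        try:
            socket.gethostbyname(f"{query_prefix}.{zone}")
            hits.append(zone)
        except socket.gaierror:
            pass  # not listed in this zone
    return hits

if __name__ == "__main__":
    ip = "192.0.2.1"  # replace with the address in question
    hits = listings(ip)
    print(f"{ip} is listed in {len(hits)} zone(s): {', '.join(hits) or 'none'}")
```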

Perhaps not surprisingly, some of the porn site vendors appear to have the most sophisticated systems for detecting brute-force/password-sharing
attacks.

    Has anyone created an RBL, much like (possibly) the BOGON list which
includes the IP addresses of hosts which seem to be "infected" and are
attempting to brute-force SSH/HTTP, etc?

It would be fairly easy to setup a dozen or more honeypots and examine
the logs in order to create an initial list.

A large percentage of those bots are in DHCP'ed cable/dsl blocks. As such,
there's 2 questions:

1) How important is it that you not false-positive an IP that's listed because
some *previous* owner of the address was pwned?

2) How important is it that you even accept connections from *anywhere* in
that DHCP block?

That depends...

Do you sell "Internet service" to your customers, or something else? If
the former, then they're actually paying to receive connections from
anywhere...

Then the RBL is irrelevant, as "anywhere" isn't the same as "anywhere that
isn't in an RBL". :slight_smile:

(And anyhow, I'd *hope* that any use of an RBL to filter things on behalf of
a customer was spelled out in the contract, at least in the fine print that
most Joe Sixpacks never bother reading, specifically to cover that issue...)

Drew Weaver wrote:

    Has anyone created an RBL, much like (possibly) the BOGON list which includes the IP addresses of hosts which seem to be "infected" and are attempting to brute-force SSH/HTTP, etc?
It would be fairly easy to setup a dozen or more honeypots and examine the logs in order to create an initial list.
Anyone know of anything like this?

web.dnsbl.sorbs.net lists hosts that do this, as well as Korgo-infected machines and a whole host of other types of vulnerabilities, trojans, and bots.

Do be careful about how you use the data, we don't distinguish between the types for very good reason.

Regards,

Mat

> Has anyone created an RBL, much like (possibly) the BOGON list which
> includes the IP addresses of hosts which seem to be "infected" and are
> attempting to brute-force SSH/HTTP, etc?

No BL for bots other than SMTP zombies quite yet.

There is one for SSH brute-forcing, although home-made.. J. will respond on
his own...

> It would be fairly easy to setup a dozen or more honeypots and examine
> the logs in order to create an initial list.

A large percentage of those bots are in DHCP'ed cable/dsl blocks. As such,
there's 2 questions:

Quite right, which is why ...

1) How important is it that you not false-positive an IP that's listed because
some *previous* owner of the address was pwned?

As in, a dynamic-ranges BL.

2) How important is it that you even accept connections from *anywhere* in
that DHCP block?

Or maybe the cool concept of white-listing known senders? :slight_smile:

(Note that there *are* fairly good RBL's of DHCP/dsl/cable blocks out there.
So it really *is* a question of why those aren't suitable for use in your
application...)

Many of them are SMTP-based only. IP reputation is very limited still.

Now, all that said, back on "most are broadband users" - no longer
true. Many bots (especially in spam) are now web servers.

  Gadi.

I'm willing to bet that most are *still* broadband users. Quite likely,
even if 100% (yes, *every single last one*) of the "web servers" out there
were botted, that would likely still be less systems than if only 5% of end-user
systems were botted. Just a little while back, Vint Cerf guesstimated that
there's 140 million botted end user boxes. Unless 100% of Google's servers
are botted, there's no way there's that many botted servers. :slight_smile:

And the fact that web servers are getting botted is just the cycle of
reincarnation - it wasn't that long ago that .edu's had a reputation of
getting pwned for the exact same reasons that webservers are targets now:
easy to attack, and usually lots of bang-for-buck in pipe size and similar.

> Many of them are SMTP-based only. IP reputation is very limited still.
>
> Now, all that said, back on "most are broadband users" - no longer
> true. Many bots (especially in spam) are now web servers.

I'm willing to bet that most are *still* broadband users. Quite likely,

Oh, safe bet. :slight_smile:

even if 100% (yes, *every single last one*) of the "web servers" out there
were botted, that would likely still be less systems than if only 5% of end-user

But not less spam? :slight_smile:
I seriously doubt more spam is sent from web servers than user systems,
but it's changing. Web servers now play a part which we can notice and
measure.

systems were botted. Just a little while back, Vint Cerf guesstimated that
there's 140 million botted end user boxes. Unless 100% of Google's servers
are botted, there's no way there's that many botted servers. :slight_smile:

I kept quiet on this for a while, but honestly, I appreciate Vint Cerf
mentioning this where he did, and raising awareness among people who can
potentially help us solve the problem of the Internet.

Still, although I kept quiet for a while, us so-called "botnet
experts" gotta ask: where does he get his numbers? I would appreciate some
backing up to these or I'd be forced to call him up on his statement.

My belief is that it is much worse. I am capable of proving only somewhat
worse. His numbers are still staggering so.. where why when how what? (not
necessarily in that order).

So, data please Vint/Google.

And the fact that web servers are getting botted is just the cycle of
reincarnation - it wasn't that long ago that .edu's had a reputation of
getting pwned for the exact same reasons that webservers are targets now:
easy to attack, and usually lots of bang-for-buck in pipe size and similar.

You mean they aren't now? Do we have any EDU admins around who want to
tell us how bad it still is, despite attempts at working on this?

Dorms are basically large honey nets. :slight_smile:

  Gadi.

systems were botted. Just a little while back, Vint Cerf guesstimated that
there’s 140 million botted end user boxes. Unless 100% of Google’s servers
are botted, there’s no way there’s that many botted servers. :slight_smile:

I kept quiet on this for a while, but honestly, I appreciate Vint Cerf
mentioning this where he did, and raising awareness among people who can
potentially help us solve the problem of the Internet.

Still, although I kept quiet for a while, us so-called “botnet
experts” gotta ask: where does he get his numbers? I would appreciate some
backing up to these or I’d be forced to call him up on his statement.

My belief is that it is much worse. I am capable of proving only somewhat
worse. His numbers are still staggering so… where why when how what? (not
necessarily in that order).

So, data please Vint/Google.

Dr. Cerf wasn’t speaking for Google when he said this, so I’m not sure why you’re looking that direction for answers. But since you ask, his data came from informal conversations with A/V companies and folks actually in the trenches of dealing with botnet ddos mitigation. The numbers weren’t taken from any sort of scientific study, and they were in fact mis-quoted (he said more like 10%-20%).

so you go ahead and call him on it, Gadi; you’re a “botnet expert” after all.

And the fact that web servers are getting botted is just the cycle of
reincarnation - it wasn’t that long ago that .edu’s had a reputation of
getting pwned for the exact same reasons that webservers are targets now:
easy to attack, and usually lots of bang-for-buck in pipe size and similar.

You mean they aren’t now? Do we have any EDU admins around who want to
tell us how bad it still is, despite attempts at working on this?

Dorms are basically large honey nets. :slight_smile:

Spoken like someone who’s not actually spent time cleaning up a resnet. Cleaning up a resnet must look downright impossible when you spend so much time organizing conferences.

(my opinions != my employer’s, etc. etc.)

Cheers,
.peter

> I kept quiet on this for a while, but honestly, I appreciate Vint Cerf
> mentioning this where he did, and raising awareness among people who can
> potentially help us solve the problem of the Internet.
>
> Still, although I kept quiet for a while, us so-called "botnet
> experts" gotta ask: where does he get his numbers? I would appreciate some
> backing up to these or I'd be forced to call him up on his statement.
>
> My belief is that it is much worse. I am capable of proving only somewhat
> worse. His numbers are still staggering so.. where why when how what? (not
> necessarily in that order).
>
> So, data please Vint/Google.

Dr. Cerf wasn't speaking for Google when he said this, so I'm not sure why

Okay, thanks for clarifying that. :slight_smile:

you're looking that direction for answers. But since you ask, his data came
from informal conversations with A/V companies and folks actually in the

Interesting.

trenches of dealing with botnet ddos mitigation. The numbers weren't taken

Botnet trenches? Yes, I suppose the analogy to World War I is correct. I
should know, I was there (metaphorically speaking). My guess is, if we are
to follow this analogy, we are now, in 2007, just before the invention of
the tank, but oh well.

from any sort of scientific study, and they were in fact mis-quoted (he said
more like 10%-20%).

Interesting.

<snip poison>

<snip more poison>

(my opinions != my employer's, etc. etc.)

Many thanks,

Cheers,
.peter

  Gadi.

> And the fact that web servers are getting botted is just the cycle of
> reincarnation - it wasn't that long ago that .edu's had a reputation of
> getting pwned for the exact same reasons that webservers are targets now:
> easy to attack, and usually lots of bang-for-buck in pipe size and similar.

You mean they aren't now? Do we have any EDU admins around who want to
tell us how bad it still is, despite attempts at working on this?

OK, I'll bite. :slight_smile:

We point them at info:

http://www.computing.vt.edu/help_and_tutorials/getting_started/students.html

and give them a free CD that does all the heavy lifting for them:

http://www.antivirus.vt.edu/proactive/vtnet2006.asp

(And if you live in the dorms, the CD is *sitting there* on the table when
you get there - and the network jack has a little tape cover that reminds
you to use the CD first...)

Oh, and they also get to attend our "Don't be an online victim" presentation
during orientation, and most (if not all) of the residence halls have their
own official resident tech geek (it's amazingly easy to find people who
are willing to help people on their floor in exchange for a single room
rather than a double) :wink:

And after all that, at any given instant, there's probably several dozen botted
boxes hiding in our 2 /16s - there's a limit to what you can do to stop users
from getting themselves botted when it's their box, not yours. And there's
political expediency limits to what you can do to detect a botted box and take
action before it actually does anything.

What's changed over the past few years is that a number of years ago, the
end-user part of the Internet was /16s of .edu space with good bandwidth
interspersed with /18s of dial-up 56K modem pools, so .edu space was an
attractive target. Now the /18s of dial-ups are /12s of cablemodems and DSL,
and *everyplace* is the same attractive swamp that .edu's used to be.

And most ISPs don't provide in-house tech support and an orientation lecture
when you sign up - though some *do* provide the free A/V these days. :slight_smile:

Bottom line - there's cleaner /16s than ours. There's swampier. What's changed
is that in addition to Joe Freshman being online, Joe's parents and kid sister
are online too. I have *some* control over Joe - the other 3 are Somebody
Else's Problem, and all I can do is hope they use an ISP that's learned that
you can actually get a positive ROI on up-front investing in security.
Unfortunately, Vint tells me that 140 million of them are all over at that
*other* ISP. :wink:

Dorms are basically large honey nets. :slight_smile:

Are there any globally-routed /24s that *aren't*, these days? :wink:

Working a day on the help desk at the *other* ISPs, whichever ISP you
want to point fingers at, is always an eye-opening experience.

Even when you think things should be the same, they sometimes have very
different problems to solve.

> And most ISPs don't provide in-house tech support and an orientation lecture
> when you sign up - though some *do* provide the free A/V these days. :slight_smile:

Working a day on the help desk at the *other* ISPs, whichever ISP you
want to point fingers at, is always an eye-opening experience.

I hear enough from people who *do* work at Some Other Place. :slight_smile:

Even when you think things should be the same, they sometimes have very
different problems to solve.

Never claimed *our* solution would work everywhere (heck, I even admit it
isn't 100% effective for *us*). A very large chunk of what *we* do would be
doomed to failure at any organization where the problem set includes "make a
profit selling connectivity to cost-conscious general consumers".

I just often wish Vint's 140 million would switch to Some Other ISP where
the traffic I see from them didn't cause operational issues for *my*
organization. (And yes, that was carefully phrased - there's multiple
solutions that work for customer and ISP *and* get them off my radar. But
there's no *single* workable solution.)

Heya,

> And the fact that web servers are getting botted is just the cycle of
> reincarnation - it wasn't that long ago that .edu's had a reputation of
> getting pwned for the exact same reasons that webservers are targets now:
> easy to attack, and usually lots of bang-for-buck in pipe size and similar.

You mean they aren't now? Do we have any EDU admins around who want to
tell us how bad it still is, despite attempts at working on this?

Dorms are basically large honey nets. :slight_smile:

I run the network for a University with about 12,000 students and 12,000
computers in our dormitories. We, like many other Universities, have spent the
last five or six years putting systems in place that are both reactive and
preventative. From my perspective, the issues are still there but I'm not
sure that I agree with your implications.

Do we still have "compromised" systems? Yes.
Is the number of "compromised" systems at any time large? No.
Is the situation out of control? No.

Email me off-list if you want more details. IMHO, it's too bad broadband
providers have not yet picked up on what the Universities have done.

Eric :slight_smile:

Heya,

> > And the fact that web servers are getting botted is just the cycle of
> > reincarnation - it wasn't that long ago that .edu's had a reputation of
> > getting pwned for the exact same reasons that webservers are targets now:
> > easy to attack, and usually lots of bang-for-buck in pipe size and similar.
>
> You mean they aren't now? Do we have any EDU admins around who want to
> tell us how bad it still is, despite attempts at working on this?
>
> Dorms are basically large honey nets. :slight_smile:

I run the network for a University with about 12,000 students and 12,000
computers in our dormitories. We, like many other Universities, have spent the
last five or six years putting systems in place that are both reactive and
preventative. From my perspective, the issues are still there but I'm not
sure that I agree with your implications.

Do we still have "compromised" systems? Yes.
Is the number of "compromised" systems at any time large? No.
Is the situation out of control? No.

Email me off-list if you want more details. IMHO, it's too bad broadband

Will do, and also below...

providers have not yet picked up on what the Universities have done.

Thank you Eric. :slight_smile:

Can you elaborate a bit on what universities have done which would be
relevant to service providers here?

I hear enough from people who *do* work at Some Other Place. :slight_smile:

Hearing about it is not the same as experiencing it first-hand.

Never claimed *our* solution would work everywhere (heck, I even admit it
isn't 100% effective for *us*). A very large chunk of what *we* do would be
doomed to failure at any organization where the problem set includes "make a
profit selling connectivity to cost-conscious general consumers".

The Other ISPs do all of the things you mentioned, except they don't give
their techs free rooms. Instead they give out $50 or $100 gift cards for in-home or in-store techs from several consumer electronics chains to fix customer computers, which may be similar to the level of expertise you
would get from unpaid residential dorm techs. However, the environments and populations aren't necessarily comparable.

Understanding why those things have been doomed to failure is an important difference. It isn't because ISPs are unwilling to try them; it's
because ISPs have tried those things (and many other things). They
fail not because of the cost side of the equation, but because they don't have much effect on the problem over the long term in that environment and population.

If someone (vendor, academic, etc.) comes up with something that works well for the environment and population facing the general public ISP,
there are a lot of ISPs with money constantly asking what they can buy/pay/do to fix it. However, they are also very skeptical, because
this is a well-travelled road, and they've seen a lot of claims that
didn't pan out.

Hear, hear. It's also too bad that there are still so many .edus without
rDNS that identifies their resnets and dynamic/anonymous space easily,
though the situation seems to be improving. Not knowing which .edu is
yours, I'll refrain from further comment, but I will give some examples
from some that I know about:

Good examples:
[0-9a-z\-]+\.[0-9a-z\-]+\.resnet\.ubc\.ca
[0-9a-z\-]+\.[0-9a-z]+\.resnet\.yorku\.ca
ip\-[0-9]+\.student\.appstate\.edu
r[0-9]+\.resnet\.cornell\.edu
ip\-[0-9]+\-[0-9]+\.resnet\.emich\.edu
[0-9a-z\-]+\.resnet\.emory\.edu
dynamic\-[0-9]+\-[0-9]+\.dorm\.natpool\.uc\.edu

Bad examples:
resnet\-[0-9]+\.saultc\.on\.ca
[0-9a-z\-]+\.(brooks|camp|congdon|cubley|graham|hamlin|moore|powers|price|townhouse|woodstock)\.clarkson\.edu
[a-z]+\.(andr|carm|ford|laws|stev|thom|ucrt)[0-9]+\.eiu\.edu
(linden|parkave|ruthdorm|ucrt|village)[0-9a-z]+\-[0-9a-z]+\.fdu\.edu
resnet[0-9]+\.saintmarys\.edu
[0-9a-z\-]+(aolcom|uncgedu)\.uncg\.edu **
(l[0-9]+stf|bl)[0-9]+\.bluford\.ncat\.edu

The general idea is, as has been mentioned before, to use a naming
convention that can easily be blocked in sendmail and other MTAs by the
simple addition of a domain tail or substring to an ACL, such as
'resnet.miskatonic.edu' or 'dyn.miskatonic.edu'. As interesting as it can
be to explore the campus map trying to figure out whether a given DNS
token represents a lab, the administration building, the faculty lounge,
or a dorm, over and over again, there's gotta be some activity that is
more rewarding in the long run, such as skeet shooting or helping people
disinfect their computers (or, joy of joys - both simultaneously!)
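(A small sketch of the suffix test such a convention makes possible; the tails
below reuse the hypothetical miskatonic.edu names from above, and a real MTA
would do this via its own access map or ACL rather than an external script:)

```python
#!/usr/bin/env python3
"""Check whether a client IP's rDNS falls under a known resnet/dynamic tail (sketch)."""
import socket

# Hypothetical domain tails; a real deployment would maintain its own list.
DYNAMIC_TAILS = (".resnet.miskatonic.edu", ".dyn.miskatonic.edu")

def looks_dynamic(ip):
    """True if the PTR for `ip` ends in a known dynamic tail (or is missing)."""
    try:
        ptr = socket.gethostbyaddr(ip)[0].lower()
    except (socket.herror, socket.gaierror):
        return True  # no usable rDNS is often treated as dynamic/suspect anyway
    return ptr.endswith(DYNAMIC_TAILS)

if __name__ == "__main__":
    print(looks_dynamic("192.0.2.25"))  # documentation address; expect no PTR
```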

** I'd like to single out uncg.edu for special ridicule here - I hope
they're still not doing this, but at one point over the last three years
at least, their DHCP addresses were comprised of the end user's email
address, sans '.' and '@', AS THE HOSTNAME in an otherwise non-subdomained
whole:

e.g., 'britney1986@aol.com' got the hostname 'britney1986aolcom.uncg.edu',
'billg@uncg.edu' got 'billguncgedu.uncg.edu', etc.

I'm sure the spammers who plague uncg.edu today didn't get their entire
computer-literate student body's addresses through an rDNS scan. After
all, not /all/ of the addresses were in uncg.edu. The rest were in AOLland
or at hotmail or a few other obvious freemail providers.

Then I think they're too small -- actually, I thought 140M was also
too small, but plausible.

A couple of years ago, I had a series of conversations with some people
who have insight into very large system populations. The question at
hand was "how many zombie'd boxes are out there?" and was intended to
yield some concept of how distributed the spam problem had become.

We kept in mind the following: (a) zombies which do nothing observable
will escape external detection; (b) zombies which do things, but direct
those things against hosts that aren't paying attention, will also escape
external detection; and (c) zombies which do things, and direct those
things against hosts that are paying attention, but which are sufficiently
clever about how they do it, will also escape external detection.

Everyone used their own methods and reasoning. We concurred that there were
probably on the order of ~100M zombies *just based on the spam we were
seeing*, i.e. ignoring everything else. (As in "order of magnitude".
I thought the number was perhaps 50% low; others thought it was
perhaps 50% high. So call it a ballpark estimate, no better.)

That was during the spring of 2005. I can't think of anything that's
happened since then to give me the slightest reason to think the number's
gone down. I can think of a lot of reasons to think the number's gone up.

I suggest everyone run their own experiment. Deploy something that does
passive OS fingerprinting (e.g. OpenBSD's pf) and just look at SMTP:
then correlate (a) whether the host tried to deliver spam or not
(b) detected OS type and (c) rDNS (if any exists). If you want to
fold in data from ssh brute-force attempts and the like, sure, go ahead.
Let it run for a month and collate results.
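(For anyone who wants a starting point, here is one possible shape for the
collation step, assuming you have already exported the passive-fingerprint
observations to a CSV of ip,os_guess and the spam-sending IPs from your mail
logs to a plain list; both file formats are assumptions for illustration, not
any tool's native output:)

```python
#!/usr/bin/env python3
"""Join passive-OS-fingerprint data with observed spam sources (sketch)."""
import csv
import socket

def correlate(fingerprints_csv, spammers_txt):
    """Print OS guess and rDNS for every fingerprinted IP that also sent spam."""
    with open(spammers_txt) as fh:
        spammers = {line.strip() for line in fh if line.strip()}
    with open(fingerprints_csv, newline="") as fh:
        for ip, os_guess in csv.reader(fh):
            if ip not in spammers:
                continue
            try:
                rdns = socket.gethostbyaddr(ip)[0]
            except (socket.herror, socket.gaierror):
                rdns = "(no rDNS)"
            print(f"{ip}\t{os_guess}\t{rdns}")

if __name__ == "__main__":
    correlate("fingerprints.csv", "spammers.txt")  # hypothetical filenames
```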

Alternatively, look at SYN packet rates and destination diversity for
outbound port 25 connections from those portions of your own networks
ostensibly populated with end users. Compare to what "normal" should
look like.
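(And a sketch of that second variant, assuming flow records exported to a CSV of
src_ip,dst_ip,dst_port from whatever collector you use; the 50-destination
threshold is a placeholder for whatever "normal" turns out to be on your network.
A legitimate end-user host talks to a handful of smarthosts; a spam zombie talks
to hundreds of distinct MXes:)

```python
#!/usr/bin/env python3
"""Flag sources with unusually many distinct outbound SMTP destinations (sketch)."""
import csv
import sys
from collections import defaultdict

def diversity(flows_csv, threshold=50):
    """Map each source IP to its distinct port-25 destination count, if above `threshold`."""
    dests = defaultdict(set)
    with open(flows_csv, newline="") as fh:
        for src, dst, dport in csv.reader(fh):
            if dport == "25":
                dests[src].add(dst)
    return {src: len(d) for src, d in dests.items() if len(d) >= threshold}

if __name__ == "__main__":
    for src, count in sorted(diversity(sys.argv[1]).items(), key=lambda kv: -kv[1]):
        print(f"{src}\t{count} distinct SMTP destinations")
```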

I've concluded three things (by doing experiments like that). (a) Where
there are Windows boxes, there are zombies. "Securing Microsoft operating
systems adequately for use on the Internet" is not a solved problem in
computing. (b) As of the moment, "the spam problem" nearly equates to "the
Microsoft insecurity problem". (Yes, there are non-Windows spam-sending
hosts, but most of those seem to be dedicated spammer servers, quickly
identified and blacklisted, thus not a serious threat to anyone who's
using a sane combination of DNSBLs.) (c) Amusingly, it's possible
to detect new end-user allocations and service rollouts by noting when
spam starts to arrive from them. (e.g. the Verizon FIOS deployment, if I
may use hostnames of the form *.fios.verizon.net as a guide, is going
well in NYC, Dallas, DC, Tampa, Philly, LA, Boston and Newark, but lags
behind in Seattle, Pittsburgh, Buffalo and Syracuse.)

---Rsk

We have similar problems here

I can talk off-net about the remediation tools and systems we use here, many of which are cheap and applicable to a service provider environment, as most large edu's are more
comparable to a small-town service provider than to an enterprise network.

We recently upgraded our DHCP/DNS system to the solution from vendor 'C'. As part of this, the general user systems were renamed; this of course included the resnet systems:

i.e. dhcp-0123456-78-10.[student|client].domain.edu

Steven Champeon wrote: