Research - Valid Data Gathering vs. Annoying Others

Date: Fri, 6 Aug 2004 14:09:01 -0400 (EDT)
From: John K Lerchey <lerchey@andrew.cmu.edu>
To: nanog@merit.edu
Subject: Research - Valid Data Gathering vs Annoying Others

Hi NANOG folks,

We have a situation (which has come up in the past) that I'd like some
opinions on.

[[.. $ mount /dev/soapbox # you have been warned. ..]]

Periodically, we have researchers who develop projects which will do
things like randomly port probe off-campus addresses. The most recent
instance of this is a group studying "bottlenecks" on the internet. Thus,
they hit hosts (again, semi-randomly) on both the commodity internet and
on I2 (Abilene) to look for places where there is "traffic congestion".

The problem is that many of their "random targets" consider the probes to
be either malicious in nature, or outright attacks.

Why not? "Their network, *THEIR* rules."

*HOW* is one supposed to tell a 'benign' probe from a 'hostile' one,
when it is addressed to a machine that doesn't exist, or to a 'service'
that doesn't exist on an existent machine?

With all the 'overtly hostile' traffic out there, why on earth would anyone
consider that, with regard to 'unexpected'/'abnormal' traffic, there should
be _any_ 'expectation of innocence'?

Surely you don't think that the 'recipient' needs to do a _complete_analysis_
of "what was being attempted, and why" -- including making a determination of
the 'intentions' of the perpetrator -- for -every- 'unauthorized' attempt
to use their network, before complaining about the fact of an attempt at
'unauthorized use'?

I have a very _simple_ rule -- if it isn't intended for a service I make
available, on a machine I let the world have access to, then it is, _by_
definition_, an attempt to access that machine 'without authorization, or
in excess of the authorization granted'. Because the -only- 'authorized'
use is those things which I expressly let past my firewall. Ergo, if
the firewall blocks it, it _IS_ an 'unauthorized access' attempt.

Whereupon, 18 USC 1030 (b) becomes *very* relevant, given the language
of 18 USC 1030 (a) (2) (C). The minimum penalty is 'up to a year imprisonment';
given any 'extenuating circumstances', it could be up to 20 years.

On my _personal_ network, at home (a /29 -- big wow:), I currently see
well over FIFTEEN THOUSAND unauthorized probes per day. Of those, a
*maximum* of 1-in-four-thousand *might* "possibly" be legitimate.

I give people the 'benefit of the doubt', and assume that these probes
are coming from virus-infected (unbeknownst to the owner) machines, rather
than 'deliberate, with malice aforethought' hacking attempts by the machine's
owner.

HOWEVER, that notwithstanding, *EVERY*ONE* gets reported to the responsible
_network_operator_ -- as an 'apparent virus-infected machine on your network',
with the relevant supporting documentation, and a simple request that the
machine be disabled from external network access until it has been sterilized
and secured against further infection.

The reporting is mostly to help the other operators keep _their_ networks
clean. And to get those machines off-line -- so that they cannot infect
other 'unprotected' machines. I'm confident _my_ network is adequately
protected. <grin>

Note: I "don't care" _what_ the 'name' of the machine is -- I don't even
check for rDNS, I look up the registered netblock _owner_ of the IP address,
at the RIR. And THAT is where the complaint reports go.

As a result of this, we, of course, get complaints.

Deservedly so.

One suggestion that I received from a co-worker to help to mitigate this is
to have the researchers run the experiments off of a www host, and to have
the default page explain the experiment and also provide contact info.

People are supposed to 'take it on faith' that what the website _says_ about
what is going on _is_ what is *actually* happening?

I hope you don't mind if I laugh -- Computerized 'social engineering', in
an attempt to deflect complaints, _is_ a humorous concept.

Do you *really* think that anybody is going to bother to go look to see
_what_ the source system 'claims' is the reason it is doing what it is
doing?

If the traffic isn't a webserver _response_, then the fact that it comes
from a machine named 'www.{something}' just means that something *unrelated*
to the webserver is also running on that machine. And, therefore, no reason
to believe that the webserver at that (coincidentally same) address would
have any information whatsoever about the 'offensive' behavior observed.

I wouldn't even know *IF* an 'offending' machine had such a name. I don't
do rDNS look-ups on any of the addresses I send complaints off about.

We also discussed having the researchers contact ISPs and other large
providers to see if they can get permission to use addresses in their
space as targets, and then providing the ISPs with info from the testing.

This is one of two _good_ approaches. "Get Permission. *FIRST*"

How do you view the issue of experiments that probe random sites? Should
this be accepted as "reasonable", or should it be disallowed? Something
in between?

"Private property is *private* property." The Internet consists
*exclusively* of private property. Those who own the property get to
make the rules for -their- property. What 'everybody else' thinks are
'appropriate' rules is immaterial to how they run -their- property. (Well,
except that if 'everybody else' doesn't like the way you run your property,
they *are* free to choose to not let you visit _their_ property. :)

If _I_ say that thus-and-such is an 'objectionable use' of *my* property,
nobody, but *nobody*, has any standing to contradict me.

Virtually _every_ AUP says that 'your' use of 'foreign' networks is subject
to what _they_ (the foreign network operator) deems to be 'acceptable' use
of *their* network.

The fact that complaints _are_ being generated is *proof* that they do not
think that such is 'acceptable use'. And that, therefore, the perpetrators
(despite 'good intentions') *ARE*, in all likelihood, in violation of _their_
_own_ TOS/AUP.

What other suggestions might you have about how such experiments could be
run without triggering alarms?

That is *easy*. TRIVIALLY EASY. _rent_ a node on those foreign networks.
run probes _to_ the hosts *you* control. (This is the second good approach:
"Buy access.")

Voila! "No problem."

From a pure philosophical standpoint, 'random testing' is no different
than "spamming". Both rely on the use of "other people's resources",
*WITHOUT* the consent/permission of those other people, and without covering
the costs of the resources involved.

Since the 'testee' is paying for fully half of the costs of the testing,
they must be consulted _in_advance_.

If you want to claim that the testing "isn't wrong" because it only costs
any testee an 'insignificant' amount, you had better be prepared to accept
all the traffic from the spammers who use exactly the same 'defense'.

Executive summary:
   Method of choice: "Get Permission. *FIRST*."
   If that fails, try: "Buy Access."
   If =that= fails, then "Don't Do it!"

perhaps those more appreciative of the operational utility
of current networking research, and with some clue as to
the difficulties of gathering data to achieve the results
on which we operators are building our networks, would be
less strident in their attacks on experimental packets

randy

Robert,

Given the steep penalty, would you do me the favor of forwarding your netblocks so that I can null-route them now, and avoid any risk of trespassing on your network?

Thanks,
Bill.

So, I have to get permission before trying to communicate with any host on
the Internet? Shouldn't this be extended to all forms of communication?
You must have permission before mailing, phoning, or talking to anyone?

I wonder how to even go about that, because to get their permission I
likely would need to communicate with them.

I guess that's one way to squelch the usefulness and rampant growth of the
network.

Tony Rall

Robert Bonomi wrote:

*HOW* is one supposed to tell a 'benign' probe from a 'hostile' one,
when it is addressed to a machine that doesn't exist, or to a 'service'
that doesn't exist on an existent machine?

With all the 'overtly hostile' traffic out there, why on earth would anyone
consider that, with regard to 'unexpected'/'abnormal' traffic, there should
be _any_ 'expectation of innocence'?

Easy, they need to set the evil bit to 0
;)

In article <200408062005.i76K5wtq000971@host122.r-bonomi.com>, Robert Bonomi <bonomi@mail.r-bonomi.com> writes

Because the -only- 'authorized' use is those things which I expressly let past my firewall. Ergo, if the firewall blocks it, it _IS_ an 'unauthorized access' attempt.

Do you publish the firewall rules, so that people can make sure they don't accidentally make unauthorised attempts? Or are they supposed to guess what you allow through? Which would seem a little harsh if the penalty for guessing wrong is going straight to jail.

I echo many of the sentiments expressed already in trashing your response,
and want to add the following:

To the original poster and others: Do host a web server on port 80 of the
machines involved in the probe.

if i think i might be under attack from machine X, i would definitely
not launch a browser to X, especially if i ran internet exploder.

i think it would be useful if the entry one got by

   whois -h whois.(arin|apnic|lacnic|ripe).net X

returned, among the usual bumph, a comment such as

   experiment being conducted, see http://Y/

randy
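The whois suggestion above can be sketched from the complaint handler's side. This is a minimal Python sketch, not a definitive tool: it speaks the plain WHOIS protocol on TCP port 43 and then scans the reply for a notice about an experiment. The helper names (whois_query, experiment_notices) are hypothetical, and the field names "Comment" and "remarks" are an assumption, since each registry formats its records differently.

```python
import socket


def whois_query(server, query, timeout=10.0):
    """Send one query to a WHOIS server (TCP port 43) and return the reply text."""
    with socket.create_connection((server, 43), timeout=timeout) as sock:
        # A WHOIS request is the query string followed by CRLF.
        sock.sendall(query.encode("ascii") + b"\r\n")
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:  # the server closes the connection when it is done
                break
            chunks.append(data)
    return b"".join(chunks).decode("utf-8", errors="replace")


def experiment_notices(whois_text):
    """Return the values of comment-style fields that mention an experiment.

    Assumption: the notice sits in a "Comment:" or "remarks:" field.
    """
    notices = []
    for line in whois_text.splitlines():
        key, sep, value = line.partition(":")
        if not sep:
            continue
        if key.strip().lower() in ("comment", "remarks") and "experiment" in value.lower():
            notices.append(value.strip())
    return notices
```

Usage would look something like experiment_notices(whois_query("whois.arin.net", X)), with X being the probing address. The parsing step is kept separate from the network call so it can be run against saved whois output.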

ip route executive summary null 0

: [[.. $ mount /dev/soapbox # you have been warned. ..]]

Yes...

: *HOW* is one supposed to tell a 'benign' probe from a 'hostile' one,
: when it is addressed to a machine that doesn't exist, or to a 'service'
: that doesn't exist on an existent machine?

Who cares? It's your network. If you don't want the traffic, block it.
Research, malicious, virii, or whatever.

: HOWEVER, that notwithstanding, *EVERY*ONE* gets reported to the responsible
: _network_operator_ -- as an 'apparent virus-infected machine on your network',

Waste of bandwidth. Borders on GWF.

: The reporting is mostly to help the other operators keep _their_ networks
: clean. And to get those machines off-line -- so that they cannot infect

There will never be enough fire in the world to make a lazy netadmin GUOTA
(Get Up Off Their A$$). It's a waste of bandwidth.

: This is one of two _good_ approaches. "Get Permission. *FIRST*"

In the old networks, but not now. That's silly in a globally connected
infrastructure.

: > How do you view the issue of experiments that probe random sites? Should
: > this be accepted as "reasonable", or should it be disallowed? Something
: > in between?

There ain't any better testbed than the real world; test away. If I don't
want them testing my network, I'll stop them. It's my network and I'll do
what I want with what I paid for.

scott

And, especially, make sure that your provider is aware of what you're
doing. Specifically that whoever answers abuse/security@your-domain,
and abuse/security@your-provider knows what you're doing. There will
always be GWFs[1] who send frivolous complaints to you or your provider,
regardless of how benign the traffic is. You ideally want to be in
the situation where your provider's abuse desk blows them off, rather
than anyone expending any more time than it takes to hit delete in
the ticketing system.

Also be very sure that you understand what you're doing, and that it
will not cause others operational problems. Be prepared to apologize,
grovel and possibly offer financial compensation when your screwup
actually does inflict significant costs on someone else. If you're not
convinced enough that you're not going to break other people's systems
that the idea of financial compensation scares you, you shouldn't be
sending the traffic in the first place.

While I can't imagine how any of the legitimate surveys would cause
anyone real operational costs (as opposed to the oversensitive IDS or
anal log reader problems) I have seen systems knocked offline in the
past by a postgrad "research project" that was run with more naive
enthusiasm than technical talent. Heck, the googlebot fell into a lot
of infinite trees and made webservers fall over before they got it
right, back when it was an academic research project.

Cheers,
  Steve

[1] Goober With Firewall. Originally from internal jargon at
    abuse@above.net - a complaint, for example, that "ns1.above.net
    is hackoring my port 53!" would be, and should still be, closed
    with the sole annotation being "GWF".

Gee. If one takes this approach, all research is criminal. The fact is, some
amount of important science and research and some larger amount of silly
research is going on as a result of these probes.

An earlier response stated that a web server should be run on the
transmitting host. This is probably a good idea, although people may not
check it. Another possibility is sending a disclaimer or explanation in the
payload of the transmitted packet, if possible.
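As a rough illustration of the payload idea, here is a minimal Python sketch that puts a plain-text explanation into a UDP probe. Everything specific in it is an assumption made for the example: the notice text, the URL (a placeholder), the 4-byte sequence-number header, and the 512-byte size cap. It only helps if the probe type can carry a payload at all, and only if someone actually looks at the packet contents.

```python
import socket
import struct

# Hypothetical notice; the URL is a placeholder, not a real project page.
NOTICE = b"Research measurement probe. Info and opt-out: http://example.org/probe-info"


def build_probe_payload(seq, notice=NOTICE, max_len=512):
    """Return a payload: 4-byte big-endian sequence number, then the notice text."""
    payload = struct.pack("!I", seq) + notice
    if len(payload) > max_len:
        raise ValueError("payload too large for the chosen probe size")
    return payload


def send_probe(addr, port, seq):
    """Send one UDP probe carrying the notice (assumes an IPv4 target)."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(build_probe_payload(seq), (addr, port))
```

Anyone capturing the packet would then see the explanation in clear text next to the sequence number.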

On a side note, I suggest that Robert forward his complaints to the
appropriate US Attorney for immediate prosecution. I will be waiting, with
bated breath, for the mass of indictments. I'll especially relish the
bespectacled researchers and innocent zombie-attack victims, all doing the
perp walk in unison.

Sadly, this will not come to pass. Robert's interpretation of the law is
somewhat faulty. ICMP packets blocked at his firewall aren't normally
considered unauthorized use, except in the event of a DoS attack. If anyone
has case law that says differently, I'm sure we'd all love to see it.

- Dan

The PTR record for the IP(s) doing the testing is also a good place to
store some suggestive info. I would think most people motivated enough to
file a report would take the time to look up the PTR first or as part of
the complaint process.

1 PTR dont.like.our.packets.call.1-900-....... :)
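For anyone unsure where such a PTR record lives, the following Python sketch computes the reverse-DNS name for an address and, optionally, asks the resolver what it points to. The function names are hypothetical, and 192.0.2.1 is a documentation-only address used purely as an example.

```python
import ipaddress
import socket


def ptr_owner_name(ip):
    """Return the DNS name under which the PTR record for this address is stored."""
    return ipaddress.ip_address(ip).reverse_pointer


def lookup_ptr(ip):
    """Return the host name the PTR record points to, or None if the lookup fails."""
    try:
        return socket.gethostbyaddr(ip)[0]
    except (socket.herror, socket.gaierror):
        return None


print(ptr_owner_name("192.0.2.1"))  # 1.2.0.192.in-addr.arpa
```

A descriptive name returned by lookup_ptr is what a complaint handler would see, provided they do a reverse lookup at all.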

if i typo and enter your ip in my browser, that's illegal?

there's no way to tell the intentions of a received packet.. is it research, is
it hostile, is it an infected machine, is it an error

whilst it's prudent to assume that anything received unexpectedly is malicious,
it doesn't necessarily follow that you should do anything more with it than
discard it.. you should obtain more data before deciding to take action

and whilst you may not like the research community carrying out unauthorized
probes it is these guys who made and maintain the internet's systems, you should
cut them some slack in the name of research!

Steve