RE: ISP wants to stop outgoing web based spam

Jeroen_Massar1 · August 9, 2006, 1:59pm

You mean Captcha (http://en.wikipedia.org/wiki/Captcha)

Which is not so much of an issue:
http://sam.zoy.org/pwntcha/

Otherwise simply set up a resource that people want to access (always the
best example on the internet: a pr0n site), present the image there,
and let them answer it for you.

Hmm maybe I should look into hooking pwntcha into SA.

Greets,
Jeroen

(who now will receive another gateway@blogger.com response that it
doesn't understand multipart/signed messages.... can some
nanog-list-admin remove that crappy thing?)

Matthew_Black · August 9, 2006, 3:44pm

Use of "captchas" has serious accessibility issues:
visually-impaired users will have trouble completing forms.
From a legal standpoint, this is a no-go and most definitely
not possible for any government or public-sector agency in
the United States. Several web accessibility regulations
prohibit impairments.

matthew black
network services
california state university, long beach
1250 bellflower boulevard
long beach, ca 90840-0101

Barry_Shein1 · August 9, 2006, 6:15pm

I think what was being talked about was that a lot of spam now comes
as embedded images which unpack into ads for the usual stuff. It's
actually been going on for a few years but I guess as the other stuff
gets more and more effectively blocked this form becomes more salient.

Thus far I don't know of any good filter for these.

Common spam software seems to rotate or vary these slightly so it's
not as simple as comparing to one you've seen before. Since the image
formats are compressed, usually gif, tiny changes can ripple through
the entire encoding.
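
A minimal sketch of why a byte-level "seen it before" comparison breaks down, assuming a synthetic 32x32 grayscale image and using zlib (DEFLATE) as a stand-in for GIF's LZW compression. The image data and helper names are invented for illustration and are not taken from any real spam sample or filter:

```python
import hashlib
import zlib

WIDTH, HEIGHT = 32, 32

def make_image():
    # Synthetic grayscale "ad": a simple gradient pattern (illustrative only).
    return bytearray((x * 7 + y * 13) % 256
                     for y in range(HEIGHT) for x in range(WIDTH))

def fingerprint(pixels):
    # Hash of the compressed bytes, i.e. what a naive exact-match filter compares.
    return hashlib.sha256(zlib.compress(bytes(pixels))).hexdigest()

original = make_image()
variant = bytearray(original)
variant[5] ^= 1  # flip the lowest bit of a single pixel

changed_pixels = sum(a != b for a, b in zip(original, variant))
same_fingerprint = fingerprint(original) == fingerprint(variant)

print(f"pixels changed: {changed_pixels} of {len(original)}")
print(f"fingerprints match: {same_fingerprint}")
```

One altered pixel is enough to give the compressed file a different fingerprint, so an exact-match blocklist would need a separate entry for every variation the spam software emits.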

David_Andersen · August 9, 2006, 6:24pm

Now we'll have to throw our inbound email through an OCR.

Then the spammers will start rotating the text or changing the background.

So we'll write a better OCR that can see through such transformations.

At which point, the spammers will be happy, because we'll have given them a tool to break Captchas.

Hmmm...

(Or just reject mail with images in it.)

-Dave

Paul_Jakma · August 9, 2006, 10:22pm

Ditto for at least one EU jurisdiction, and likely several more of them.

I can't quite remember if there already is a directive issued, but there definitely was/is an EU working group looking at a variety of equality issues.

In Ireland, captchas would likely contravene the Equal Status Act of 2000 with respect to providing services, which applies to *all* persons and bodies. I believe the UK may have similar legislation in force (though I can't recall the name of the act).

Turing tests can /easily/ be implemented in ASCII, which is compatible with screen readers used by the visually impaired.

regards,

Paul_Jakma · August 10, 2006, 12:14am

Just ask the user some basic question. E.g.:

What is 2 added to 23?: <textbox>

regards,

Simon_Waters1 · August 10, 2006, 7:51am

I've no doubt some captcha can be invented in ASCII, but this isn't it. AI
already substantially outperforms all but a small minority of humans on
mathematical-style IQ tests (they were over 160 when I was a kid), and it
would be relatively trivial to code it to handle the types of questions for
this kind of test.

It would work for a minority use. Indeed I've already used a BBS that expected
you to understand about factoring numbers or some such question on joining.

Something requiring real world knowledge would be better, but it is very hard
to automatically generate questions (and answers), that can't be
automatically answered. And remember in most cases you need questions that
are consistently hard, as the machines won't get bored retrying. If you
generate them manually (at least the first time one is encountered). Visual
noise (and auditory noise) is something we are consistently good at
removing, and machines are still playing catch-up. But then some of the
automated captcha solvers aren't that much worse than a lot of people.

On the upside such captchas might spark more research into AI, as whilst
recognising badly mangled images of text is kind of useful for the post
office and other handwriting recognition, it has limited applications
elsewhere.

Paul_Jakma · August 16, 2006, 12:13am

I've no doubt some captcha can be invented in ASCII, but this isn't it.

'tis. It works for at least one blog platform, where I've never once had comment spam.

a kid), and it would be relatively trivial to code it to handle the types of questions for this kind of test.

Sure, so change the questions.

The ultimate "captcha defeating AI" is already in-use by spammers by the way - humans (get humans to "solve" captchas in return for some reward, e.g. porn). ASCII or image matters not a jot to those.

ASCII captchas are no less effective than image captchas, just without the nasty "ban the blind from the internet!" side-effects.

regards,

Matthew_Sullivan · August 16, 2006, 1:56am

Paul Jakma wrote:

ASCII captchas are no less effective than image captchas, just without the nasty "ban the blind from the internet!" side-effects.

Then again, you have Authen::Captcha, which has sound-based captchas as well....

/ Mat

Simon_Waters1 · August 16, 2006, 8:21am

You snipped the bit where I said "It would work for a minority use."

I'm sure it works fine for just you, but it doesn't scale, so the folks at
Nanog probably don't care.

The reason people use image recognition is it is something (most) humans find
very easy, but requires considerable investment of effort (or resource for
self training) to teach computers, and readily permits of variations ('click
the kitten' being a good example).

For a demonstration of bashing at ASCII captchas try any good chat bot.

I asked the online bot at ellaz.com your question:

"What is 2 added to 23?"

Ellaz replied:

"I can tell you that 2, plus 23, is equal to 25"

I hope your parser can recognise that as a valid answer, otherwise you'll have
trouble with humans failing the test. Although for blog comments, excluding
stupid, or overly verbose humans may not be a bad idea, I just get the
feeling some days I'd never get to comment on anyone's blog.
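
A rough sketch of what a forgiving answer parser might look like, assuming the expected answer is an integer and that a verbose reply restates the question first and gives the result last. The function names and the "last integer wins" rule are assumptions for illustration, not anything the blog platform in question is known to do:

```python
import re

def extract_answer(reply):
    # Assumption: the last integer in the reply is the intended answer.
    numbers = re.findall(r"-?\d+", reply)
    return int(numbers[-1]) if numbers else None

def is_correct(reply, expected):
    return extract_answer(reply) == expected

terse = is_correct("25", 25)
verbose = is_correct("I can tell you that 2, plus 23, is equal to 25", 25)
wrong = is_correct("about 26, I think", 25)
blank = is_correct("no idea", 25)

print(terse, verbose, wrong, blank)
```

Note that the same leniency that lets a chatty human through also accepts the bot's reply quoted above, so loosening the parser does nothing to tell the two apart.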

I thought maybe spice it up a little;

Simon: "What is the square root of -1?"
Ellaz: "Hey Hey! You cannot take the square root of a negative number. That
gives an imaginary number, and I don't go there."

(Spot the canned response).

Shucks. Unfortunately Ellaz bot isn't terribly good at non-maths questions,
but I think it makes the point well enough.

The reason no one defeated your text captcha was probably because no one
tried, but that won't remain the case if it gets popular. We are locked in
another arms race here. At the moment greylisting kills most of your email
spam, and any captcha (even ones for which programs exist that
score better than humans) will kill most of your blog spam, but don't expect
them to last as a defence, just as greylisting is slowly crumbling. The real
solution is to break the monoculture, and have more security at the leaf
nodes, but someone already started that thread (again).

Although possibly the mistake is to assume you can distinguish between humans,
and computers on the basis of intelligence. It isn't reliably possible to do
this yet, but give it a few years and you'll know that if a site asks for all
the integer solutions of a given quintic equation, it is probably not that
interested in comments from apes, except perhaps the most exceptional apes.

Richard_A_Steenbegen · August 16, 2006, 8:32am

How many CAPTCHA tests can a human making minimum wage complete in an
hour? Ask the post office people who input handwritten zipcodes.

A tougher question might be, what does any of this have to do with NANOG?

Paul_Jakma · August 16, 2006, 5:51pm

You snipped the bit where I said "It would work for a minority use."

Sorry, don't think that is relevant really - at least I have no data on what minority uses are for captchas, nor majority uses or what the difference is.

The reason people use image recognition is it is something (most) humans find very easy, but requires considerable investment of effort (or resource for self training) to teach computers, and readily permits of variations ('click the kitten' being a good example).

Those need vast numbers of "kitten" pictures in order to be immune to dictionary attacks. There's a reason 'captchas' consist of auto-generated images of letters.

You can auto-generate questions too, obviously. With dictionaries of question/answer tuples associated with some template question language.

The tuples can be auto-generated, the strength lies in the variety of the question forms in use across the internet and/or across a site. The questions need not use language, they could be based on ASCII pattern matching, e.g.:

oAwoZwoLwoC

what's the next letter, etc..
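
A minimal sketch of template-driven question generation along these lines, assuming simple arithmetic as the question domain. The templates, the operation table, and the number ranges are all hypothetical; the point is only that each site could plug in its own forms:

```python
import operator
import random

# Hypothetical phrase -> operation table; a site would extend or replace this.
OPERATIONS = {
    "added to": operator.add,
    "multiplied by": operator.mul,
}

# Hypothetical question forms; variety across sites is where the strength lies.
TEMPLATES = [
    "What is {a} {op} {b}?",
    "Work out {a} {op} {b}.",
    "{a} {op} {b} gives which number?",
]

def make_question(rng):
    # Returns a (question text, expected answer) tuple.
    a, b = rng.randint(2, 30), rng.randint(2, 30)
    phrase, func = rng.choice(list(OPERATIONS.items()))
    text = rng.choice(TEMPLATES).format(a=a, op=phrase, b=b)
    return text, func(a, b)

rng = random.Random(2006)  # fixed seed only to make the sketch repeatable
question, answer = make_question(rng)
print(question)
```

The generated tuple gives the form both the text to display and the value to check the submission against.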

Or you could simply test people on their ability to google perhaps?

For a demonstration of bashing at ASCII captchas try any good chat bot.

And for image captchas, see:

http://www.cs.sfu.ca/~mori/research/gimpy/

and there are more. CAPTCHAs are, almost by definition, compelling problems for academia to tackle ;).

The reason no one defeated your text captcha was probably because no one tried, but that won't remain the case if it gets popular. We are locked in another arms race here.

Yes, that applies regardless of the form of the captcha.

Although possibly the mistake is to assume you can distinguish between humans, and computers on the basis of intelligence.

Maybe so.

regards,