Potential downside to using (very) old domain as spam trap.

Hi,

I've recently been delegated the domain of a dead ISP which hasn't existed
in *any* form for 5+ years. As a test, we set up an MX for it to see what
kind of mail it would get, since we had noticed a lot of DNS lookups for it.

After going through a few hundred emails it started to look like the
domain might be good fodder for a blacklist. After running the mail through
SpamAssassin and a couple of other tools, we couldn't find a single legit
email.

I've seen people put spamtraps on web pages and at the bottom of emails to
use as blacklist fodder but not a whole domain.

I suppose more rigorous testing could be done to make sure no legit email
is being sent to the domain, but I have a strong feeling that it is very,
very dead. (It even expired at one point and was available from a
registrar.)

Is this done? Advisable? Experiences?

Thanks in advance,
David Ulevitch

> I've seen people put spamtraps on web pages and at the bottom of emails to
> use as blacklist fodder but not a whole domain.
> ...
> Is this done? Advisable? Experiences?

cix.net, which has been dead for a few years, gets about 50 messages a day
on its MX. the majority is spam, but there's always a handful of messages
from people who mistype "cox.net" as "cix.net", either when sending mail or
when signing themselves up for services (who usually don't do verification).

spamtrapping legitimate personal e-mail that happens to have a mistyped
destination address seems antisocial. but with cox.net's population
numbering in the apparent millions, error theory tells us how many such
mistyped destination addresses to expect. (and it's true in this case.)

therefore before you use whole-domain spamtrapping, i recommend looking VERY
carefully at the flows so that you can be sure that "i" isn't adjacent to
"o" on the qwerty keyboard, or some other such problem.
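
A rough way to run that keyboard check before committing a domain to a trap is sketched below. It is only an illustration under stated assumptions: the function names are hypothetical, the list of live domains is something you would have to supply yourself, and it only looks at single-character substitutions between keys that sit next to each other on the same qwerty row. It ignores diagonal neighbours, dropped or doubled letters, and transpositions, all of which also produce mistyped addresses.

```python
# Hypothetical helper: flags a trap domain that is one adjacent-key
# substitution away from a live domain. Same-row neighbours only.
QWERTY_ROWS = ["qwertyuiop", "asdfghjkl", "zxcvbnm"]


def adjacent_keys(ch):
    """Return the keys immediately left and right of ch on its row."""
    for row in QWERTY_ROWS:
        i = row.find(ch)
        if i != -1:
            return set(row[max(i - 1, 0):i] + row[i + 1:i + 2])
    return set()


def is_adjacent_typo(trap, live):
    """True if trap differs from live by exactly one adjacent-key slip."""
    if len(trap) != len(live):
        return False
    diffs = [(a, b) for a, b in zip(trap, live) if a != b]
    return len(diffs) == 1 and diffs[0][0] in adjacent_keys(diffs[0][1])


def risky_neighbours(trap, live_domains):
    """List the live domains that the trap domain could be a typo of."""
    return [d for d in live_domains if is_adjacent_typo(trap, d)]
```

With the example from this thread, `risky_neighbours("cix.net", ["cox.net", "aol.com"])` picks out `cox.net`. A hit does not say how much misdirected mail to expect; that still depends on how many users the live domain has, so the flows need to be looked at either way.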

Paul Vixie wrote:

> therefore before you use whole-domain spamtrapping, i recommend looking VERY
> carefully at the flows so that you can be sure that "i" isn't adjacent to
> "o" on the qwerty keyboard, or some other such problem.

Agreed. But I'll mention a situation where it's very valuable and show some of the pitfalls found through first-hand experience with doing it.

We decommissioned some of our domains about 3 years ago, as we transitioned to our current one.

At the time that these domains were decommissioned (de-MX'd), we were catching and tossing 60,000-70,000 spams per day.

18 months later, as an experiment, I re-enabled the domains.

First day: 600,000 spams per day. In the months since then it has grown to 2.5 million per day. We use this as a spamtrap.

The immediate temptation is to directly feed blacklists of some sort.

But:

1) A significant fraction (varies from 5-30%) is NDRs from innocent sites for spam forged with return addresses in our spamtrap.

[If <user@domain> is being spammed, chances are that it's being forged in spam too.]

2) A significant fraction is virus/worm attacks from people with very old addresses in their address books.

3) A significant fraction is email from otherwise legitimate sites who have a spam problem, so at the very least you have to compare traffic levels between spamtrap and non-spamtrap. Case in point: MSN's DAV servers while they were scriptable. The ratios were absurdly bad (like 5000:1), and we did end up blacklisting the DAV sites on and off over the past 3 months or so.

4) There is a growing fraction that turns out to be RCPT TO verifications from sendmail configurations (of spam forged with spamtrap domain names). I think this is dangerous (tripping harvesting detectors, for example), but it's a fairly effective heuristic in its own right.

5) There is a significant fraction of "autoresponders" responding to forged spam.

While much of this is detectable and you can remove it from the analysis, you generally need to apply additional heuristics beyond just the spamtrap.
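
As a rough illustration of the detectable part, the sketch below sorts a spamtrap hit into "bounce", "autoreply" or "candidate" before anything is counted against the sending host. It is not our production filter; the function name and labels are made up for the example. It assumes you have the SMTP envelope sender alongside the raw message, and it relies on NDRs using the null reverse-path `<>` or a `multipart/report` body, and on autoresponders setting `Auto-Submitted` or `Precedence: auto_reply`. Plenty of real autoresponders set neither, viruses need separate detection, and RCPT TO verifications never reach the DATA stage, so none of those are covered here.

```python
from email import message_from_string


def classify(envelope_sender, raw_message):
    """Label a spamtrap hit as 'bounce', 'autoreply' or 'candidate'."""
    msg = message_from_string(raw_message)

    # NDRs are normally sent with the null reverse-path, and well-formed
    # ones carry a multipart/report body.
    if envelope_sender.strip() in ("", "<>"):
        return "bounce"
    if msg.get_content_type() == "multipart/report":
        return "bounce"

    # Automatic replies that follow the Auto-Submitted convention use any
    # value other than "no"; some older software uses Precedence instead.
    auto = (msg.get("Auto-Submitted") or "no").strip().lower()
    precedence = (msg.get("Precedence") or "").strip().lower()
    if auto != "no" or precedence == "auto_reply":
        return "autoreply"

    # Everything else still needs the further heuristics before it is
    # treated as evidence against the sending host.
    return "candidate"
```

Only the "candidate" bucket should feed any later scoring, and even that bucket still contains mail from legitimate sites with a spam problem.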

We try to filter out bounces/viruses, compute ratios of bad:good and look for confirmation from other third-party blacklists that we're unwilling or unable to use directly. It's also fodder for additional analysis that detects open relays, proxies and trojaned boxes.
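
The bad:good ratio can be sketched along these lines. This is a minimal illustration, not what we actually run: the function names, the `min_trap` floor and the `threshold` value are all hypothetical, and the inputs are assumed to be plain lists of source identifiers (IP addresses, say), one entry per message, taken after bounces and viruses have been filtered out.

```python
from collections import Counter


def trap_ratios(trap_hits, normal_hits, min_trap=100):
    """Return {source: spamtrap/non-spamtrap ratio} for busy sources.

    trap_hits and normal_hits are iterables of source identifiers, one
    per message. Sources with fewer than min_trap spamtrap hits are
    skipped, since a handful of strays says very little.
    """
    trap = Counter(trap_hits)
    normal = Counter(normal_hits)
    ratios = {}
    for source, bad in trap.items():
        if bad < min_trap:
            continue
        good = normal.get(source, 0)
        # Treat "no good mail seen" as 1 to avoid dividing by zero.
        ratios[source] = bad / max(good, 1)
    return ratios


def suspects(ratios, threshold=50.0):
    """Sources whose ratio meets the (hypothetical) threshold, sorted."""
    return sorted(s for s, r in ratios.items() if r >= threshold)
```

A source that only ever hits the spamtrap gets a very high ratio, while a large legitimate site with a modest spam problem stays low. Where to put the threshold is a judgment call, and a high ratio is better treated as a prompt to check the other sources than as an automatic listing.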