Problems sending mail to yahoo?

Joe_Greco · April 13, 2008, 10:58pm

> I would have thought it was obvious, but to see this sort of enlightened
> ignorance(*) suggests that it isn't: The current methods of spam filtering
> require a certain level of opaqueness.

Indeed, that must be the problem.

But then you proceed to suggest:

> So, on one hand, we have the "filtering by heuristics," which require a
> level of opaqueness, because if you respond "567 BODY contained www.sex.com,
> mail blocked" to their mail, you have given the spammer feedback to get
> around the spam.

Giving the spammer feedback?

In the first place, I think s/he/it knows what domain they're using if
they're following bounces at all. Perhaps they have to guess among
whether it was the sender, body string, sending MTA, but really that's
about it and given one of those four often being randomly generated
(sender) and another (sender MTA) deducible by seeing if multiple
sources were blocked on the same email...my arithmetic says you're
down to about two plus or minus.

In many (even most) cases, that is only useful if you're sending a lot of
mail towards a single source, a variable which introduces yet *another*
ambiguity, since volume is certainly a factor in blocking decisions.
Further, if you look at the average mail message, you have domains based
on multiple factors, such as services to do open tracking (1x1/invisible
pixels, etc), branding, and many other reasons that there could be more
than a single domain in a single message. Further, once you're being
blocked, it may be implemented by-IP even though there was some other
metric that triggered the block.

Having records that allow a sender to go back and unilaterally determine
what was amiss may not be considered desirable by the receiving site.

But even that is naive since spammers of the sort anyone should bother
worrying about use massive bot armies numbering O(million) and
generally, and of necessity, use fire and forget sending techniques.

Do you mean to suggest that your definition of "spammer" only includes
senders using massive bot armies? That'd be mostly pill spammers,
phishers, and other really shady operators. There are whole other classes
of spam and spammer.

Perhaps you have no conception of the amount of spam the major
offenders send out. It's on the order of 100B/day, at least.

I have some idea. However, I will concede that my conception of current
spam volumes is based mostly on what I'm able to quantify, which is the
~4-8GB/day of spam we receive here.

That's why you and your aunt bessie and all the people on this list
get the same exact spam. Because they're being sent out in the
hundreds of billions. Per day.

Actually, we see significant variation in spam received per address.

Now, on what exactly do you base your interesting theory that spammers
analyze return codes to improve their techniques for sending through
your own specific (not general) mail blocks? Sure they do some
bayesian scrambling and so forth but that's general and will work on
zillions of sites running spamassassin or similar so that's worthwhile
to them.

I'm sure that if you were to talk to the Postmasters at any major ISP/mail
provider, especially ones like AOL, Hotmail, Yahoo, and Earthlink, that
you would discover that they're familiar with businesses which claim to be
in the business of "enhancing deliverability."

However, what I'm saying was pretty much the inverse of the theory that you
attribute to me: I'm saying that receivers often do NOT provide feedback
detailing the specifics of why a block happened. As a matter of fact, I
think I can say that the most common feedback provided in the mail world
would be notice of listing on a DNS blocking list, and this is primarily
because the default code and examples for implementation usually provide
some feedback about the source (or, at least, source DNSBL) of the block.
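For anyone who hasn't looked at how that DNSBL feedback is produced, here is a minimal sketch. The query-name construction (reversed IPv4 octets prepended to the list's zone) is the usual DNSBL convention; the zone name `dnsbl.example.org`, the function names, and the wording of the rejection line are assumptions made up for illustration, not any particular MTA's defaults.

```python
import ipaddress


def dnsbl_query_name(ip: str, zone: str) -> str:
    """Build the hostname a receiver looks up to test an IPv4 address
    against a DNSBL zone: octets reversed, then the zone appended."""
    addr = ipaddress.IPv4Address(ip)  # raises ValueError on bad input
    reversed_octets = ".".join(reversed(str(addr).split(".")))
    return f"{reversed_octets}.{zone}"


def rejection_text(ip: str, zone: str) -> str:
    """Illustrative rejection line that names the list, but nothing else.
    The exact wording here is hypothetical."""
    return f"550 5.7.1 Service unavailable; client [{ip}] blocked using {zone}"


if __name__ == "__main__":
    # 192.0.2.1 is a documentation address; the zone is a placeholder.
    print(dnsbl_query_name("192.0.2.1", "dnsbl.example.org"))
    print(rejection_text("192.0.2.1", "dnsbl.example.org"))
```

Note that the sender learns which list was consulted, and nothing about content, volume, or complaints.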

You'll see generic guidance such as the Yahoo! error message that started
this thread ("temporarily deferred due to user complaints", IIRC), but
that's not particularly helpful, now, is it? It doesn't tell you which
user, or how many complaints, etc.

But on what, exactly, do you base your interesting theory that if a site
returned "567 BODY contained www.sex.com" that spammers in general and
such that it's worthy of concern would use this information to tune
their efforts?

Because there are businesses out there that claim to do that very sort of
thing, except that they do it by actually sending mail and then checking
canary e-mail boxes on the receiving site to measure effectiveness of their
delivery strategy. Failures result in further tuning.

Being able to simply analyze error messages would result in a huge boost
for their effectiveness, since they would essentially be able to monitor
the deliverability of entire mail runs, rather than assuming that the
deliverability percentage of their canaries, plus any open tracking,
indicated the actual delivery success rate.

I would have expected this to be stunningly obvious to anyone discussing
deliverability.

This is not an existence proof, one example is not sufficient, it has
to be evidence worthy of concern given O(100 billion) spams per day
overwhelmingly sent by botnets which are the actual core of the actual
problem.

No, it doesn't. Don't be silly. There are spammers who are flooding the
system, and hope to get mail through using sheer bulk. These guys don't
care to stick around and listen to the result code. They've got their
infected PC armies with however many hundreds of threads of spam-blasting
goodness they can squeeze out of each, and they're pounding the hell out
of recipients. They have a vested interest in not being easy to track
back, so that's why we get so much fun broken spam with broken payloads.
OBVIOUSLY they're not going to be listening for result codes.

But that doesn't mean that every spammer works that way. There are entire
e-mail service providers based on the principles of sending vast amounts
of non-opt-in email. Spamhaus has a lot of information on the biggest of
these. They exist.

I say you're guessing, and not very convincingly either.

I'm not guessing. Go visit Spamhaus.

> So you have two opaque components to filtering. And senders are
> deliberately left guessing - is the problem REALLY that a mailbox is full,
> or am I getting greylisted in some odd manner?

Except that most sites return some indication that a mailbox is
full. It's just unfortunately in the realm of heuristics.

There are sites that return "mailbox full" for a variety of cases.

But look into popular mailing list software packages (mailman,
majordomo) and you'll see modules for classifying bounce backs
heuristically and automatic list removal (or not if it seems like a
temporary failure, e.g., mailbox full.)

Right. Except that it's quite a bit more complex than that. A typical
E-Mail Service Provider ("ESP") has an extensive system for dealing with
known brokenness at various mailbox providers, and very few ESP's are
willing to drop a subscriber from a list for a single bounce.
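A rough sketch of the kind of bounce handling being discussed, under stated assumptions: replies are classified only by the first digit of the SMTP reply code (2xx delivered, 4xx temporary, 5xx permanent), and an address is dropped only after several permanent failures rather than one. The class name, the threshold of 3, and the sample reply strings are all hypothetical; real list software and ESPs use far more elaborate heuristics than this.

```python
import re

# Three-digit SMTP reply code followed by a space or a continuation hyphen.
REPLY_RE = re.compile(r"^(?P<code>[245]\d\d)[ -]")


def classify_reply(reply: str) -> str:
    """Coarse classification of an SMTP reply line by its first digit."""
    match = REPLY_RE.match(reply)
    if match is None:
        return "unknown"
    return {"2": "delivered", "4": "temporary", "5": "permanent"}[match.group("code")[0]]


class BounceTracker:
    """Counts permanent failures per address; temporary ones are ignored."""

    def __init__(self, max_permanent: int = 3):  # threshold is an assumption
        self.max_permanent = max_permanent
        self.permanent = {}

    def record(self, address: str, reply: str) -> bool:
        """Record one delivery result; return True if the address should be dropped."""
        kind = classify_reply(reply)
        if kind == "delivered":
            self.permanent.pop(address, None)  # a success resets the count
        elif kind == "permanent":
            self.permanent[address] = self.permanent.get(address, 0) + 1
        return self.permanent.get(address, 0) >= self.max_permanent
```

Even this toy version shows why "mailbox full" is a headache: a site that reports it with a 5xx code gets counted as a permanent failure, while one that uses 4xx does not.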

Now, of course, ESP's range from the whitehat (for those who missed it,
Rodney Joffe founded "whitehat.com" a long time ago) to the greys, and
all the way on down to the blackhats. There are certainly a lot of ESP's
that attempt to implement various levels of "opt in" and "permission
based" e-mailing, but there are also those that pretty much spam
unapologetically.

Bounce processing is complicated for them all. Even the blackhats have
significant cause to carefully analyze return codes and try to divine
some greater meaning, because if they get blocked, their delivery rates
go down.

> Filtering stinks. It is resource-intensive, time-consuming, error-prone,
> and pretty much an example of something that is desperately flagging "the
> current e-mail system is failing."

And standardized return codes (for example) will make this worse, how?

Standardized return codes (assuming any meaningful amount of detail was
included) would make it easier for spammers to determine how their mail
was being filtered, and to evade accordingly.

That's a tragedy, because for legitimate senders, it means that they /also/
do not get automatic feedback on what they could do differently.

I *suspect* that avoiding providing too much feedback may be why a certain
percentage of e-mail simply vanishes at certain mailbox providers (cough,
Hotmail, cough).

> You want to define standards? Let's define some standard for establishing
> permission to mail. If we could solve the permission problem, then the
> filtering wouldn't be such a problem, because there wouldn't need to be as
> much (or maybe even any). As a user, I want a way to unambiguously allow
> a specific sender to send me things, "spam" filtering be damned. I also
> want a way to retract that permission, and have the mail flow from that
> sender (or any of their "affiliates") to stop.

Sure, but this is pie in the sky.

Sure.

For starters you'd have to get the spammers to conform which would
almost certainly take a design which was very difficult not to conform
to, it would have to be technologically involuntary. Whitelists are
the closest I can think of but they haven't been very popular and for
good reasons.

Sure. The spammers stand to lose. Given a system where end users can
revoke permission, they know that end users will. The current system,
even at 99% rejection rates, is preferable because they can get through
to a small percentage.

Unfortunately, legitimate senders suffer under the current model.

Anyhow, the entire planet awaits your design.

I didn't say I had a design. Certainly there are solutions to the
problem, but any solution I'm aware of involves paradigm changes of
some sort, changes that apparently few are willing to make.

A set of standardized return codes was carefully chosen by me as
something which could be (other than the standards process itself)
adopted practically overnight and with virtually zero backwards
compatibility problems (oh there'll always be an exception.)

Sure. Anyone could do this. It's trivial. Perhaps there's a reason
that virtually no one implements something like this. (Hm!)

> Right now I've got a solution that allows me to do that, but it requires a
> significant paradigm change, away from single-e-mail-address.

There's nothing new in disposable, single-use addresses (or credit
card numbers for that matter, a different realm) if that's what you
mean but if you have something more clever the world (i.e., the big
round you see when you look down) is your oyster.

I'm currently working towards a model where I deploy an address per site,
which isn't a single-use model by any means. As a matter of fact, it's
a model that allows that address to be "shared" (even abusively) by the
senders, but at the point I decide to revoke permission, permission goes
away for _everyone_ sending to that address. So it _is_ disposable, in
the conventional sense.

It brings the permission control aspect back squarely under my control,
not under some random ESP's decision about whether or not to send to me.

Consider the benefits for deliverability if a major ISP implemented
something like this. Provide a facility for users to be able to get
disposable addresses (preferably ones where the "disposable" portion
could be handled prior to hitting the mail server, i.e. in DNS), and
then guarantee to both users and senders that no mail sent to these
addresses would be subject to spam filtering, rate limits, or other
arbitrary things, on the basis that the subscriber clearly asked for
the material. Revocation of permission would be available to the user,
through the simple process of eliminating the DNS record for that
particular disposable address.
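To make the mechanics concrete, here is a minimal sketch of the revocation idea, with a plain dictionary standing in for the DNS zone. The address layout (`user@label.base-domain`, one label per sender site), the class and method names, and the host `mx.example.net` are assumptions for illustration only; the scheme described above does not specify them.

```python
class DisposableZone:
    """In-memory stand-in for a DNS zone with one record per disposable label."""

    def __init__(self, user: str, base_domain: str):
        self.user = user
        self.base_domain = base_domain
        self.records = {}  # label -> mail host (stands in for an MX record)

    def issue(self, label: str, mail_host: str = "mx.example.net") -> str:
        """Create the record and return the address to hand to the sender."""
        self.records[label] = mail_host
        return f"{self.user}@{label}.{self.base_domain}"

    def revoke(self, label: str) -> None:
        """Remove the record; everyone using that address loses permission."""
        self.records.pop(label, None)

    def accepts(self, address: str) -> bool:
        """True if the address still resolves, i.e. permission is in force."""
        local, _, domain = address.partition("@")
        suffix = "." + self.base_domain
        if local != self.user or not domain.endswith(suffix):
            return False
        return domain[: -len(suffix)] in self.records


if __name__ == "__main__":
    zone = DisposableZone("jg", "mail.example.net")
    addr = zone.issue("bookstore")
    print(addr, zone.accepts(addr))
    zone.revoke("bookstore")
    print(addr, zone.accepts(addr))
```

The point of doing the check in DNS is that mail to a revoked address has nowhere to be delivered, so it never reaches the mail server or its filters.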

Quite frankly, this is almost the scenario that started me on this in
the first place, because I was having such a devil of a time with getting
our anti-spam measures to not trip on invoices and other "legitimate"
stuff that arrives here, much of which is nearly indistinguishable, at the
machine level, from spam.

Despite being a viable solution to a large portion of the e-mail
deliverability puzzle, my best guess is that no ISP actually wants to
incur the cost and support hit of trying to get their users to use such
a system. The current system, where users simply sigh and accept that
they may not get their e-mail, is apparently preferable. It's certainly
easier. Lower the expectations rather than try to fix the problem.

That's fine, but then I'd really like them to be honest about it, and just
admit that they're not so concerned about actually delivering desired mail
as they are about keeping their costs as low as possible (etc.)

> Addressing "standards" of the sort you suggest is relatively meaningless
> in the bigger picture, I think. Nice, but not that important.

Well, first you'd have to indicate that you actually have a view of
the problem which supports such a judgment.

At any rate you're quibbling the example as I forewarned.

But standardizing receiving MTA fail codes is, I suspect, more useful
than you give them credit for. It would be some progress at little to no
cost in the large.

By all means, then. Go ahead. You'll amaze me if you can actually get
this implemented at any major ISP or mailbox provider. It would be nice
for my cold and cynical viewpoints to be disproven, rather than to be
proven as too optimistic.

It deals less with spam filtering and more with effective MTA to MTA
operation.

That's not how the large ISP/mailbox providers will see it.

At least it's sticking to the realm of improving standards in a way
that can be accomplished.

I don't see how I could have given a better example without a lot of
hand-waving and vagaries.

Look, I certainly agree that it'd be *nice*, but there are lots of things
that are *nice* that aren't going to happen. Shall we beat the BCP38
horse any further?

There's a long history of things that would be nice that never come to
pass. I've already written off reliable deliverability at large ISP's as
one of those things. I'm now looking towards solutions to enable reliable
deliverability at smaller sites where principles might still matter enough
that people haven't completely written off e-mail as unusable.

... JG

Barry_Shein1 · April 14, 2008, 12:04am

Massive quoting gets old fast so I'll try to summarize and if I
misrepresent your POV in any way my profuse apologies in advance.

First and foremost let me say that if we had a vote here tomorrow on
the spam problem I suspect you'd win but that's because most people,
even (especially) people who believe themselves to be technically
knowledgeable, hold a lot of misconceptions about spam. So much for
democracy.

I say the core problem in spam are the botnets capable of delivering
on the order of 100 billion msgs/day.

You say there are other kinds of spammers.

I'll agree but if we got rid of or incapacitated the massive botnets
that would be a trickle, manageable, and hardly be worth fussing
about, particularly on an operational list.

The reason is that without the botnets the spammers don't have address
mobility. You could just block their servers.

But if we don't agree on those points then we're talking past each
other.

I assert that the problem is the massive O(100B) botnet spammers and
they simply don't have the resources or interest really (because they
don't have the resources or business model) to do things like analyze
return codes etc as you describe.

So it's doubtful to me that returning more meaningful return codes in
SMTP rejections would be of much use to them.

It's also not of much use to them, as I previously described, even if
they tried. They could deduce about the same information for about the
same "price" without the return codes.

But any such return codes should be voluntary, particularly the
details, and a receiving MTA should be free to respond with as much or
as little information as they are comfortable with right down to the
big red button, "421 it just ain't happenin' bub!"

But it was just an example of how perhaps some standards, particularly
regarding mail rejection, might help operationally. I'm not pushing
the particular example I gave of extending status codes.
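As a rough illustration of "as much or as little information as they are comfortable with", here is a sketch in which the receiving site picks a detail level and the reply text is built accordingly. The three levels, the function name, and the reply wording are hypothetical; no existing standard or MTA is being described.

```python
from enum import Enum


class Detail(Enum):
    NONE = 0      # the "big red button": refuse, say nothing more
    CATEGORY = 1  # name the broad reason only
    FULL = 2      # name the reason and the specifics


def rejection(code: int, category: str, specifics: str, detail: Detail) -> str:
    """Format a rejection line at the detail level the receiver has chosen."""
    if detail is Detail.NONE:
        return f"{code} Message refused"
    if detail is Detail.CATEGORY:
        return f"{code} Message refused ({category})"
    return f"{code} Message refused ({category}: {specifics})"


if __name__ == "__main__":
    for level in Detail:
        print(rejection(550, "content", "URL in body", level))
```

A sender's software could parse the category when it is present and fall back to treating the reply as an opaque refusal when it is not.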

Also, again I can't claim to know what you're working on, but there
are quite a few "disposable" address systems in production which use
various variations such as one per sender, one per message, change it
only when you want to, etc. But maybe you have something better, I
encourage you to pursue your vision.

And, finally, one quote:

> I didn't say I had a design. Certainly there are solutions to the
> problem, but any solution I'm aware of involves paradigm changes of
> some sort, changes that apparently few are willing to make.

Gosh if you know of any FUSSP* whose only problem is that it requires
everyone on the internet to abandon SMTP entirely or similar by all
means share it.

Unfortunately this is a common hand-wave, "oh we could get rid of spam
overnight but it would require changes to (SMTP, usually) which would
take a decade or more to implement, if at all!"

Well, since it's already BEEN a decade or more that we've all been
fussing about spam in a big way maybe we should have listened to
people with a secret plan to end the war back in 1998. So I'm here to
tell ya I'll listen to it now and I suspect so will a lot of others.

* FUSSP - Final and Ultimate Solution to the Spam Problem.

Steve_Atkins · April 14, 2008, 12:27am

Address mobility doesn't buy you that much. It's relatively easy to mechanically
detect, and block, IP addresses that source mail solely from spam-related
botnets. (Not easy in the absolute sense, but easier than other problems
and, mostly, a solved one). Botnet sourced mail generally doesn't get
seen much by recipients at ISPs with competent spam filtering. It sure can
cause other operational problems, but in terms of being a "spam problem"
it's not the biggest one out there.

Blocking unwanted mail from sources that send a mixture of wanted
and unwanted mail, while still allowing the wanted mail through is
extremely difficult, and a much, much harder problem for spam
mitigation to solve. And those are primarily the non-botnet sources.

Spam filtering at real ISPs with real recipients has to deal with the
fact that recipients do want to read some of the mail they're sent
from Gmail, Yahoo Groups, Topica and suchlike.

Cheers,
Steve

Rich_Kulawiec · April 14, 2008, 3:48am

A number of things that are true, including:

> I say the core problem in spam are the botnets capable of delivering
> on the order of 100 billion msgs/day.

But I say the core problem is deeper. Spam is merely a symptom of an
underlying problem. (I'll admit that I often use the phrase "spam
problem" but that's somewhat misleading.)

The problem is pervasive poor security. Those botnets would not exist
were it not for nearly-ubiquitous deployment of an operating system that
cannot be secured -- and we know this because we've seen its own vendor
repeatedly try and repeatedly fail. But a miserable excuse for an OS is
just one of the causes; others have been covered by essays like Marcus
Ranum's "Six Dumbest Ideas in Security", so I won't attempt to enumerate
them all.

That underlying security problem gives us many symptoms: spam, phishing,
typosquatting, DDoS attacks, adware, spyware, viruses, worms, data
loss incidents, web site defacements, search engine gaming, DNS cache
poisoning, and a long list of others. Dealing with symptoms is good:
it makes the patient feel better. But it shouldn't be confused with
treatment of the disease. Even if we could snap our fingers and stop
all spam permanently tomorrow (a) it wouldn't do us much good and
(b) some other symptom would evolve to fill its niche in the abuse ecosystem.

A secondary point that actually might be more important:

We (and I really do mean "we" because I've had a hand in this too)
have compounded our problems by our collective response -- summed up
beautifully on this very mailing list a while back thusly:

  If you give people the means to hurt you, and they do it, and
  you take no action except to continue giving them the means to
  hurt you, and they take no action except to keep hurting you,
  then one of the ways you can describe the situation is "it isn't
  scaling well".
    --- Paul Vixie on NANOG

We need to hold ourselves accountable for the security problems in
our own operations, and then we need to hold each other accountable.
This is very different from our strategy to date -- which, I submit,
has thoroughly proven itself to be a colossal failure.

---Rsk

Greg_Skinner1 · April 14, 2008, 5:11am

A number of things that are true, including:

> I say the core problem in spam are the botnets capable of delivering
> on the order of 100 billion msgs/day.

But I say the core problem is deeper. Spam is merely a symptom of an
underlying problem. (I'll admit that I often use the phrase "spam
problem" but that's somewhat misleading.)

The problem is pervasive poor security. Those botnets would not exist
were it not for nearly-ubiquitous deployment of an operating system that
cannot be secured -- and we know this because we've seen its own vendor
repeatedly try and repeatedly fail. But a miserable excuse for an OS is
just one of the causes; others have been covered by essays like Marcus
Ranum's "Six Dumbest Ideas in Security", so I won't attempt to enumerate
them all.

Is there a (nontrivial) OS that can be secured inexpensively, ie. for
the price that is paid for by shoppers at your local big box outlet?
To me, that's as much the problem as anything else that's been written
so far. The Internet is what it is largely because that is what the
users (collectively) will pay for. Furthermore, it's not so much the
OS as it is the applications, which arguably might be more securable
if Joe and Jane User took the time to enable the security features
that are available for the OSes they buy. But that doesn't happen. I
don't blame Joe and Jane User; most nontechnical people do not view
their home or work systems as something more than an appliance for
getting work done or personal entertainment.

A secondary point that actually might be more important:

We (and I really do mean "we" because I've had a hand in this too)
have compounded our problems by our collective response -- summed up
beautifully on this very mailing list a while back thusly:

  If you give people the means to hurt you, and they do it, and
  you take no action except to continue giving them the means to
  hurt you, and they take no action except to keep hurting you,
  then one of the ways you can describe the situation is "it isn't
  scaling well".
    --- Paul Vixie on NANOG

We need to hold ourselves accountable for the security problems in
our own operations, and then we need to hold each other accountable.
This is very different from our strategy to date -- which, I submit,
has thoroughly proven itself to be a colossal failure.

One of the things I like about this list is that it consists of people
and organizations who DO hold themselves accountable. But as long as
it's not the collective will of the Internet to operate securely, not
much will change.

--gregbo

Bandy_Rush1 · April 14, 2008, 6:31am

> if we got rid of or incapacitated the massive botnets that would be a
> trickle, manageable, and hardly be worth fussing about, particularly
> on an operational list.

this presumes non-inventive spammers, which i fear is not the case. but
it sure would be a good place to start

randy