Spam filtering BCPs [was Re: Open Letter to D-Link about their NTP vandalism]

You can reject right after DATA, at the <CR><LF>.<CR><LF> stage, before QUIT

That's still an inline SMTP reject rather than an accept + bounce DSN.

Exim with the spamassassin patches (sa-exim) does this, for example.

-srs

SpamAssassin support is built in to Exim since version 4.50.

Tony.

Suresh Ramasubramanian wrote:

Matthew Sullivan wrote:

Suresh Ramasubramanian wrote:

Are you suggesting that we configure our e-mail servers to notify
people upon automatic deletion of spam? Frequently, spam cannot be
properly identified until closure of the SMTP conversation and that
final 200 MESSAGE ACCEPTED...or do you think that TCP/IP connection
should be held open until the message can be scanned for spam and
viruses just so we can give a 550 MESSAGE REJECTED error instead of
silently dropping it?

You can reject right after DATA, at the <CR><LF>.<CR><LF> stage, before QUIT

That's still an inline SMTP reject rather than an accept + bounce DSN.

Exim with the spamassassin patches (sa-exim) does this, for example.

-srs
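
A minimal sketch of the point above, assuming a toy receiver rather than Exim or any real MTA: the verdict is given in the reply to the final <CR><LF>.<CR><LF>, so the sending side hears about the rejection in-session and no bounce DSN is ever generated. The names run_session and looks_like_spam and the hostname mx.example.org are made up for illustration.

```python
from typing import Callable, List


def run_session(lines: List[bytes],
                looks_like_spam: Callable[[bytes], bool]) -> List[str]:
    """Feed client lines (CRLF already stripped) to a toy SMTP receiver.

    Returns the replies the receiver would send, in order.
    """
    replies = ["220 mx.example.org ESMTP"]  # hypothetical greeting
    in_data = False
    body: List[bytes] = []
    for line in lines:
        if in_data:
            if line == b".":
                # End of DATA: decide here, before anything is accepted.
                in_data = False
                message = b"\r\n".join(body)
                if looks_like_spam(message):
                    replies.append("550 5.7.1 Message rejected as spam")
                else:
                    replies.append("250 2.0.0 OK: queued")
                body = []
            else:
                # Undo dot-stuffing: a leading "." was added by the client.
                body.append(line[1:] if line.startswith(b".") else line)
            continue
        verb = line.split(b" ", 1)[0].upper()
        if verb in (b"HELO", b"EHLO", b"MAIL", b"RCPT"):
            replies.append("250 OK")
        elif verb == b"DATA":
            in_data = True
            replies.append("354 End data with <CR><LF>.<CR><LF>")
        elif verb == b"QUIT":
            replies.append("221 Bye")
            break
        else:
            replies.append("500 Command not recognized")
    return replies
```

A real deployment would plug SpamAssassin or a similar scanner in where looks_like_spam is called; the sketch only shows where in the conversation the decision lands.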

Of course Postfix can be set up (using spampd) with spamassassin to do exactly the same.

I believe Sendmail+MimeDefang+Spamassassin will also reject inline if set to do so.

Regards,

Mat

As will sendmail+spamass-milter+spamassassin

In fact there are quite a few milters that can be used in between sendmail and spamassassin

Joe

Several people kindly contacted me off list with laborious
explanations of how to implement delayed 550 rejections using
sendmail, et al. We gave up sendmail years ago in favor of a
competing solution.

I haven't seen any succinct justification for providing a
550 message rejection for positively-identified spam versus
silently dropping the message. Lots of how-to instructions
but no whys.

matthew black
california state university, long beach

For viruses - fine. But you are not going to find any spam filter in
the world that doesn't have false positives. And in such cases it's
always a good idea to let the sender know his email didn't get through.

Like for example - you see a large webmail provider whose hosts and
domains keep getting forged into spam, misread the headers and block
that provider. In such cases, it's your users who aren't getting a lot
of valid email from their friends and relatives who are using that
provider, and 550'ing instead of trashing email saves the senders, and
their provider, quite a lot of time that'd otherwise be spent
troubleshooting the issue.

Plus, 5xx smtp rejects tend to save your bandwidth a bit compared to
accepting the entire email (not that it matters on a small university
domain where your userbase is going to be fairly small, and bandwidth
available quite generous .. but for larger sites, or sites with
bandwidth issues, that's definitely a concern)

  --srs

If you are wrong about the message being spam, then the sender gets a
bounce.

Tony.

I haven't seen any succinct justification for providing a
550 message rejection for positively-identified spam versus
silently dropping the message. Lots of how-to instructions
but no whys.

For viruses - fine. But you are not going to find any spam filter in
the world that doesn't have false positives. And in such cases it's
always a good idea to let the sender know his email didn't get through.

Agreed, but we're willing to live with an error rate of less
than one in a million. This isn't a space shuttle. I don't think
the USPS can claim 99.9999% delivery accuracy. Nonetheless, to
allay worries, we are considering spam quarantines to allow
recipients an opportunity to review spam messages themselves, much
like Yahoo! Mail.

Complaints about e-mail not getting through won't be solved
with a 550 versus silently dropping spam because most users aren't
willing to sift through e-mail errors to find the specific cause
for delivery failure. Members of this list are a rare exception.

Like for example - you see a large webmail provider whose hosts and
domains keep getting forged into spam, misread the headers and block
that provider. In such cases, it's your users who aren't getting a lot
of valid email from their friends and relatives who are using that
provider, and 550'ing instead of trashing email saves the senders, and
their provider, quite a lot of time that'd otherwise be spent
troubleshooting the issue.

Plus, 5xx smtp rejects tend to save your bandwidth a bit compared to
accepting the entire email (not that it matters on a small university
domain where your userbase is going to be fairly small, and bandwidth
available quite generous .. but for larger sites, or sites with
bandwidth issues, that's definitely a concern)

We already reject most connections with a 550 or TCP REFUSE
based on reputation filtering and blacklists, et al.

Where is the bandwidth savings once we've accepted an entire message,
scanned it, determined it was spam, then provided a 550 rejection
versus silently dropping?

matthew black
california state university, long beach

Agreed, but we're willing to live with an error rate of less
than one in a million. This isn't a space shuttle. I don't think
the USPS can claim 99.9999% delivery accuracy. Nonetheless, to

I'm not even saying five nines. Spam filtering - even with heuristics
etc - is less than perfect, and per user spam filtering, however idiot
proof, sometimes turns out to be like giving Acme Inc gadgets to Wile
E Coyote. [users having fun with procmail and .forwards should
already be a familiar story I guess?]

We already reject most connections with a 550 or TCP REFUSE
based on reputation filtering and blacklists, et al.

That works just fine. I don't have any argument with it.

Where is the bandwidth savings once we've accepted an entire message,
scanned it, determined it was spam, then provided a 550 rejection
versus silently dropping?

If you can scan it inline, you can stop, issue a 550 and drop the SMTP
connection any time you want. Like for example, midstream when you
discover a fake header pattern.

You'd start with whatever can be rejected in session - fake HELOs,
blocklist listed IPs, random faked headers, dodgy attachment types
that are more likely to be viruses than not

Then apply the heavier and more cpu intensive filters later, on a much
smaller volume of spam

Maybe not all that much of a bandwidth / cpu saving, but saving remote
postmasters the hassle of troubleshooting lost email is always a good
idea.
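
The ordering described above (cheap in-session checks first, heavier content filters last) might look roughly like this. It is only a sketch under assumed names: the check functions, BLOCKLIST, RISKY_EXTENSIONS and the dict-based message are all hypothetical stand-ins, not any particular MTA's interface.

```python
from typing import Callable, Dict, List, Optional, Tuple

BLOCKLIST = {"192.0.2.66"}                  # stand-in for a DNSBL lookup
RISKY_EXTENSIONS = (".exe", ".scr", ".pif")  # illustrative, not exhaustive


def check_helo(msg: Dict) -> Optional[str]:
    # Cheapest test: a HELO name without a dot is treated as fake here.
    return None if "." in msg.get("helo", "") else "550 5.7.1 Invalid HELO name"


def check_blocklist(msg: Dict) -> Optional[str]:
    return "550 5.7.1 Sending IP is blocklisted" if msg.get("ip") in BLOCKLIST else None


def check_attachments(msg: Dict) -> Optional[str]:
    for name in msg.get("attachments", []):
        if name.lower().endswith(RISKY_EXTENSIONS):
            return "550 5.7.1 Attachment type not accepted"
    return None


def check_content(msg: Dict) -> Optional[str]:
    # Placeholder for the expensive step (SpamAssassin, Bayesian scoring, ...).
    return "550 5.7.1 Message rejected as spam" if "cheap pills" in msg.get("body", "") else None


# Ordered from cheapest to most expensive.
CHECKS: List[Tuple[str, Callable[[Dict], Optional[str]]]] = [
    ("helo", check_helo),
    ("blocklist", check_blocklist),
    ("attachments", check_attachments),
    ("content", check_content),
]


def evaluate(msg: Dict) -> Tuple[str, List[str]]:
    """Return the SMTP reply and the names of the checks that actually ran."""
    ran: List[str] = []
    for name, check in CHECKS:
        ran.append(name)
        reply = check(msg)
        if reply is not None:
            return reply, ran  # stop at the first rejection
    return "250 2.0.0 OK", ran
```

Because evaluation stops at the first rejection, the expensive content check only ever sees mail that survived everything cheaper.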

Where is the bandwidth savings once we've accepted an entire message,
scanned it, determined it was spam, then provided a 550 rejection
versus silently dropping?

If you can scan it inline, you can stop, issue a 550 and drop the SMTP
connection any time you want. Like for example, midstream when you
discover a fake header pattern.

You'd start with whatever can be rejected in session - fake HELOs,
blocklist listed IPs, random faked headers, dodgy attachment types
that are more likely to be viruses than not

Then apply the heavier and more cpu intensive filters later, on a much
smaller volume of spam

We already do this.

Maybe not all that much of a bandwidth / cpu saving, but saving remote
postmasters the hassle of troubleshooting lost email is always a good
idea.

After all said methods have been performed and the message gets
through reputation filtering; blacklists; forged/munged headers,
e-mail addresses, and domain names, the message comes in and then
there's that final dot. Up to this point, the message hasn't
proven to be spam until it can be scanned using BrightMail,
SpamAssassin, Bayesian filters, DCC lists, or other methods.
All I'm saying is that once the full DATA submission has completed,
there's no bandwidth savings from silently dropping the message
versus providing a 550 rejection. In the best of all worlds,
it would be nice to give feedback. No system is perfect and a
false-positive rate of less than one in a million "220" accepted
messages seems pretty small.

matthew black
california state university, long beach

Matthew Black wrote:

there's no bandwidth savings from silently dropping the message
versus providing a 550 rejection. In the best of all worlds,
it would be nice to give feedback. No system is perfect and a
false-positive rate of less than one in a million "220" accepted
messages seems pretty small.

I thought I had already participated in beating this dead horse sufficiently in multiple threads in multiple forums on multiple occasions. Maybe I am in your killfile or something. If I post again on this topic, I certainly will deserve to be.

Let me ask you this simple question:

If you know at close of DATA whether you are going to actually perform final delivery, what does it cost you to follow standards and issue a 550 instead of a 220 and discard it?

If you use a 550, a real live person sending an email that somehow gets FP will actually benefit.

I am with Suresh on this, just like in the past threads. Search the archive.

I haven't seen any succinct justification for providing a
550 message rejection for positively-identified spam versus
silently dropping the message. Lots of how-to instructions
but no whys.

RFC 2821?

  ...the protocol requires that a server accept responsibility
  for either delivering a message or properly reporting the
  failure to do so.

  ...

  If an SMTP server has accepted the task of relaying the mail
  and later finds that the destination is incorrect or that
  the mail cannot be delivered for some other reason, then
  it MUST construct an "undeliverable mail" notification message
  and send it to the originator of the undeliverable mail (as
  indicated by the reverse-path).

Unless you're the final recipient of the message, you have no business
deleting it. If you've accepted a message, you should either deliver or
bounce it, per RFC requirements.

Elsewhere in 2821 (6.1, to be specific):

   When the receiver-SMTP accepts a piece of mail (by sending a "250 OK"
   message in response to DATA), it is accepting responsibility for
   delivering or relaying the message. It must take this responsibility
   seriously. It MUST NOT lose the message for frivolous reasons, such
   as because the host later crashes or because of a predictable
   resource shortage.

OK? Got that? You '250 OK' it, you got a *serious* responsibility. Losing the
message because the whole damned machine crashes is considered a frivolous reason.

And throwing it away because you don't like the way it looks is OK? Man,
you're in for some severe karmic protocol payback down the road... :wink:
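
One way to read the 6.1 text in practice, sketched under assumptions: do not send the 250 until the message is safely on disk, so a crash right after the reply cannot lose it. The function name spool_then_ack, the file suffixes and the spool layout are invented for illustration; real MTAs have their own queue formats.

```python
import os
import tempfile


def spool_then_ack(message: bytes, spool_dir: str) -> str:
    """Write the message durably, and only then return a 250 reply."""
    tmp_path = None
    try:
        fd, tmp_path = tempfile.mkstemp(dir=spool_dir, suffix=".tmp")
        with os.fdopen(fd, "wb") as spool_file:
            spool_file.write(message)
            spool_file.flush()
            os.fsync(spool_file.fileno())  # force the data to stable storage
        final_path = tmp_path[:-len(".tmp")] + ".msg"
        os.replace(tmp_path, final_path)   # atomic rename into the queue
        # A stricter version would also fsync the directory itself.
    except OSError:
        if tmp_path is not None and os.path.exists(tmp_path):
            os.unlink(tmp_path)
        # Nothing was accepted, so the sender keeps responsibility and retries.
        return "451 4.3.0 Requested action aborted: local error in processing"
    return "250 2.0.0 OK: queued as " + os.path.basename(final_path)
```

If the write fails, the sketch answers with a temporary 451 instead of 250, which leaves responsibility for the message with the sending side.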

Earlier today, I said:

Unless you're the final recipient of the message, you have no business
deleting it. If you've accepted a message, you should either deliver or
bounce it, per RFC requirements.

I just want to clarify that I was in no way suggesting that anyone bounce
spam - I was merely pointing out that if you choose to 250 a message, you
have to deliver it. The much better option is to 550 it after DATA if you
don't like what you see. Silently deleting other people's e-mail should
never even be considered.

Returning to lurk status...

St-

Silently deleting other people's e-mail should never even be considered.

Unless that email is a virus, or a spam with a forged envelope sender.

-bryan bradsby

No, in that case you 550 the sucker.

Matthew Black wrote:

> there's no bandwidth savings from silently dropping the message
> versus providing a 550 rejection. In the best of all worlds,
> it would be nice to give feedback. No system is perfect and a
> false-positive rate of less than one in a million "220" accepted
> messages seems pretty small.

Let me ask you this simple question:

If you know at close of DATA whether you are going to actually perform
final delivery, what does it cost you to follow standards and issue a
550 instead of a 220 and discard it?

If you use a 550, a real live person sending an email that somehow gets
FP will actually benefit.

In today's world, at least with the spamtorrent I see at my clients,
that's just untrue. If your filtering is set up well, and you mark
an e-mail as SPAM, it almost certainly is (yes, I'll certainly concede
FP's exist, but again, it almost certainly doesn't matter that much in
that teensy number of occurrences); and 99-plus-percent of spam
is emitted from spambots who don't give a $expletive about return
status one way or another. If you're worrying about "no-status" in
the context of FP's, then your filtering isn't set up well, which really
means you've got larger problems.

I am with Suresh on this, just like in the past threads. Search the archive.

Though not contradicting what I just wrote, so am I. However, header-forged
and multi-chained spam from firehose-like spambots don't play by any of our
rules; all they do is blast away in a largely one-way transaction (guess
which direction!). A 550 nowadays has nowhere to "go" (and those
"commercial", a.k.a. "legit", spamhouses don't wash their lists even on 550's).

> I haven't seen any succinct justification for providing a
> 550 message rejection for positively-identified spam versus
> silently dropping the message. Lots of how-to instructions
> but no whys.

RFC 2821?

  ...the protocol requires that a server accept responsibility
  for either delivering a message or properly reporting the
  failure to do so.

Your statement is open to multiple interpretations. I argue that
anytime our system identifies a message as spam that it gets
delivered to the system bit bucket.

RFC-821 and netiquette also "mandate" e-mail be properly addressed.
System manufacturers and administrators make compromises because
strict adherence to the rules is not always possible from an
operational perspective.

Elsewhere in 2821 (6.1, to be specific):

  When the receiver-SMTP accepts a piece of mail (by sending a "250 OK"
  message in response to DATA), it is accepting responsibility for
  delivering or relaying the message. It must take this responsibility
  seriously. It MUST NOT lose the message for frivolous reasons, such
  as because the host later crashes or because of a predictable
  resource shortage.

Lost me on that part about crashes being frivolous reasons.
This is a political statement not an indisputable matter of fact.

OK? Got that? You '250 OK' it, you got a *serious* responsibility. Losing the
message because the whole damned machine crashes is considered a frivolous reason.

And throwing it away because you don't like the way it looks is OK? Man,

.............................***

you're in for some severe karmic protocol payback down the road... :wink:

I'm not the one throwing them away and never looking at them; watch
the finger wagging. And thanks for the karma heads up, Buddha.

matthew black
california state university, long beach

Aha, so there are situations where this is acceptable?
What about deleting viral attachments or altering subject
lines...is that permissible? The sweeping generalizations
I've read leave little room for responding to real-world
situations.

matthew black
california state university, long beach

Steve Thomas wrote:

Earlier today, I said:

Unless you're the final recipient of the message, you have no business
deleting it. If you've accepted a message, you should either deliver or
bounce it, per RFC requirements.
   
I just want to clarify that I was in no way suggesting that anyone bounce
spam - I was merely pointing out that if you choose to 250 a message, you
have to deliver it. The much better option is to 550 it after DATA if you
don't like what you see. Silently deleting other people's e-mail should
never even be considered.

I wholeheartedly agree with this policy, and I strive wherever possible to enforce it in every place I work. Whenever people get listed in SORBS for backscatter, I work with them and tell them how they can do this.

With the current technologies available there is no reason a small-to-medium organisation cannot virus and spam scan mail inline at the SMTP transaction stage. (Even the Barracudas can run SpamAssassin scans at around 8 messages per second; my previous employer was receiving around 4 messages per second, which translated to 1-2 million emails per day.)

It is possible to do inline scanning in larger ISPs (I personally have configured a 'system' to handle up to 90 messages per second of inline scanning), though it requires a lot more planning, thought, and careful consideration.

Regards,

Mat