SMTP no-such-user issues

Hi everyone,

We are experiencing an issue with SMTP MTA relay responses regarding 'no such user', and it appears to occur only when one particular site attempts to deliver email to us. Any advice on how to troubleshoot this further would be greatly appreciated.

I want to keep this as brief as possible, so please bear with me. We have a cluster of Barracuda Spam Firewalls operating as an SMTP server/client to a cluster of Qmail MTA/MDAs (with CHKUSR in place).

In short, when a user from Sympatico (which now delivers via Hotmail) sends an email to a list of addresses at our domain of which only a single one is bad, a response is sent back to the Sympatico client advising that ALL the destination addresses were bad, which is not the case. It doesn't matter whether the recipients were listed in the RCPT field, or within the DATA field as Cc: or Bcc:. This is a nightmare on the helpdesk.

When I attempt to simulate the issue from ANY other mail system that I've tested (my own personal server, Gmail, etc.), the legitimate addresses receive the message, and the relay agent in the test scenario sends back a 551 for ONLY the invalid recipient, without including the valid ones.
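
One way to reproduce this comparison without sending real mail is to issue MAIL FROM and one RCPT TO per address, record each reply, and then reset the transaction. The following is a minimal sketch, not anything taken from the setup described above. The function names, the host, and the addresses in the usage note are hypothetical; `smtplib.SMTP.mail`, `rcpt`, and `rset` are standard-library calls.

```python
import smtplib


def probe_recipients(smtp, sender, recipients):
    """Send MAIL FROM and one RCPT TO per address; return {address: (code, text)}."""
    code, text = smtp.mail(sender)
    if code != 250:
        raise RuntimeError(f"MAIL FROM rejected: {code} {text!r}")
    results = {}
    for address in recipients:
        # Each RCPT TO draws its own reply, so one bad address should
        # not change the verdict on the others.
        results[address] = smtp.rcpt(address)
    smtp.rset()  # abandon the transaction; no message is delivered
    return results


def probe_host(host, sender, recipients, port=25):
    """Connect to a (hypothetical) MX host and probe the given recipients."""
    with smtplib.SMTP(host, port, timeout=30) as smtp:
        smtp.ehlo()
        return probe_recipients(smtp, sender, recipients)
```

Calling something like `probe_host("mx.example.net", "tester@example.org", ["valid@example.net", "bogus@example.net"])` against your own server should show a 250 for each valid address and a 5xx only for the invalid one.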

Although we have not yet fully verified via logs whether the client SMTP server sends a QUIT before all of the addresses have been verified, it appears so far that the client puts all of the addresses through the standard verification stages prior to DATA and then gives up, even if the invalid recipient is the last one passed to our system.

Can anyone else provide advice on how to troubleshoot this further? Alternatively, could those in my part of North America (Toronto) who have MTA bounce-no-recip in place and access to the Hotmail mail system test it for themselves? I've confirmed that the exact same problem exists (the 551 is nearly identical) from a web-based Hotmail address as well as from Sympatico.

Examples of logs and bounces can be provided if necessary.

Thanks,

Steve

Steve Bertrand wrote:

Hi everyone,

We are experiencing an issue with SMTP MTA relay responses regarding 'no such user', and it appears to occur only when one particular site attempts to deliver email to us. Any advice on how to troubleshoot this further would be greatly appreciated.

Because I have been inundated with off-list replies, I'll post my findings here.

First, a screen cap of me composing a message, in which two of the addresses are easily identified as valid, and two are clearly not:

http://ibctech.ca/nanog_smtp/hotmail_compose.jpg

...when I sent it, I received this:

http://ibctech.ca/nanog_smtp/hotmail_response.txt

So, we'll try it from a different server:

http://ibctech.ca/nanog_smtp/local_compose.jpg

...hmmm, this time I received the message at steveb@eagle.ca (even though the To: field was cropped out of the jpg), and my server generated the following (AFAICT, correct) 551 error(s):

http://ibctech.ca/nanog_smtp/local_err.txt

I'm willing to provide any more documentation required in order to get this issue sorted.

Thank you everyone, I truly appreciate it.

Steve

Please share a packet capture of a working and not working SMTP exchange.

Frank

Frank Bulk - iNAME wrote:

Please share a packet capture of a working and not working SMTP exchange.

In order to provide the highest amount of clarity, could you recommend a specific set of tcpdump command line args that I should use?

Steve

Once you've performed a full capture on port 25, Wireshark does a nice job
of providing an option to extract the relevant conversation by
right-clicking on just one packet in that conversation and choosing
something called "Follow the TCP stream", I believe.

Frank

Frank Bulk - iNAME wrote:

Once you've performed a full capture on port 25, Wireshark does a nice job
of providing an option to extract the relevant conversation by
right-clicking on just one packet in that conversation and choosing
something called "Follow the TCP stream", I believe.

Ok. I've never captured in tcpdump and then imported into Wireshark before, but I'll do some tests, scp the file to my Windows workstation, then follow the stream.

Once I ensure I get a clean stream, I'll post the results.

Steve

Steve Bertrand wrote:

Ok. I've never captured in tcpdump and then imported into Wireshark before, but I'll do some tests, scp the file to my Windows workstation, then follow the stream.

Once I ensure I get a clean stream, I'll post the results.

While I research how to capture with tcpdump in a Wireshark-compatible format, is there anyone here who could perform a simple test against their own domain's email system to confirm or deny what I have been witnessing?

If it can be confirmed that either A) my end is broken, or B) a remote end is broken, I will be content, and can continue with other work.

My mind will rest at ease if someone, with known bounce-no-mbox enabled, can:

- provide me off list (or test for themselves from a remote location) a list of valid, and invalid recipients within their own domain's email infrastructure. It doesn't even matter if you specify which are valid and which ones are not

- create a temporary account on Hotmail (or use a sympatico.ca email address, with whatever outbound servers they specify) and send a message to the same recipients as requested above.

- in the case that you don't want to provide the addresses, and want to test internally, inform me of the overall result

- in the case that I receive the addresses to test from my location, provide me with the results of the Hotmail test so I can compare results

If this is happening to other ops as well as to me, I can justify it to my users, and I can justify it in my own mind. If this is an issue specific to my own network, then I need to know that, as I obviously have work to do.

Thanks to everyone again.

Steve

Wireshark reads pcap files. Write them out with this option on the tcpdump command line:

-w file
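
For a fuller picture of the capture command, here is a minimal sketch that assembles a tcpdump invocation around `-w`. The interface name `em0` and the output file `smtp.pcap` are placeholders, not values from this thread. The other flags (`-i`, `-n`, `-s 0`) are standard tcpdump options chosen here as an assumption about what a full port 25 capture needs.

```python
import shlex


def tcpdump_command(interface, outfile, port=25):
    """Return the argv list for an untruncated capture of one TCP port."""
    return [
        "tcpdump",
        "-i", interface,  # capture interface (placeholder name in the example)
        "-n",             # do not resolve addresses to names while capturing
        "-s", "0",        # snap length 0: avoid truncating packet payloads
        "-w", outfile,    # write raw packets to a pcap file for Wireshark
        "tcp", "port", str(port),
    ]


print(shlex.join(tcpdump_command("em0", "smtp.pcap")))
# prints: tcpdump -i em0 -n -s 0 -w smtp.pcap tcp port 25
```

The resulting file can be copied to a workstation and opened directly in Wireshark.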

Nathan Ward wrote:

Wireshark reads pcap files. Spit them out with this option on the tcpdump commandline.

I'm capturing this now.

In the meantime, I had assistance off-list from someone within an external domain, and we confirmed that the problem is NOT solely Hotmail, yet it is not solely my end (at least I'm not completely convinced).

I feel quite a bit more relaxed now, although the problem is not resolved.

Hotmail-hosted domains are the only sites where we have noticed this problem; however, I'm now certain that there could be more. Most others are confirmed to work properly, most notably Gmail.

It is also not solely related to the Barracuda. Another SMTP server within the same network, which is not located behind the 'cuda cluster, is experiencing the same issue. The only common ground is that both environments run Qmail: the 'cuda setup with no filtering, and the non-'cuda setup with SA and ClamAV called by Simscan.

We're back to square one, but now I know to point squarely at my configuration to find out why this is happening.

My sincerest thanks for all of the on- and off-list help that I have received with this issue. I have learned a tremendous amount along the way.

Thank you to everyone who has provided the patience and willingness to help, and those that are continuing to do so.

If it does turn out to be an implementation issue in any part of the software chain we operate here, we will do our best to document it and provide patches to the original source.

Steve

Shane Short wrote:

are you using vpopmail with your qmail install? (I can't seem to load your errors again, I recall they were chkuser failures)

Yes, vpopmail against MySQL.

I've had this problem before when I've run out of MySQL connections and vchkuser was then failing.

Thanks for the feedback, but this is not the case.

The problem has been identified as extraordinarily site-specific. I'm working through the recommendations from members who mailed me off-list, one at a time, and am proceeding with testing.

It would be of great value if other mail admins who are running Qmail that have a few minutes could contact me off list in order to compare some quick test results/config setups.

Steve

Steve Bertrand wrote:

The problem has been identified to be extraordinarily site specific. I'm attempting to follow the recommendations of members that mailed me off list one at a time, and am proceeding with testing.

It would be of great value if other mail admins who are running Qmail that have a few minutes could contact me off list in order to compare some quick test results/config setups.

Well, to draw this toward a conclusion (not yet tested): someone suggested off-list that I look at the PIPELINING extension.

After reviewing this possibility, I find it fits the issues I have been experiencing. I will disable it purely as a test. However, after scouring the web to weigh the benefits against the drawbacks, I've found a fairly general consensus that disabling pipelining because of a single poorly implemented client would erase the performance gains that well-behaved clients achieve with it.

If this does turn out to be the problem, then the users of the single site who are witnessing it, and who are out of my control, can deal with sorting through the apparently 'faulty' list of addresses they receive in the bounce, even though the problem appears to originate from my end.

Does this theory sound reasonable, and is keeping pipelining in place a recommended option?
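
For context on the theory: with PIPELINING, a client may send MAIL FROM and several RCPT TO commands in one batch, and it then has to match each reply to its command in order. The sketch below illustrates that pairing in general terms. It is not a description of what Hotmail or Qmail actually do. The addresses and reply texts are made up, and multi-line replies are ignored for brevity.

```python
def pair_replies(commands, replies):
    """Match pipelined SMTP commands to their replies, in order."""
    if len(commands) != len(replies):
        raise ValueError("every pipelined command must get exactly one reply")
    return list(zip(commands, replies))


def accepted_recipients(pairs):
    """Return the addresses whose RCPT TO command drew a 2xx reply."""
    accepted = []
    for command, reply in pairs:
        if command.upper().startswith("RCPT TO:") and reply[:1] == "2":
            # "RCPT TO:<user@host>" -> "user@host"
            accepted.append(command[8:].strip().strip("<>"))
    return accepted


# Illustrative batch: three recipients, only the middle one is invalid.
commands = [
    "MAIL FROM:<sender@example.org>",
    "RCPT TO:<good1@example.net>",
    "RCPT TO:<bad@example.net>",
    "RCPT TO:<good2@example.net>",
]
replies = [
    "250 ok",
    "250 ok",
    "550 sorry, no mailbox here by that name",
    "250 ok",
]

print(accepted_recipients(pair_replies(commands, replies)))
# prints: ['good1@example.net', 'good2@example.net']
```

A client that pairs replies this way reports only the one bad address. A client that loses track of the pairing, or gives up on the first reply it does not understand, could plausibly report every recipient as failed.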

Steve

Steve Bertrand wrote:

Hi everyone,

We are experiencing an issue with SMTP MTA relay responses regarding 'no such user', and it appears to occur only when one particular site attempts to deliver email to us.

For the sake of completeness...

The problem was traced to a macro definition in chkuser:

"But I found the problem. chkuser_settings.h shows:
#define CHKUSER_NORCPT_STRING "511 sorry, no mailbox here by that name (#5.1.1 - chkuser)\r\n"

I changed the 511 to 550 (as shown in RFC 821, Simple Mail Transfer Protocol)"

I'm also told that version 2.09 of chkuser works around this problem.
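
To show why the code matters, here is a minimal sketch that extracts the three-digit code from a reply string like the one in the define above and checks it against the reply codes listed in RFC 821. The function names are hypothetical, and the check only shows that 511 is not in that list while 550 is. It does not model how any particular client reacts to an unlisted code.

```python
import re

# Reply codes enumerated in RFC 821.
RFC821_REPLY_CODES = {
    211, 214, 220, 221, 250, 251, 354,
    421, 450, 451, 452,
    500, 501, 502, 503, 504,
    550, 551, 552, 553, 554,
}


def reply_code(reply):
    """Return the leading three-digit code of an SMTP reply line."""
    match = re.match(r"(\d{3})[ -]", reply)
    if match is None:
        raise ValueError(f"not an SMTP reply: {reply!r}")
    return int(match.group(1))


def is_defined(code):
    """True if the code appears in the RFC 821 reply code list."""
    return code in RFC821_REPLY_CODES


old = "511 sorry, no mailbox here by that name (#5.1.1 - chkuser)\r\n"
new = old.replace("511", "550", 1)  # the change described above

for reply in (old, new):
    code = reply_code(reply)
    verdict = "defined" if is_defined(code) else "not defined"
    print(code, verdict, "in RFC 821")
```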

For those who have recommended Postfix: I'd love to switch; however, Qmail is tied so tightly into my mail infrastructure at this point that I don't think it would be possible without months and months of planning and redeveloping a whole lot of internal management software.

Thanks everyone,

Steve