Is Hotmail in the habit of ignoring MX records?

William_Herrin · July 30, 2012, 1:46am

> In message <B59A4092-CE2F-44E4-84F9-77C18493AD95@kapu.net>, Michael J Wise writes:
>> And maybe an endless loop for an MX lookup might be what is causing
>> hotmail to panic and throw out the MX records.
>
> You don't look up MX records for MX targets. This is basic MTA
> processing.

Correct. An MX record points to a label containing one or more address
records. It does not chain. In principle the MX record could point to
a CNAME record which then chains until it reaches an address record
but I wouldn't depend on such a configuration working correctly. Ditto
the MX lookup fetching a CNAME which chains until it reaches a label
with an MX record.

> You don't depend on ALL (ANY) returning MX records as they may not
> be in the cache. You need to make an explicit MX query if no
> MX records are returned in response to an ALL query.

Also correct.

> If the MX lookup fails, as opposed to returning nodata, you don't
> look up the A/AAAA records and synthesize an MX record. You treat it
> as a soft error and queue for retry later. Again this is basic MTA
> processing.

Maybe. In principle this is correct but as you wander through various
bits of software in the name lookup process (which often consults more
than just the DNS -- even today DNS isn't the only game in town) it's
pretty easy to lose track of the difference between lookup failure and
success:no data.

Think about it... how is the MTA to respond if the primary lookup
reports success:no data (e.g. /etc/hosts) but a second tier lookup
(e.g. DNS) reports lookup failure? What if DNS is third tier and the
second tier is some kind of CIFS or NIS lookup which fails? Or reports
success:no data? Or the DNS gets translated through a middleman (like
NIS) which doesn't preserve the difference between failure and
success:no data? Does the whole lookup fail because part did? Gets ambiguous.

Further, falling back to the address lookup in the absence of MX
records is correct behavior for an MTA.

What *should* happen here is that the guy's web server should reject
the port 25 connection (an SMTP soft fail condition) and on the next
retry hotmail should find the MX record and follow it.

Either way, I think I'd have to consider this -advanced- MTA
processing. You have to really know your stuff to get this one right.

Regards,
Bill Herrin

Mark_Andrews2 · July 30, 2012, 5:03pm

>> In message <B59A4092-CE2F-44E4-84F9-77C18493AD95@kapu.net>, Michael J Wise writes:
>>> And maybe an endless loop for an MX lookup might be what is causing
>>> hotmail to panic and throw out the MX records.
>>
>> You don't look up MX records for MX targets. This is basic MTA
>> processing.

> Correct. An MX record points to a label containing one or more address
> records. It does not chain. In principle the MX record could point to
> a CNAME record which then chains until it reaches an address record
> but I wouldn't depend on such a configuration working correctly. Ditto
> the MX lookup fetching a CNAME which chains until it reaches a label
> with an MX record.

>> You don't depend on ALL (ANY) returning MX records as they may not
>> be in the cache. You need to make an explicit MX query if no
>> MX records are returned in response to an ALL query.

> Also correct.

>> If the MX lookup fails, as opposed to returning nodata, you don't
>> look up the A/AAAA records and synthesize an MX record. You treat it
>> as a soft error and queue for retry later. Again this is basic MTA
>> processing.

> Maybe. In principle this is correct but as you wander through various
> bits of software in the name lookup process (which often consults more
> than just the DNS -- even today DNS isn't the only game in town) it's
> pretty easy to lose track of the difference between lookup failure and
> success:no data.

But the DNS is the only one that returns MX records. If that step
errors you need to retry later. If you get NXDOMAIN you go on to
other address sources.

> Think about it... how is the MTA to respond if the primary lookup
> reports success:no data (e.g. /etc/hosts) but a second tier lookup
> (e.g. DNS) reports lookup failure? What if DNS is third tier and the
> second tier is some kind of CIFS or NIS lookup which fails?

MX records can't be looked up in /etc/hosts or in CIFS / NIS. You
only look for address records *after* the MX lookup returns no records.

> Or reports
> success:no data. Or the DNS gets translated through a middleman (like
> NIS) which doesn't preserve the difference between fail and success:no
> data. Does the whole lookup fail because part did? Gets ambiguous.
>
> Further, falling back to the address lookup in the absence of MX
> records is correct behavior for an MTA.

The key words above are "in the absence". Until you have determined
that they are absent you don't fall back.

> What *should* happen here is that the guy's web server should reject
> the port 25 connection (an SMTP soft fail condition) and on the next
> retry hotmail should find the MX record and follow it.

No. It is perfectly legal for A to accept mail for B, B for C, C
for D and D for A with all mail being delivered to a host with a
different name than the mail domain. It is not and never has been
correct processing to look up address records for a domain if the
MX lookup fails. nodata/nxdomain are not failures.
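
The rule argued in this reply can be written down as a small decision function. This is only a sketch of the position stated above, not code from any particular MTA; the names MxOutcome, Action and next_step are hypothetical and chosen for illustration.

```python
from enum import Enum


class MxOutcome(Enum):
    """Possible results of the explicit MX query (hypothetical names)."""
    RECORDS = "records"    # one or more MX records came back
    NODATA = "nodata"      # the name exists but has no MX records
    NXDOMAIN = "nxdomain"  # the name does not exist in the DNS
    FAILURE = "failure"    # the lookup itself failed (timeout, SERVFAIL, ...)


class Action(Enum):
    USE_MX = "use_mx"
    LOOK_UP_ADDRESS = "look_up_address"
    QUEUE_AND_RETRY = "queue_and_retry"


def next_step(outcome: MxOutcome) -> Action:
    """Pick the next step after the MX query, per the reply above."""
    if outcome is MxOutcome.RECORDS:
        return Action.USE_MX
    # nodata and NXDOMAIN are answers, not failures: MX records are
    # known to be absent, so the address fallback is allowed.
    if outcome in (MxOutcome.NODATA, MxOutcome.NXDOMAIN):
        return Action.LOOK_UP_ADDRESS
    # A failed lookup says nothing about whether MX records exist,
    # so treat it as a soft error and try again later.
    return Action.QUEUE_AND_RETRY
```

The point of separating FAILURE from NODATA is that only the latter establishes "absence" in the sense used above.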

> Either way, I think I'd have to consider this -advanced- MTA
> processing. You have to really know your stuff to get this one right.

No. This is the behaviour you get with an MX-oblivious MTA.

William_Herrin · July 30, 2012, 8:07pm

Hi Mark,

If you can reference where in the SMTP RFC it offers an authoritative
explanation of what to do when merging results from various naming
systems where one but not all of the naming systems has generated an
error then let's read it. If not... your common sense says one thing,
mine says another and folks implementing mail systems should be aware
of the implications.

Until then, my view is that a lookup failure when seeking an MX record
should only block the MTA from seeking an address record in the DNS.
It should still seek an address record in higher priority naming
systems and use it if it finds one. If correct, and I think it is,
that's a pretty subtle thing to program for... something easily gotten
wrong.
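
A rough sketch of the policy described in the paragraph above, to show where the subtlety lies. Everything here is hypothetical: the Source tuple, the resolve_address function and the idea that each naming system is a callable returning an address or None. Whether tiers ranked below the DNS should still be consulted is not stated above; this sketch assumes that only the DNS tier is skipped.

```python
from typing import Callable, NamedTuple, Optional, Sequence


class Source(NamedTuple):
    """One naming system in priority order (hypothetical structure)."""
    name: str
    is_dns: bool
    lookup: Callable[[str], Optional[str]]  # address, or None for "no data"


def resolve_address(domain: str,
                    sources: Sequence[Source],
                    mx_lookup_failed: bool) -> Optional[str]:
    """Return the first address found, skipping DNS if the MX lookup failed."""
    for source in sources:
        if source.is_dns and mx_lookup_failed:
            # The failed MX query only blocks the DNS address lookup;
            # other naming systems are still consulted (assumption).
            continue
        address = source.lookup(domain)
        if address is not None:
            return address
    # Nothing usable; the caller would queue the message if the MX
    # lookup failed, or bounce it if every source reported no data.
    return None


# Example tiers: /etc/hosts-style table first, then DNS.
hosts = Source("hosts", False, {"mail.example": "192.0.2.10"}.get)
dns = Source("dns", True, {"mail.example": "198.51.100.5",
                           "other.example": "198.51.100.6"}.get)
```

With these tiers, a failed MX lookup still yields the hosts-file address for mail.example, but yields nothing for other.example, which is known only to the DNS.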

Regards,
Bill Herrin

Valdis_Kletnieks · July 30, 2012, 8:26pm

RFC5321, section 5.1 is pretty clear on it:

5.1. Locating the Target Host

   Once an SMTP client lexically identifies a domain to which mail will
   be delivered for processing (as described in Sections 2.3.5 and 3.6),
   a DNS lookup MUST be performed to resolve the domain name (RFC 1035
   [2]). The names are expected to be fully-qualified domain names
   (FQDNs): mechanisms for inferring FQDNs from partial names or local
   aliases are outside of this specification.

The Internet uses DNS. You use some other scheme at your own peril,
and probably shouldn't expect said other scheme to work outside the
range of your administrative control.

James_Hess · July 31, 2012, 2:27am

Aside from that, RFC 974 (page 3) gives mailers significant leeway in
deciding how to handle errors:

"Mailers are expected to do something reasonable in the face of an
error. The behaviour for each type of error is not specified here,
but implementors should note that different types of errors should
probably be treated differently."

Attempting to find another path for an apparently unroutable message
(all MX offline) is not entirely out of the question. You may not
assume that such measures will not be attempted, if anyone could
consider it a 'reasonable' error handling procedure.

I will echo that; go back to the robustness principle of being
liberal in what you accept.... You should either not listen on port
25, or you should not create that A record pointing to a mail
server that won't actually accept mail.

When "yourdomain.example.com" has an A record, all the services
listening on that address are services for the domain.

"Relay not allowed" to the same domain may be considered
nonsensical, and a mailer that converts its error recovery attempt
into a permanent error at that point may be acting reasonably.

Tony_Finch · August 2, 2012, 4:04pm

The relevant spec is RFC 5321 section 5.

Tony.

William_Herrin · August 2, 2012, 4:20pm

>> If you can reference where in the SMTP RFC it offers an authoritative
>> explanation of what to do when merging results from various naming
>> systems where one but not all of the naming systems has generated an
>> error then let's read it.

> RFC5321, section 5.1 is pretty clear on it:
>
> 5.1. Locating the Target Host
>
>    Once an SMTP client lexically identifies a domain to which mail will
>    be delivered for processing (as described in Sections 2.3.5 and 3.6),
>    a DNS lookup MUST be performed to resolve the domain name (RFC 1035
>    [2]). The names are expected to be fully-qualified domain names
>    (FQDNs): mechanisms for inferring FQDNs from partial names or local
>    aliases are outside of this specification.

Well there you have it. Mechanisms for determining whether a name is
intended to be acquired from the DNS are _outside the scope of the
RFC_. So, the specifics of merging results from multiple naming
systems are left to the implementer without IETF guidance. Clear as
mud. And when the RFC goes on to say:

   If an empty list of MXs is returned,
   the address is treated as if it was associated with an implicit MX
   RR, with a preference of 0, pointing to that host.

one reasonable interpretation is that MX-type lookups only apply to
lookups from the DNS system, another is that both the MX lookup and
host lookup have to come from the same naming system, and a third
reasonable interpretation is that they do not. And that's true even
though the RFC also says:

   If a temporary error is returned, the message MUST be queued
   and retried later

because the MTA *did* successfully acquire the _implicit MX_ from one
of the name systems it uses.

Chalk this up as a point that "needs work" in the next XX21 RFC.
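
For reference, the two RFC passages quoted above can be sketched for the DNS-only case as follows. This is a minimal illustration, not an implementation of the RFC: the names MxRecord, TemporaryLookupError, mx_candidates and plan_delivery are hypothetical, lookup_mx stands in for whatever resolver call an MTA actually uses, and the sketch says nothing about the multiple-naming-system question raised here.

```python
from typing import Callable, List, Tuple

MxRecord = Tuple[int, str]  # (preference, target host)


class TemporaryLookupError(Exception):
    """Hypothetical stand-in for a temporary DNS error (timeout, SERVFAIL)."""


def mx_candidates(domain: str, mx_records: List[MxRecord]) -> List[MxRecord]:
    """Order the MX list, or synthesize the implicit MX if it is empty."""
    if not mx_records:
        # Empty list: treat the domain itself as an implicit MX
        # with preference 0.
        return [(0, domain)]
    # Lowest preference value is tried first. Ordering among equal
    # preferences is left as returned, which is a simplification.
    return sorted(mx_records, key=lambda record: record[0])


def plan_delivery(domain: str,
                  lookup_mx: Callable[[str], List[MxRecord]]):
    """Return ("queue", []) on a temporary error, else the host list."""
    try:
        records = lookup_mx(domain)
    except TemporaryLookupError:
        # Temporary error: queue and retry later, no fallback.
        return ("queue", [])
    return ("deliver", mx_candidates(domain, records))
```

The ambiguity discussed above is hidden inside lookup_mx: the sketch assumes a single naming system that either answers or raises, which is exactly the assumption being questioned.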

> The Internet uses DNS. You use some other scheme at your own peril,
> and probably shouldn't expect said other scheme to work outside the
> range of your administrative control.

"The Internet" uses a far broader set of technologies than you give it
credit for. And it routinely uses the DNS badly. Structure your
systems with that understanding or pay for your negligence with
malfunction.

Regards,
Bill Herrin