QWEST.NET can you fix your nameservers

In case anyone is wondering why I've been harping on about EDNS
compliance, this is why. Failure to follow the protocol can result
in DNS lookup failures. nara.gov is signed and the recursive server
performs DNSSEC validation and sends queries with DNS COOKIEs.

BADVERS is NOT a valid response to an EDNS version 0 query.
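
[Editorial sketch] To make the BADVERS point concrete: the extended RCODE, the EDNS version and the DO flag all travel in the TTL field of the OPT record (RFC 6891). The Python below is a minimal sketch of decoding that field and of checking whether a BADVERS response is even plausible for the version that was asked. The function names are illustrative assumptions, not part of any DNS library, and the check assumes the server reports the highest version it implements, as RFC 6891 asks.

```python
BADVERS = 16  # extended RCODE defined in RFC 6891


def decode_opt_ttl(ttl, header_rcode):
    """Split the 32-bit TTL field of an OPT record into its EDNS parts.

    Layout per RFC 6891: extended RCODE (8 bits), VERSION (8 bits),
    DO flag (1 bit), reserved Z bits (15 bits).
    """
    ext_rcode = (ttl >> 24) & 0xFF
    return {
        # The full RCODE is the extended bits placed above the 4 header bits.
        "rcode": (ext_rcode << 4) | (header_rcode & 0x0F),
        "version": (ttl >> 16) & 0xFF,
        "do": bool(ttl & 0x8000),
    }


def badvers_is_plausible(query_version, response):
    """Return False when BADVERS answers a version the server implements."""
    if response["rcode"] != BADVERS:
        return True
    # The VERSION in a BADVERS response is the highest one the server
    # implements, so the query's version must have been higher than that.
    return query_version > response["version"]
```

A version 0 query can never be "too new", so under these assumptions `badvers_is_plausible(0, ...)` is always False for a BADVERS response.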

Can you please contact your DNS vendor for a fix?

QWEST isn't the only DNS provider that has broken nameservers. One
shouldn't have to try and contact every DNS operator to get them to
use protocol compliant servers.

Mark

;; BADCOOKIE, retrying.

; <<>> DiG 9.11.0rc1 <<>> nara.gov
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: SERVFAIL, id: 5744
;; flags: qr rd ra; QUERY: 1, ANSWER: 0, AUTHORITY: 0, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 4096
; COOKIE: 85faf1e39a1a6a149bebd00a57da4b266b8546c1b75015db (good)
;; QUESTION SECTION:
;nara.gov. IN A

;; Query time: 5000 msec
;; SERVER: 127.0.0.1#53(127.0.0.1)
;; WHEN: Thu Sep 15 17:17:58 EST 2016
;; MSG SIZE rcvd: 65

Checking: 'nara.gov' as at 2016-09-15T07:16:32Z

nara.gov @63.150.72.5 (sauthns1.qwest.net.): dns=ok edns=ok edns1=ok edns@512=ok ednsopt=badvers,nosoa edns1opt=ok do=nodo ednsflags=ok edns@512tcp=ok optlist=badvers,nosoa
nara.gov @2001:428::7 (sauthns1.qwest.net.): dns=ok edns=ok edns1=ok edns@512=ok ednsopt=badvers,nosoa edns1opt=ok do=nodo ednsflags=ok edns@512tcp=ok optlist=badvers,nosoa
nara.gov @208.44.130.121 (sauthns2.qwest.net.): dns=ok edns=ok edns1=ok edns@512=ok ednsopt=badvers,nosoa edns1opt=ok do=nodo ednsflags=ok edns@512tcp=ok optlist=badvers,nosoa
nara.gov @2001:428::8 (sauthns2.qwest.net.): dns=ok edns=ok edns1=ok edns@512=ok ednsopt=badvers,nosoa edns1opt=ok do=nodo ednsflags=ok edns@512tcp=ok optlist=badvers,nosoa

The Following Tests Failed

EDNS - Unknown Option Handling (ednsopt)

dig +nocookie +norec +noad +ednsopt=100 soa zone @server
expect: SOA
expect: NOERROR
expect: OPT record with version set to 0
expect: that the option will not be present in response
See RFC6891, 6.1.2 Wire Format
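
[Editorial sketch] For readers who want to see what the ednsopt probe puts on the wire, the Python below builds a non-recursive SOA query carrying an OPT record with one unknown option (code 100, empty payload), roughly what the dig command above sends. It only constructs the bytes; nothing is sent. The helper names, the fixed query ID and the 4096-byte UDP size are assumptions made for illustration.

```python
import struct

TYPE_SOA = 6
TYPE_OPT = 41
CLASS_IN = 1


def encode_name(name):
    """Encode a domain name as uncompressed DNS labels."""
    out = b""
    for label in name.rstrip(".").split("."):
        if label:
            raw = label.encode("ascii")
            out += bytes([len(raw)]) + raw
    return out + b"\x00"  # root label terminates the name


def build_query(name, qtype, option_code, option_data=b"",
                query_id=0x1234, udp_size=4096):
    """Build a query with an EDNS version 0 OPT record holding one option."""
    # Flags are all zero: RD is clear (+norec) and AD is clear (+noad).
    header = struct.pack("!HHHHHH", query_id, 0, 1, 0, 0, 1)
    question = encode_name(name) + struct.pack("!HH", qtype, CLASS_IN)
    option = struct.pack("!HH", option_code, len(option_data)) + option_data
    # OPT RR: root owner name, TYPE 41, CLASS carries the UDP size,
    # TTL of 0 means extended RCODE 0, version 0, DO clear.
    opt_rr = b"\x00" + struct.pack("!HHIH", TYPE_OPT, udp_size, 0,
                                   len(option)) + option
    return header + question + opt_rr


packet = build_query("nara.gov", TYPE_SOA, option_code=100)
```

A compliant server is expected to ignore option 100 and answer with NOERROR, the SOA, and an OPT record of its own with version 0.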

EDNS - DO=1 (do)

dig +nocookie +norec +noad +dnssec soa zone @server
expect: SOA
expect: NOERROR
expect: OPT record with version set to 0
expect: DO flag in response if RRSIG is present in response
See RFC3225

EDNS - Supported Options Probe (optlist)

dig +edns +noad +norec +nsid +subnet=0.0.0.0/0 +expire +cookie -q zone @server
expect: NOERROR
expect: OPT record with version set to 0
See RFC6891

Codes

ok - test passed.
nodo - EDNS DO flag not echoed.
nosoa - SOA record not found when expected.
badvers - BADVERS returned.

To retrieve this report in the future: https://ednscomp.isc.org/ednscomp/25f2ebe619

Save yourself some time. Contact the DNS software vendors. :wink:

-A

I'd bet he already has. This looks like a name-and-shame to me, and
probably deserved.

-Bill

Often, the vendor *is* on the ball, but the customer isn't.

Remember that Windows XP didn't enable IPv6 by default, and *still* has some 10%
market share.

Yeah, I'm still fighting that battle.

https://goo.gl/photos/xFguK4FL2iydnLhE7

-A

On Thu, Sep 15, 2016 at 12:22 PM, Aaron C. de Bruyn <aaron@heyaaron.com> wrote:
>> QWEST isn't the only DNS provider that has broken nameservers. One
>> shouldn't have to try and contact every DNS operator to get them to
>> use protocol compliant servers.
>
> Save yourself some time. Contact the DNS software vendors. :wink:

I'd bet he already has. This looks like a name-and-shame to me, and
probably deserved.

-Bill

Aaron,
       How am I supposed to know which DNS vendor to contact? DNS
server fingerprinting is not an exact science. After that I then
still need to work out how to contact every operator of a broken
server and get them to contact the DNS vendor to get a fix. And
by the way, the SOA RNAME is often a blackhole, or it bounces, or it
is syntactically invalid.

The best way to get this fixed would be for nameservers to be checked
regularly for protocol compliance by the parent zone operators or
their proxies. The child zone operator would be given a short period
(< 3 months) to fix it; after that, all zones with that server get
removed from the parent zone until the server is fixed (apply the
final step in the complaints procedures from RFC 1033), which forces
the owner of the zone to fix the server or to move to someone who
follows the protocol. The servers for new delegations would be checked
immediately, and the delegation would not proceed unless the delegated
servers are protocol compliant.

Everybody seems to think they know how to write a DNS server. The
problem is that most people don't test anything other than simple
queries, and that includes many of the DNS vendors. Think about all
the load balancer vendors that don't handle anything but an A query,
or that handle A and AAAA queries but not DNSKEY queries.
There really is no excuse to not handle non-meta qtypes properly
(no error no data, or name error, depending upon whether the name
exists or not).
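
[Editorial sketch] The rule in parentheses above fits in a few lines. The Python below is a minimal sketch, assuming a toy zone represented as a dict from owner name to the set of type mnemonics present at that name; it ignores wildcards, CNAMEs and delegations, which a real server must also handle. The `classify` name and the zone layout are illustrative assumptions.

```python
def classify(zone, qname, qtype):
    """Return the expected outcome for a query of any non-meta type.

    "ANSWER"   -> NOERROR with data
    "NODATA"   -> NOERROR with an empty answer (the name exists)
    "NXDOMAIN" -> name error (the name does not exist)
    """
    qname = qname.lower().rstrip(".")
    names = {owner.lower().rstrip("."): types for owner, types in zone.items()}
    if qtype in names.get(qname, set()):
        return "ANSWER"
    # A name also exists if something lives below it (empty non-terminal).
    exists = qname in names or any(n.endswith("." + qname) for n in names)
    return "NODATA" if exists else "NXDOMAIN"


zone = {
    "example.com": {"SOA", "NS", "A"},
    "a.b.example.com": {"A"},
}
```

Note that the query type never changes the NODATA versus NXDOMAIN decision: a DNSKEY or TLSA query for an existing name gets the same NODATA treatment as any other type the name does not hold.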

My bet is the DNS vendor has issued an update already and that it
hasn't been applied. If not, Qwest can inform them that their product
is broken. Fixing this should be about 10 minutes for the DNS
vendor, then QA.

If you (collectively) haven't already checked your servers go to
https://ednscomp.isc.org and check your servers. While you are
there look at some of the reports.

If there are any tech reporters out there, can you report on the
issue of non-compliance in DNS servers and that it can lead to
lookups failing? This issue affects everybody.

Mark

> Aaron,
>        How am I supposed to know which DNS vendor to contact? DNS

Sorry--I should have added a /sarcasm tag. :slight_smile:

> The best way to get this fixed would be for nameservers to be checked
> regularly for protocol compliance by the parent zone operators or
> their proxies. The child zone operator would be given a short period
> (< 3 months) to fix it; after that, all zones with that server get
> removed from the parent zone until the server is fixed (apply the
> final step in the complaints procedures from RFC 1033), which forces
> the owner of the zone to fix the server or to move to someone who
> follows the protocol. The servers for new delegations would be checked
> immediately, and the delegation would not proceed unless the delegated
> servers are protocol compliant.

Seems a bit harsh, but I'm new to the conversation. What is being out of
compliance actually hurting other than the nameserver operator and the
zones they host?

> My bet is the DNS vendor has issued an update already and that it
> hasn't been applied. If not, Qwest can inform them that their product
> is broken. Fixing this should be about 10 minutes for the DNS
> vendor, then QA.

Yeah, but the business upgrade cycles are the killer.
Why dedicate resources to fix it unless there's a pretty clear
line-of-sight to lost profits?
That's why so many of my clients refuse to upgrade away from XP. It still
works for what they basically need, and it's not really impacting their
profit in a way the CFO can directly see. (i.e. he doesn't see people like
me who will walk out of a dental office and never come back when I see a
2-plus-year-out-of-date XP machine handling patient information.)

I'm sure the same is happening in a large bureaucracy like Qwest.

Maybe you're right with a harsher penalty. Be standards compliant or
you'll get a warning, then be cut off.

> If you (collectively) haven't already checked your servers go to
> https://ednscomp.isc.org and check your servers. While you are
> there look at some of the reports.

Tested. I'm compliant. I definitely think more comprehensive tools that
are easily accessible to admins and CFOs would help.

For example, when I explain various zone-related things to CFOs, I'll use
http://intodns.com/. It's sorta flashy, and contains some sorta helpful
information that a CFO can sorta understand.

And a big red 'X' when someone is wrong.

Unfortunately it doesn't do DNSSEC. For that, there's another tool.
...and if you want EDNS testing, there's your tool.

A tool that tests compliance for everything and spits out errors, warnings,
and recommendations might go a long ways towards getting people to solve
the problem.

Just my $0.02.

Nice graphs by the way.

-A

That's interesting.

heyaaron.com is one big huge catch-all that funnels into my Google Apps for
Domains mailbox.

There's one account, it has a good password, and it's protected by a YubiKey.

I'd be interested in seeing a copy of the headers from that e-mail.

-A

> Seems a bit harsh, but I'm new to the conversation. What is being out of
> compliance actually hurting other than the nameserver operator and the
> zones they host?

So your helpdesks don't get problem reports when people can't look
up domain names? Recursive DNS vendors don't get bug reports when
domain names can't be looked up? We don't get fixes developed
because there are too many broken servers out there.

Because some servers don't answer EDNS requests, this leads to false
positives: servers are judged not to support EDNS when they do. This
in turn leads to DNSSEC validation failures, as you don't get DNSSEC
answers without EDNS.

IPv6 deployment was put back years because AAAA DNS lookups got
wrong answers.

DANE deployment is slow because DNS servers give bad answers to
_<port>._tcp.<server-name>/TLSA.
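
[Editorial sketch] The TLSA owner name mentioned above is built mechanically from the port, the transport and the server name (RFC 6698). A minimal Python sketch follows; the helper name `tlsa_owner_name` is an assumption for illustration.

```python
def tlsa_owner_name(server_name, port, transport="tcp"):
    """Build the owner name queried for TLSA records, per RFC 6698."""
    if not 0 < port < 65536:
        raise ValueError("port out of range")
    # Underscore-prefixed labels for port and transport, then the host.
    return "_{}._{}.{}.".format(port, transport, server_name.rstrip("."))
```

For example, a client validating an SMTP server at mail.example.com on port 25 would look up the TLSA type at `_25._tcp.mail.example.com.`, and a broken authoritative server that mishandles that query stalls the validation.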

Then there is SPF. A fair portion of the reason why the SPF record
failed, despite it being architecturally cleaner than using TXT
records, is that some nameservers gave bad responses to SPF queries.

I could go find more examples of the cost of DNS protocol
non-compliance.

Ironically, I always wondered why I was told not to publish SPF records, since it did make more sense to have both, and slowly remove the TXT records later. Thanks for the heads up…

What do you think really is best practice now?

Sincerely,

Eric Tykwinski
TrueNet, Inc.
P: 610-429-8300

Hi Mark,

I'm going to stop you there. The SPF record type failed because
resolvers can't pass requests between clients and authoritative
servers unless they can parse them. New DNS record types essentially
require a universal software upgrade before they work. And universal
software upgrades do not happen well or in anything approaching a
timely manner. The next new DNS record type will fail for the same
reason. And the one after that.

TXT is the DNS record type that's extensible without a software
upgrade. Like it or lump it.

Regards,
Bill Herrin

In message <9442FCB1-E039-4EDD-8A0F-F5F351BC99B4@truenet.com>, Eric Tykwinski writes:

> Ironically, I always wondered why I was told not to publish SPF records,
> since it did make more sense to have both, and slowly remove the TXT
> records later. Thanks for the heads up…
>
> What do you think really is best practice now?

For SPF the decision was made to stay with TXT.

The IAB wrote RFC 5507 - Design Choices When Expanding the DNS.

As for testing there is:

https://tools.ietf.org/html/draft-ietf-dnsop-no-response-issue-04

There is general consensus that the tests are correct to the level
that they cover. There can always be more tests added. There is
less consensus on how we get from where we are now to where we need
to be.

The EDNS tests tool was the starting point for this draft.

Mark

>> Then there is SPF. A fair portion of the reason why the SPF record
>> failed, despite it being architecturally cleaner than using TXT
>> records, is that some nameservers gave bad responses to SPF queries.
>
> Hi Mark,
>
> I'm going to stop you there. The SPF record type failed because
> resolvers can't pass requests between clients and authoritative
> servers unless they can parse them. New DNS record types essentially
> require a universal software upgrade before they work. And universal
> software upgrades do not happen well or in anything approaching a
> timely manner. The next new DNS record type will fail for the same
> reason. And the one after that.

Again, lack of DNS compliance. Go read STD 13, then tell me that
Microsoft ships a standards-compliant resolver. They still didn't,
last time I checked.

Libresolv has been able to look up any <qname,qtype,class> tuple
ever since UCB developed it.

You *never* needed universal support for new types. It is a myth.
You just need the authoritative servers to support the type and the
client to support the type. Everything else treats it as an opaque
blob. That is why compression pointers were only allowed for well
known types. That is why records have a length field. What STD
13 missed was a presentation format for unknown types.

There were implementations that got that wrong, including named,
but we fixed that well before SPF ever became an issue.

We have RFC 3597 which allows authoritative servers to load and
save records they don't know the internal structure of. This was
published in September 2003. All the major DNS vendors support it.
This addressed the oversight in STD 13.
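
[Editorial sketch] The RFC 3597 presentation format is simple enough to show in full: an unknown type is written as TYPEnnn, and its RDATA as `\#`, a length, and hex digits. The Python below is a minimal sketch of rendering and parsing that form; the function names are assumptions for illustration, and the sample RDATA is a single character-string as a type 99 (SPF) record would carry.

```python
def to_generic(rrtype, rdata):
    """Render opaque RDATA in the RFC 3597 unknown-type format."""
    # Empty RDATA is written as "\# 0" with no hex digits after it.
    return "TYPE{} \\# {} {}".format(rrtype, len(rdata), rdata.hex()).rstrip()


def from_generic(text):
    """Parse 'TYPEnnn \\# <length> <hex...>' back into (type, rdata)."""
    fields = text.split()
    if (len(fields) < 3 or not fields[0].upper().startswith("TYPE")
            or fields[1] != "\\#"):
        raise ValueError("not RFC 3597 generic syntax")
    rrtype = int(fields[0][4:])
    length = int(fields[2])
    # The hex digits may be split across several whitespace-separated words.
    rdata = bytes.fromhex("".join(fields[3:]))
    if len(rdata) != length:
        raise ValueError("RDATA length mismatch")
    return rrtype, rdata


# One length-prefixed character-string: 11 bytes of text after the 0x0b.
spf_rdata = b"\x0bv=spf1 -all"
line = to_generic(99, spf_rdata)
```

Nothing here needs to know what type 99 means, which is the point: a zone editor or web tool that accepts this syntax can carry any new record type without a software upgrade.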

You have lazy operators that haven't designed their web tools to
support RFC 3597.

Mark

$ fpdns sauthns1.qwest.net.
fingerprint (sauthns1.qwest.net., 63.150.72.5): NLnetLabs NSD 3.1.0 -- 3.2.8 [New Rules]
fingerprint (sauthns1.qwest.net., 2001:428:0:0:0:0:0:7): NLnetLabs NSD 3.1.0 -- 3.2.8 [New Rules]
$ dig +nocookie +noall +answer version.bind ch txt @sauthns1.qwest.net.
version.bind. 0 CH TXT "3.2.2"

https://www.nlnetlabs.nl/projects/nsd/
NSD 3.2.2 - May 18, 2009

Tony.