Question re prevention of enumeration with DNSSEC (NSEC3, etc.)

As I wrote:

But some spam actors
deliberately compared zone file editions to single out additions, and
then harass the owners of newly registered domains, both by e-mail and
phone.

If that is a serious concern, stop whois.

There are various ways, such as crawling the web, to enumerate
domain names.

That is not an efficient method. The gTLD zones are available on approval from the registries via ICANN's CZDS. Many of the ccTLDs do not provide such access.

For example, large companies such as google can obtain enumerated
list of all the current most active domains in the world, which
can, then, be used to access whois.

What Google might obtain would be a list of domain names with websites. The problem is that the web usage rate varies by TLD: some ccTLDs have a web usage rate of over 40% (that is, over 40% of their domain names have developed websites), but some of the new gTLDs have web usage rates below 10%.

Hiding DNS zone information from public is beneficial to powerful
entities such as google.

In some respects, yes. But there is a problem with that because of all the FUD about websites linking to "bad" websites that had been pushed in the media a few years ago. That had an effect of websites no longer linking to others. Any search engine that detected new websites by following links was at a severe disadvantage. There are other methods of detecting new websites but for those without Google's resources, it is a lot more difficult without access to the zone files.

Another factor that is often missed is the renewal rate of domain names. Some of the ccTLDs have very strong renewal rates of over 70%, and .COM typically has a blended renewal rate of over 70%. A blended renewal rate is the renewal rate for all registrations (first year and multi-year). The first year renewal rate varies considerably across ccTLDs and gTLDs and from month to month. Some of the new gTLDs can see upwards of 80% of the zone deleted within a year. With heavily discounted new registrations, it is not unusual to see over 90% of those discounted registrations deleted when they come up for renewal.
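
To make the "blended" figure concrete, here is a rough sketch of the arithmetic. The counts are invented purely for illustration and are not taken from any registry; the function names are my own.

```python
def renewal_rate(renewed, expiring):
    """Fraction of expiring registrations that were renewed."""
    return renewed / expiring if expiring else 0.0


def blended_renewal_rate(first_year, multi_year):
    """Renewal rate over all registrations coming up for renewal.

    Each argument is a (renewed, expiring) pair of counts.
    """
    renewed = first_year[0] + multi_year[0]
    expiring = first_year[1] + multi_year[1]
    return renewal_rate(renewed, expiring)


# Hypothetical counts: weak first year renewals, strong multi-year renewals.
first_year = (48_000, 100_000)
multi_year = (246_000, 300_000)

blended = blended_renewal_rate(first_year, multi_year)
print(f"first year: {renewal_rate(*first_year):.1%}")
print(f"multi-year: {renewal_rate(*multi_year):.1%}")
print(f"blended:    {blended:.1%}")
```

The blended rate is weighted by how many registrations sit in each group, so a TLD with a large base of older registrations can show a healthy blended rate even when most of its new registrations are not renewed.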

Access to WHOIS servers and the records that they return has changed considerably since the infliction of GDPR in May 2018. Some gTLDs are moving to RDAP and will throttle the number of requests by IP or limit them to a maximum number of queries per minute. A lot of personal data such as e-mail addresses, phone numbers and even postal addresses have been removed from gTLD records because of the fear of GDPR. Some of the European ccTLD registries simply don't return any personal information for a WHOIS request. (DENIC is a good example.) Others allow registrants to opt out of public WHOIS.
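
For anyone who has not run into this kind of throttling, a per-IP, per-minute limit can be sketched roughly as below. This is only an illustration of the general idea (a sliding window); I am not claiming any particular registry implements it this way, and the class name, the limit of 3 and the address are made up.

```python
from collections import defaultdict, deque


class PerMinuteLimiter:
    """Sliding-window limiter: at most max_queries per window per client IP."""

    def __init__(self, max_queries, window_seconds=60.0):
        self.max_queries = max_queries
        self.window = window_seconds
        self.history = defaultdict(deque)  # client IP -> accepted query times

    def allow(self, client_ip, now):
        """Return True if a query from client_ip at time `now` is accepted."""
        times = self.history[client_ip]
        # Forget queries that have fallen out of the window.
        while times and now - times[0] >= self.window:
            times.popleft()
        if len(times) >= self.max_queries:
            return False
        times.append(now)
        return True


# Hypothetical limit of 3 queries per minute; times are seconds.
limiter = PerMinuteLimiter(max_queries=3)
decisions = [limiter.allow("192.0.2.1", t) for t in (0, 10, 20, 30, 61)]
print(decisions)
```

The fourth query is refused because three were already accepted within the previous minute. The fifth is accepted because the first has aged out of the window by then.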

One of the largest "registrations as a service" gTLD registrars (it helps resellers resell gTLD domain names without them having to become ICANN accredited registrars) unilaterally decided to go dark on WHOIS data a few years ago by removing registrant data.

The zones change. New domain names are registered and domain names are deleted. For many TLDs, the old WHOIS model of registrant name, e-mail and phone number no longer exists. And there are also WHOIS privacy services which have obscured ownership.

Regards...jmcc

I have found that some people who are concerned about such things will have Let's Encrypt certs for many of the same hosts they were worried about, which of course makes the DNS zone enumeration issue moot: any CA-signed certs are already public these days.

Doesn't make the issue completely moot, but the reality is if you're exposing something to the internet, there's plenty of ways for it to leak out, so best not to make it public to begin with.

Tangentially related today is the news that all your "private channel" names are actually completely public on Discord[1], which was also true for Slack for many years, with their security folks claiming it's totally no problem that anyone can see you have a channel named secret-jv-announcing-next-month-with-company-X.

Matt

[1] https://twitter.com/joshfraser/status/1524093111349166080

John McCormac wrote:

There are various ways, such as crawling the web, to enumerate
domain names.

That is not an efficient method.

Not a problem for large companies or botnets. So, only
small legal players suffer from hiding zone information.

For example, large companies such as google can obtain enumerated
list of all the current most active domains in the world, which
can, then, be used to access whois.

What Google might obtain would be a list of domain names with websites. The problem is that the web usage rate varies by TLD: some ccTLDs have a web usage rate of over 40% (that is, over 40% of their domain names have developed websites), but some of the new gTLDs have web usage rates below 10%.

You misunderstand my statement. Domain names not offering
HTTP service can also be collected by web crawling.

Hiding DNS zone information from public is beneficial to powerful
entities such as google.

In some respects, yes.

Google can also use gmail to collect domain names used by
sent or received e-mails.

But there is a problem with that because of all the FUD about websites linking to "bad" websites that had been pushed in the media a few years ago.

Is your concern privacy of "bad" websites?

Another factor that is often missed is the renewal rate of domain names.

That's not a problem related to enumeration of domain names.

A lot of personal data such as e-mail addresses, phone numbers and even postal addresses have been removed from gTLD records because of the fear of GDPR.

As I have been saying, the problem, *if* *any*, is whois. So?

The zones change. New domain names are registered and domain names are deleted. For many TLDs, the old WHOIS model of registrant name, e-mail and phone number no longer exists. And there are also WHOIS privacy services which have obscured ownership.

As I wrote:

: Moreover, because making ownership information of lands and
: domain names publicly available promotes public welfare
: and domain name owners approve publication of such
: information in advance, there shouldn't be any concern
: of privacy breach forbidden by local law of DE.

that is not a healthy movement.

            Masataka Ohta

John McCormac wrote:

There are various ways, such as crawling the web, to enumerate
domain names.

That is not an efficient method.

Not a problem for large companies or botnets. So, only
small legal players suffer from hiding zone information.

Agree on the effects on smaller legal players.

A domain name does not always have to have a website. This means that some domain names may have no presence on the Web unless they are mentioned on a site or in e-mail. With the increased automation of webhosting control panels, undeveloped domain names may be automatically parked on the webhoster's or registrar's holding page.

You misunderstand my statement. Domain names not offering
HTTP service can also be collected by web crawling.

Perhaps if there are lists of new registrations published or the domain names are reregistrations that had been previously deleted. Some might be detected if they have reverse DNS set up for the domain name. DNS traffic could be another source. Other than those cases, I am not sure about web crawling detecting domain names without HTTP service.

Google can also use gmail to collect domain names used by
sent or received e-mails.

Or even Google Analytics but that may have legal issues over privacy.

But there is a problem with that because of all the FUD about websites linking to "bad" websites that had been pushed in the media a few years ago.

Is your concern privacy of "bad" websites?

No. The problem is that search engines and other crawlers that detect new websites by crawling links from others are at a disadvantage, because websites are less likely to link to others due to search engine optimisation. The decline of web directories has also had an effect. It becomes increasingly difficult for newer players without the resources of Google or Microsoft to compete at detecting new websites, typically ccTLD, when they have no inbound links from other websites.

Another factor that is often missed is the renewal rate of domain names.

That's not a problem related to enumeration of domain names.

It is when millions of (gTLD and ccTLD) domain names per month are deleted. Even after a run of enumerating domain names in a zone, some of those domain names will have been deleted before the process is completed. Enumerating domain names is very much a continual process rather than a one-off process. The set of domain names in a zone is rarely a static one. An enumerated zone is a snapshot of that zone at a particular time. It becomes increasingly unreliable.
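
As a rough sketch of what "snapshot" means in practice: comparing two enumerations of the same zone is just a set difference. The names below are invented, and I am assuming each snapshot has already been reduced to a plain list of domain names (a real zone file would need parsing first).

```python
def normalise(names):
    """Lower-case names and drop the trailing dot so snapshots compare cleanly."""
    return {n.strip().rstrip(".").lower() for n in names if n.strip()}


def diff_snapshots(old_names, new_names):
    """Compare two snapshots of a zone, each an iterable of domain names."""
    old, new = normalise(old_names), normalise(new_names)
    return {
        "added": sorted(new - old),     # new registrations
        "deleted": sorted(old - new),   # names that dropped out of the zone
        "retained": len(old & new),     # names present in both snapshots
    }


# Hypothetical snapshots of the same zone taken some time apart.
earlier = ["example.com.", "alpha.com.", "beta.com."]
later = ["example.com.", "beta.com.", "Gamma.com."]

result = diff_snapshots(earlier, later)
print(result)
```

The "deleted" list is the part that makes an old enumeration unreliable, and the "added" list is exactly what the spammers mentioned at the start of the thread were extracting from successive zone files.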

A lot of personal data such as e-mail addresses, phone numbers and even postal addresses have been removed from gTLD records because of the fear of GDPR.

As I have been saying, the problem, *if* *any*, is whois. So?

There are multiple issues. The redaction of WHOIS data has made dealing with fraudulent/malware/phishing sites more difficult. It can also cause problems for registrants who have registered their domain name through a reseller that has disappeared.

The use of WHOIS data from new registrations by spammers to target registrants has declined somewhat since 2018. The redaction of data from the WHOIS is not a one-size-fits-all solution. This is why ICANN is moving towards RDAP and a more controlled access to registrant data.

The zones change. New domain names are registered and domain names are deleted. For many TLDs, the old WHOIS model of registrant name, e-mail and phone number no longer exists. And there are also WHOIS privacy services which have obscured ownership.

As I wrote:

: Moreover, because making ownership information of lands and
: domain names publicly available promotes public welfare
: and domain name owners approve publication of such
: information in advance, there shouldn't be any concern
: of privacy breach forbidden by local law of DE.

that is not a healthy movement.

There has been some discussion about using a Natural Person or Legal Person field in gTLD WHOIS records with the Legal Person (effectively a business or company) having more information published. There are multiple jurisdictions and some have different protections for data. Some registrars and registries allow registrants to publish ownership details but others do not. With gTLDs, there is a central organisation (ICANN). With ccTLDs, each ccTLD registry is almost unique (a few registries also run IDN versions of ccTLDs in addition to their main ccTLD) and subject to the local laws of its country. GDPR has caused a lot of problems inside and outside of the EU.

Regards...jmcc

11.05.22 15:31, Masataka Ohta wrote:

As I wrote:

But some spam actors
deliberately compared zone file editions to single out additions, and
then harass the owners of newly registered domains, both by e-mail and
phone.

If that is a serious concern, stop whois.

There are various ways, such as crawling the web, to enumerate
domain names.

Come on, web is dying! People are moving to mobile applications!
So more and more domains do not need any web site by design.

Max,