As I wrote:
But some spam actors
deliberately compared zone file editions to single out additions, and
then harass the owners of newly registered domains, both by e-mail and
If that is a serious concern, stop whois.
There are various ways, such as crawling the web, to enumerate
That is not an efficient method. The gTLD zones are available, on approval from the registries, via ICANN's Centralized Zone Data Service (CZDS). Many of the ccTLDs do not provide such access.
For example, large companies such as google can obtain enumerated
list of all the current most active domains in the world, which
can, then, be used to access whois.
What Google might obtain would be a list of domain names with websites. The problem is that the web usage rate (the percentage of domain names with developed websites) varies by TLD. Some ccTLDs have web usage rates of over 40%, but some of the new gTLDs have web usage rates below 10%. A list built from websites would therefore miss most of the registered domain names in those TLDs.
Hiding DNS zone information from public is beneficial to powerful
entities such as google.
In some respects, yes. But there is a complication. The FUD about websites linking to "bad" websites that was pushed in the media a few years ago had the effect of websites no longer linking to others. Any search engine that detected new websites by following links was put at a severe disadvantage. There are other methods of detecting new websites, but for those without Google's resources it is a lot more difficult without access to the zone files.
Another factor that is often missed is the renewal rate of domain names. Some of the ccTLDs have very strong renewal rates of over 70%, and .COM typically has a blended renewal rate of over 70%. A blended renewal rate is the renewal rate for all registrations (first year and multi-year). The first year renewal rate varies considerably across ccTLDs and gTLDs and from month to month. Some of the new gTLDs can see upwards of 80% of the zone deleted within a year. With heavily discounted new registrations, it is not unusual to see over 90% of those discounted registrations deleted when they come up for renewal.
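To make the difference between a first year renewal rate and a blended renewal rate concrete, here is a minimal Python sketch. The function names and all of the figures are hypothetical and chosen only for illustration; they are not measurements from any registry.

```python
def renewal_rate(renewed: int, expiring: int) -> float:
    """Fraction of registrations up for renewal that were actually renewed."""
    if expiring == 0:
        raise ValueError("no registrations came up for renewal")
    return renewed / expiring


def blended_renewal_rate(cohorts: dict[str, tuple[int, int]]) -> float:
    """Renewal rate over all cohorts combined.

    cohorts maps a label to (renewed, expiring) counts.
    """
    renewed = sum(r for r, _ in cohorts.values())
    expiring = sum(e for _, e in cohorts.values())
    return renewal_rate(renewed, expiring)


# Hypothetical counts, for illustration only.
cohorts = {
    "first_year": (4_000, 10_000),   # 40% of first year registrations renewed
    "multi_year": (27_000, 30_000),  # 90% of older registrations renewed
}

first_year = renewal_rate(*cohorts["first_year"])
blended = blended_renewal_rate(cohorts)
print(f"first year: {first_year:.1%}, blended: {blended:.1%}")
```

With these made-up counts the blended figure looks healthy even though most first year registrations were dropped, which is why the two rates need to be read separately.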
Access to WHOIS servers and the records that they return has changed considerably since the infliction of GDPR in May 2018. Some gTLDs are moving to RDAP and will throttle requests by IP address or limit them to a maximum number of queries per minute. A lot of personal data such as e-mail addresses, phone numbers and even postal addresses has been removed from gTLD records because of the fear of GDPR. Some of the European ccTLD registries simply don't return any personal information for a WHOIS request. (DENIC is a good example.) Others allow registrants to opt out of public WHOIS.
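For anyone running bulk lookups against a server that enforces a per-minute limit, a client has to pace itself. The following is a rough Python sketch of a sliding-window limiter. The class name and the limit used in the usage note are hypothetical, and real servers publish (or do not publish) their own limits.

```python
import time
from collections import deque


class PerMinuteLimiter:
    """Blocks so that at most max_per_minute calls pass in any 60 second window."""

    def __init__(self, max_per_minute, clock=time.monotonic, sleep=time.sleep):
        if max_per_minute < 1:
            raise ValueError("max_per_minute must be at least 1")
        self.max_per_minute = max_per_minute
        self.window = 60.0
        self.clock = clock    # injectable so the limiter can be tested
        self.sleep = sleep
        self.sent = deque()   # timestamps of queries still inside the window

    def wait(self):
        """Return once it is safe to send the next query."""
        while True:
            now = self.clock()
            # Forget queries that have aged out of the window.
            while self.sent and now - self.sent[0] >= self.window:
                self.sent.popleft()
            if len(self.sent) < self.max_per_minute:
                self.sent.append(now)
                return
            # Window is full: sleep until the oldest query expires.
            self.sleep(self.window - (now - self.sent[0]))
```

A caller would create one limiter per server, for example `limiter = PerMinuteLimiter(30)` for an assumed limit of 30 queries per minute, and call `limiter.wait()` before each WHOIS or RDAP request.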
One of the largest "registrations as a service" gTLD registrars (it helps resellers resell gTLD domain names without them having to become ICANN-accredited registrars) unilaterally decided to go dark on WHOIS data a few years ago by removing registrant data.
The zones change. New domain names are registered and domain names are deleted. For many TLDs, the old WHOIS model of registrant name, e-mail and phone number no longer exists. And there are also WHOIS privacy services which have obscured ownership.
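Tracking those changes comes down to comparing two editions of a zone file. Here is a minimal Python sketch, assuming one record per line with the owner name written out on every line; the function names and the sample records are made up for illustration, and a production parser would need to handle more of the zone file syntax.

```python
CLASSES = {"IN", "CH", "HS"}


def delegated_names(zone_text: str) -> set[str]:
    """Collect the owner names of NS records in a zone file snapshot."""
    names = set()
    for line in zone_text.splitlines():
        fields = line.split(";", 1)[0].split()  # drop comments, then tokenize
        if len(fields) < 3:
            continue
        rest = [f.upper() for f in fields[1:]]
        # Skip the optional TTL and class to reach the record type.
        while rest and (rest[0].isdigit() or rest[0] in CLASSES):
            rest.pop(0)
        if len(rest) >= 2 and rest[0] == "NS":
            names.add(fields[0].lower().rstrip("."))
    return names


def zone_changes(old_text: str, new_text: str) -> tuple[list[str], list[str]]:
    """Return (added, deleted) domain names between two snapshots."""
    old, new = delegated_names(old_text), delegated_names(new_text)
    return sorted(new - old), sorted(old - new)


# Made-up snapshots, for illustration only.
old_zone = """
example.com.    172800 IN NS ns1.example.net.
example.com.    172800 IN NS ns2.example.net.
old-name.com.   172800 IN NS ns1.example.org.
"""
new_zone = """
example.com.    172800 IN NS ns1.example.net.
fresh-name.com. 172800 IN NS ns1.example.org.
"""

added, deleted = zone_changes(old_zone, new_zone)
print("added:", added)
print("deleted:", deleted)
```

Only names with NS records appear in a zone file, so a registered domain name without nameservers will not show up in a comparison like this.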