RIPE Database Proxy Service Issues

Axel_Pawlik · January 2, 2013, 4:00pm

[Apologies for duplicate emails]

Dear colleagues,

There has been discussion on various mailing lists regarding the status of the RIPE Database Proxy Service.

Before I address the issues that arose, I'd like to give you some background information on the service itself that may help with the discussions.

Technical Background

Rodney_Joffe1 · January 2, 2013, 10:49pm

Hello Axel,

[Apologies for duplicate emails]

Dear colleagues,

There has been discussion on various mailing lists regarding the status of the RIPE Database Proxy Service.

We do apologise, however, that the changes regarding the proxy service were not more explicitly communicated to the members and the RIPE community in advance of the final publication of the Activity Plan.

Not being members, we obviously were not privy to these discussions or decisions. Not your fault, of course, just a reality.

The RIPE NCC asks that non-RIPE NCC member proxy service users become members but we propose to waive their membership fee until the discussion of the RIPE NCC Charging Scheme 2014 takes place. This will give the membership and community the opportunity to discuss the best way forward for the proxy service in the coming months while ensuring a strong contractual bond between the RIPE NCC and users of this service.

In the meantime, there will be no changes to the proxy service and no loss of functionality for the community.

I appreciate the decision and accommodation… And I am sure the community appreciates it. As users have no doubt realized, the proxy data continued to be available after Dec 31. We were waiting to see what the "DENIED" output looked like before we implemented our changes, so there was no impact. This too is appreciated.

And thank you to the many community and RIPE members who offered and provided assistance and support.

Thank you.

Rodney Joffe
CenterGate Research/GeekTools

Warren_Bailey1 · January 2, 2013, 10:56pm

This looks to be a happy ending. I thought we were going to get to see a fight.

Rich_Kulawiec · January 3, 2013, 1:48pm

1. The technical measures you've outlined will not prevent, and have
not prevented, anyone from automatically harvesting the entire thing.
Anyone who owns or rents, for example, a 2M-member botnet, could easily
retrieve the entire database using 1 query per IP address, spread out
over a day/week/month/whatever. (Obviously more sophisticated approaches
immediately suggest themselves.)

Of course, a simpler approach might be to buy a copy from someone who
already has one.

I'm not picking on you, particularly: all WHOIS operators need to stop
pretending that they can protect their public databases via rate-limiting.
They can't. The only thing that they're doing is preventing NON-abusers
from acquiring and using bulk data.
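
To make the arithmetic in point 1 concrete, here is a rough back-of-the-envelope
sketch. The record count and the per-client limit are illustrative assumptions
only, not RIPE NCC figures; the 2M client count is the botnet size mentioned
above, and the model assumes one record per query.

```python
import math


def days_to_harvest(total_records: int, clients: int,
                    queries_per_client_per_day: int) -> int:
    """Whole days needed to fetch every record once, one record per query."""
    per_day = clients * queries_per_client_per_day
    return math.ceil(total_records / per_day)


# Hypothetical numbers, chosen only to show the shape of the problem.
TOTAL_RECORDS = 8_000_000

# One well-behaved client held to an assumed limit of 1,000 queries per day.
single_client = days_to_harvest(TOTAL_RECORDS, clients=1,
                                queries_per_client_per_day=1_000)

# 2M distinct source addresses, each sending a single query per day.
botnet = days_to_harvest(TOTAL_RECORDS, clients=2_000_000,
                         queries_per_client_per_day=1)

print(f"single client: {single_client} days, botnet: {botnet} days")
```

Under these assumptions the rate limit costs the single legitimate client
decades, while the distributed harvester finishes in days without any one
address exceeding one query per day.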

2. This presumes that the database is actually a target for abusers.
I'm sure for some it is. But as a source, for example, of email
addresses, it's a poor one: the number of addresses per thousand records
is relatively small and those addresses tend to belong to people with
clue, making them rather suboptimal choices for spamming/phishing/etc.

Far richer targets are available on a daily basis simply by following
the dataloss mailing list et al. and observing what's been posted on
pastebin or equivalent. These not only include many more email addresses,
but often names, passwords (encrypted or not), and other personal details.
And once again, the simpler approach of purchasing data is available.

3. Of course answering all those queries no doubt imposes significant
load. Happily, one of the problems that we seem to have pretty much
figured out how to solve is "serving up many copies of static
content" because we have tools like web servers and rsync.

So let me suggest that one way to make this much easier on yourselves is
to export a (timestamped) static snapshot of the entire database once
a day, and let the rest of the Internet mirror the hell out of it.
Spreads out the load, drops the pretense that rate-limiting
accomplishes anything useful, makes all the data available to everyone
equally, and as long as everyone is aware that it's a snapshot and not
a real-time answer, would probably suffice for most uses. (It would
also come in handy during network events which render your service
unreachable/unusable in whole or part, e.g., from certain parts of
the world. Slightly-stale data is way better than no data.)
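
A minimal sketch of what such a daily export could look like, assuming the
records are already available as text blocks. The function name, the file
naming scheme, and the checksum sidecar are all hypothetical choices made for
illustration, not anything the RIPE NCC actually runs. The output directory
would then be served by a web server or rsync daemon.

```python
import gzip
import hashlib
from datetime import datetime, timezone
from pathlib import Path
from typing import Iterable


def write_snapshot(records: Iterable[str], out_dir: Path, now: datetime) -> Path:
    """Write records to a gzip file named by UTC timestamp, plus a .sha256 file."""
    out_dir.mkdir(parents=True, exist_ok=True)
    stamp = now.astimezone(timezone.utc).strftime("%Y%m%dT%H%M%SZ")
    target = out_dir / f"snapshot-{stamp}.txt.gz"

    # Write under a temporary name first so mirrors never see a partial file.
    tmp = target.with_name(target.name + ".tmp")
    with gzip.open(tmp, "wt", encoding="utf-8") as fh:
        for record in records:
            # One blank line between records (hypothetical layout).
            fh.write(record.rstrip("\n") + "\n\n")
    tmp.replace(target)

    # Publish a checksum so mirrors can verify what they fetched.
    digest = hashlib.sha256(target.read_bytes()).hexdigest()
    checksum = target.with_name(target.name + ".sha256")
    checksum.write_text(f"{digest}  {target.name}\n", encoding="utf-8")
    return target
```

The timestamp in the file name is what tells consumers they are looking at a
snapshot rather than a real-time answer, and the rename step keeps a mirror
that syncs mid-export from picking up a truncated file.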

The same thing should be done with all domain WHOIS data, too, BTW.
The spammers/phishers/etc. have been getting copies of it for a very
long time, whether by mass harvesting, exploiting security holes, paying
off insiders, or other means, so it's security theater to pretend
that limiting query rates has any value.

---rsk