I would like to restrict access from certain countries to content on my
network (for security and legal reasons).
So far the best algorithm I've been able to come up with is a combination
of reverse DNS and APNIC/ARIN/RIPE whois queries. I've written a Perl
CGI that checks reverse DNS first and, if there is no ccTLD country code
in the reverse mapping, does a whois query and parses the response for
the address.
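A rough Python sketch of the two-step check described above. The original is a Perl CGI that is not shown here, so this is not that script. The function names and the short CC_TLDS set are assumptions for illustration. The hostname and whois text are passed in as strings, so fetching them (for example with socket.gethostbyaddr and a whois client) is left out.

```python
import re

# Illustrative subset of two-letter country-code TLDs (an assumption);
# a real table would list all of them.
CC_TLDS = {"de", "uk", "fr", "jp", "in", "sg", "au", "nl", "us"}


def country_from_hostname(hostname):
    """Return the upper-cased ccTLD of a reverse-DNS name, or None."""
    if not hostname:
        return None
    tld = hostname.rstrip(".").rsplit(".", 1)[-1].lower()
    return tld.upper() if tld in CC_TLDS else None


def country_from_whois(whois_text):
    """Return the first two-letter 'country:' value in a whois response, or None."""
    match = re.search(r"^country:\s*([A-Za-z]{2})\s*$", whois_text,
                      re.IGNORECASE | re.MULTILINE)
    return match.group(1).upper() if match else None


def guess_country(hostname, whois_text):
    """Prefer the reverse-DNS ccTLD; fall back to the whois country field."""
    return country_from_hostname(hostname) or country_from_whois(whois_text or "")
```

A name under a generic TLD such as .com yields nothing from the first step, so the guess falls through to whatever country the whois record carries. That is exactly the case where the registrant's country and the country of use can differ.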
The problem I have is that the country of the company that owns the IP
block is sometimes not the country the IP block is used in. For example,
take sungold22.de.ibm.com (194.196.100.86). Whois parsing indicates a
country of UK, but from the reverse DNS a person can see that it is
Germany. I've built the pattern of cc.ibm.com into my CGI, but I'm sure
there are other blocks that I'm incorrectly identifying.
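One way the cc.ibm.com special case could be expressed is as a small table of hostname patterns checked before the whois fallback. This is a hedged sketch, not the poster's CGI: the table name, the function name, and the idea of keeping the patterns in a list are all assumptions.

```python
import re

# Hypothetical override table: each regex captures a two-letter label that is
# embedded in hostnames under a domain whose TLD says nothing about location.
EMBEDDED_CC_PATTERNS = [
    re.compile(r"\.([a-z]{2})\.ibm\.com\.?$", re.IGNORECASE),
]


def country_from_embedded_label(hostname):
    """Return the embedded two-letter label (upper-cased) or None."""
    for pattern in EMBEDDED_CC_PATTERNS:
        match = pattern.search(hostname)
        if match:
            return match.group(1).upper()
    return None
```

For sungold22.de.ibm.com this returns "DE". The pattern accepts any two-letter label, so a real table would want to check the captured label against a list of valid country codes, and every additional company needs its own hand-written entry.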
I've looked at RADB entries, as well as origin AS for various IP blocks,
and neither source looks any better than whois.
Is there a more accurate method to determine the country of origin for an
IP than the methods I've described above?
I don't know where they get their data from, how accurate it is, or what
it costs, but I thought I'd mention that there is at least a way to make
the problem someone else's by the simple application of money.
> Is there a more accurate method to determine the country of origin for an
> IP than the methods I've described above?
Physical geography and DNS do not match. Some of the most popular web sites
in India under the .in domain are physically in the US and owned by US
companies. Having a web site under the .in domain is a means to reach a
market.
Physical geography and IP addresses do not match. Once the RIR allocates to
the LIR, the LIR can sub-allocate anywhere. So an LIR (ISP) in Singapore with
a regional business could allocate their address block to customers in
Singapore, Hong Kong, China, India, and any other place where they offer
services.
DNS LOC records might be helpful. But, as noted in one CAIDA paper ...
"Both the whois-based and hostname-based mapping rely on the assumption that
educated guesses are required in the absence of explicit location
information. While RFC 1876 [RFC1876] did define a DNS extension to provide
a LOC resource record type that allows administrators to associate latitude
and longitude information with entries, it turns out to be sub-optimally
useful. First, the RFC specifies only the format and interpretation of the
new field, without establishing where or at what granularity to use it.
Because of this, finding the appropriate LOC resource record may require
multiple DNS queries. More importantly, people just do not use it. NetGeo
currently does not use DNS LOC queries by default because their low success
rate does not justify the expense of the three or more DNS lookups
typically needed to rule out the existence of a valid DNS LOC record."
---> http://www.caida.org/outreach/papers/2000/inet_netgeo/inet_netgeo.html#dnsloc
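For the few names that do publish a LOC record, the text form defined in RFC 1876 is latitude, longitude and altitude, with optional size and precision fields after that. Below is a minimal sketch that converts that text into decimal degrees. The function name is an assumption, and fetching the record needs a resolver library that is not shown.

```python
def parse_loc_rdata(rdata):
    """Convert LOC presentation text to (latitude, longitude, altitude_m).

    Expected shape: d1 [m1 [s1]] N|S d2 [m2 [s2]] E|W alt[m] [...]
    The optional size/precision fields after the altitude are ignored.
    """
    tokens = rdata.replace("(", " ").replace(")", " ").split()

    def take_angle(pos, hemispheres):
        # Read 1 to 3 numbers (degrees, minutes, seconds) up to a hemisphere letter.
        parts = []
        while tokens[pos].upper() not in hemispheres:
            parts.append(float(tokens[pos]))
            pos += 1
        if not 1 <= len(parts) <= 3:
            raise ValueError("expected degrees [minutes [seconds]]")
        parts += [0.0] * (3 - len(parts))
        value = parts[0] + parts[1] / 60 + parts[2] / 3600
        sign = -1 if tokens[pos].upper() in ("S", "W") else 1
        return sign * value, pos + 1

    lat, pos = take_angle(0, ("N", "S"))
    lon, pos = take_angle(pos, ("E", "W"))
    altitude = float(tokens[pos].rstrip("mM"))
    return lat, lon, altitude
```

Given "42 21 54 N 71 06 18 W -24m 30m" this yields roughly (42.365, -71.105, -24.0). Coordinates still have to be mapped to a country afterwards, and as the quoted paper says, most names have no LOC record at all.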
CAIDA has worked on tools like NetGeo (now sold by Ixia):
http://www.caida.org/tools/utilities/netgeo/
It might be worth checking out, along with the other Internet mapping
projects.
I am not aware of any commercial service that has /32s in its databases.
Nor am I aware of any company that has the data offering a 'look up the
location' service. The data is incorporated into the other services that
they provide and is used for internal purposes.
>
> > > > Is there a more accurate method to determine the country of origin for an
> > > > IP than the methods I've described above?
> >
> > Yes, at least three companies have databases of pretty much all /24s and
> > above mapped up to a zip code.
>
> So far I've been referred to 3 commercial services, and all (including
> NetGeo/Ixia) fail on the example I gave (194.196.100.86).
The Akamai EdgeScape service is correct for 194.196.100.86.
Maybe I missed those posts, sorry.
> I am not aware of any commercial service that has /32s in its databases.
> Nor am I aware of any company that has the data offering a 'look up the
> location' service. The data is incorporated into the other services that
> they provide and is used for internal purposes.
I'm not sure how far Akamai goes in its database. I do know for a fact that
there are entries more specific than /24s in its database.
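A database that holds both /24s and more specific entries is normally queried by longest-prefix match, so the most specific covering entry wins. A minimal sketch using Python's ipaddress module follows. The prefixes are documentation ranges, the location labels are made up, and nothing here reflects how any vendor actually stores its data.

```python
import ipaddress


def build_table(rows):
    """Turn {prefix string: location} into a list of (network, location)."""
    return [(ipaddress.ip_network(prefix), loc) for prefix, loc in rows.items()]


def lookup(table, address):
    """Return the location of the most specific prefix covering address."""
    ip = ipaddress.ip_address(address)
    best = None
    for network, location in table:
        if ip in network and (best is None or network.prefixlen > best[0].prefixlen):
            best = (network, location)
    return best[1] if best else None


# Hypothetical rows for illustration only.
TABLE = build_table({
    "198.51.100.0/24": "site-A",
    "198.51.100.64/28": "site-B",
})
```

Here lookup(TABLE, "198.51.100.70") returns "site-B" and lookup(TABLE, "198.51.100.200") returns "site-A". The linear scan is fine for a sketch; a real table of this kind would use a trie or a sorted index.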
> > Is there a more accurate method to determine the country of origin for an
> > IP than the methods I've described above?
> Yes, at least three companies have databases of pretty much all /24s and
> above mapped up to a zip code.
These DBs are a joke. I have /19's that are SWIPed to the billing
office but used in remote POPs. No-one is ever gonna figure out where
they really are.
Except for the IPs I set RFC 1876 LOC records on.
I see load-balancing by geo-code do way more harm than good.
> Just because free public dbs don't have that info does not
> mean that it does not exist.
i guess the question is, "how to ascertain the accuracy of the data?"
if you have a collection of n known address to location mappings,
evenly distributed over the address space, you'd want to approach one
of the private db vendors and say, "do lookups on these n addresses
and tell me the answers." if there's a good correlation between
the known data and the answers, then it might make sense to purchase
data [*] from those people.
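The check described above can be sketched as a simple agreement rate between the hand-built control set and the vendor's answers. The function name and the dict-based inputs are assumptions, and the addresses in the usage note are documentation addresses with made-up locations.

```python
def agreement_rate(known, reported):
    """Fraction of control addresses where the vendor's answer matches.

    known:    {ip: true location}, the control set built by hand
    reported: {ip: vendor's location}; a missing answer counts as a miss
    """
    if not known:
        raise ValueError("control set is empty")
    hits = sum(1 for ip, loc in known.items() if reported.get(ip) == loc)
    return hits / len(known)
```

For example, with four control addresses where the vendor gets two right, one wrong, and has no answer for the fourth, the rate is 0.5.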
but.
it is probably necessary to construct the set of control data by hand,
which might be a big job.
what is a sufficiently large n?
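One textbook way to put a number on n is the normal-approximation sample size for estimating a proportion, n = z^2 * p * (1 - p) / e^2. That assumes the control addresses are a simple random sample of the space of interest, which a hand-built set may well not be. The function name and defaults below are assumptions.

```python
import math


def sample_size(margin, p=0.5, z=1.96):
    """Control addresses needed to estimate an accuracy rate within +/- margin.

    p = 0.5 is the worst case (largest n); z = 1.96 corresponds to roughly
    95% confidence under the normal approximation.
    """
    return math.ceil(z * z * p * (1 - p) / (margin * margin))
```

Under those assumptions, sample_size(0.05) gives 385 and sample_size(0.03) gives 1068. That says nothing about whether a vendor would agree to run that many lookups.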
for n sufficiently large, are the vendors likely to answer the
question?
i suspect that, in real life, it will come down to trusting the
vendors' assertion that their data is accurate...
-w
[*] purchase data!?!? doesn't information want to be free? or is that
passé? oh well...
> > Yes, at least three companies have databases of pretty much all /24s and
> > above mapped up to a zip code.
>
> These DBs are a joke. I have /19's that are SWIPed to the billing
> office but used in remote POPs. No-one is ever gonna figure out where
> they really are.
Wrong answer.
Just because free public dbs don't have that info does not mean that it does
not exist.
Say I have about 10 /16's reachable through firewalls in SJC, RDU, SYD, and AMS.
No traceroutes or pings can make it past these firewalls, nor do the hostnames
indicate any particular location. How exactly do you plan on mapping these to a
zip code, when I can tell you those addresses are fairly randomly spread, in /24
increments, to sites all over the world?
The neat thing about selling databases like that is nobody can ever prove how
incredibly inaccurate they are. Just come up with a reasonable-sounding
collection methodology and claim any counterexamples are just flukes, then
collect money from the saps who believe you...
> > Wrong answer.
> >
> > Just because free public dbs don't have that info does not mean that it does
> > not exist.
>
> Say I have about 10 /16's reachable through firewalls in SJC, RDU, SYD, and AMS.
> No traceroutes or pings can make it past these firewalls, nor do the hostnames
> indicate any particular location. How exactly do you plan on mapping these to a
> zip code, when I can tell you those addresses are fairly randomly spread, in /24
> increments, to sites all over the world?
It is very easy. Anyone would care about it only when users from those
addresses interact with whatever software ends up creating those
databases. If those users never buy stuff from Amazon.com, Amazon.com does
not care where they are. But the moment they do, somewhere someone is
crunching the data that says "Of 10 sites that I saw this IP address access
and provide clearing for a credit card transaction, 9 ended up being
within a 3-mile radius of ZZZZ. Let's put a tag on that."
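The "9 out of 10 within 3 miles" rule could look something like the sketch below. It is a guess at the general idea, not a description of any company's system. The function names, the 0.9 share threshold, and the choice of an observed point as the cluster centre are all assumptions.

```python
import math

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius


def distance_miles(a, b):
    """Great-circle (haversine) distance between two (lat, lon) points in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (*a, *b))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_MILES * math.asin(min(1.0, math.sqrt(h)))


def tag_location(observations, radius_miles=3.0, min_share=0.9):
    """Return an observed point with at least min_share of all observations
    within radius_miles of it, or None if no such point exists."""
    best_point, best_count = None, 0
    for candidate in observations:
        count = sum(1 for p in observations
                    if distance_miles(candidate, p) <= radius_miles)
        if count > best_count:
            best_point, best_count = candidate, count
    if observations and best_count / len(observations) >= min_share:
        return best_point
    return None
```

With nine observations at one spot and one on another continent, the nine-point spot is returned. With the observations split evenly between two distant spots, the function returns None and the address stays untagged.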
> The neat thing about selling databases like that is nobody can ever prove how
> incredibly inaccurate they are. Just come up with a reasonable-sounding
> collection methodology and claim any counterexamples are just flukes, then
> collect money from the saps who believe you...
The really neat thing about talking to computer geeks is that they all
operate with lots of absolutes. They will explain to you why in a
specific case it does not work and forget that those specific cases are
usually exceptions.
Alex
P.S. So, ever bought stuff from Amazon from one of those IP addresses and
sent it to some non-related location *just* to confuse the mapping
systems?
Again, the majority of companies that have that data will not provide it to
you for free. In the case of someone like Amazon, they probably won't measure
mileage. Rather, they would flag transactions that make no geographic sense
and pull them for separate processing.