They seem to do something a little unusual where every DNS request provides a different IP out of a small pool with those IPs not changing very frequently. (I’m talking specifically about S3 not Route5x or whatever the DNS product is).
Basically like round robin, but instead of providing all of the IPs they are only offering one. This eliminates options for the client DNS resolvers, but may make some things more deterministic.
Is this a “normal” or expected solution or just some local hackery?
Thanks in advance,
DJ
Route53.
Not sure what you mean by "S3 DNS". I wasn't aware S3 had any DNS
functionality at all... on the other hand, there is much indeed that I
do not know.
Regards, K.
The IP addresses for S3 do not change very often, and are region specific (as you would expect).
You are correct that this can cause problems for clients that never re-resolve (e.g. Java with networkaddress.cache.ttl=-1, which caches lookups forever).
You may be interested in the (periodically updated) list of AWS IP ranges, which is available through their IP ranges JSON API. Refer to:
* https://ip-ranges.amazonaws.com/ip-ranges.json
* AWS IP address ranges - AWS General Reference
To get all S3 IP ranges currently in use:
"""
curl -sf 'https://ip-ranges.amazonaws.com/ip-ranges.json' \
  | jq '.prefixes | map(select(.service == "S3"))'
"""
To get all S3 IP ranges in your region:
"""
curl -sf 'https://ip-ranges.amazonaws.com/ip-ranges.json' \
  | jq '.prefixes | map(select(.service == "S3" and .region == "eu-central-1"))'
"""
These ranges are not (to my knowledge) queryable via DNS.
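If you want to check whether an address handed out by DNS falls inside one of the published prefixes, something along these lines might do. This is only a sketch: the function names ip_to_int and ip_in_cidr are made up for illustration, it handles IPv4 only, and the prefix in the usage line is a placeholder rather than something taken from ip-ranges.json.

```shell
# Convert a dotted-quad IPv4 address to a single integer.
ip_to_int() {
  local IFS=.
  # Split the address on dots into $1..$4.
  set -- $1
  echo $(( ($1 << 24) + ($2 << 16) + ($3 << 8) + $4 ))
}

# Succeed (exit status 0) if IPv4 address $1 lies inside CIDR prefix $2.
ip_in_cidr() {
  local ip net bits mask
  ip=$(ip_to_int "$1")
  net=$(ip_to_int "${2%/*}")   # part before the slash
  bits=${2#*/}                 # prefix length after the slash
  # 4294967295 is 2^32 - 1; keep only the low 32 bits of the shifted mask.
  mask=$(( (4294967295 << (32 - bits)) & 4294967295 ))
  [ $(( ip & mask )) -eq $(( net & mask )) ]
}
```

Usage, with a placeholder prefix: `ip_in_cidr 52.95.134.145 52.95.128.0/21 && echo inside`. In practice you would loop over the ip_prefix values that the jq queries above return.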
In terms of this as a general behaviour, it is not uncommon. If I remember correctly this is how Route53 weighted records are implemented. So at least anyone using that feature of Route53 would be doing the same.
Kind regards,
Toby Lorne
Maybe Deepak means:
“When I ask for an S3 endpoint I get 1 answer, which is 1 of a set of N. Why would
the ‘loadbalancer’ send me all N?”
(I don’t know an AWS S3 URL to test this out with; an example from Deepak would be handy.)
Hello,
Is this a “normal” or expected solution or just some local hackery?
It's absolutely normal and expected for a huge service like this to
keep round robin at the DNS server side. YMMV with client side DNS
based round robin (Amazon needs to be in control, not your client
application) and steering traffic from one edge location or host to
another is perfectly legitimate. Also likely as a service provider of
such a huge service you want to keep breaking connections from
applications with clearly hardcoded (or "resolve at startup only") IP
addresses, so that client applications never use this approach (in the
long term at least). After all, as a service provider you want to
avoid hitting the news cycle for a legitimate DNS change, just because
you are not doing it very often and that change triggered a myriad of
outages because of broken customer applications at the same time. So
they just do it often or all the time.
Amazon needs to stay in control of which edge nodes and locations the
clients are hitting, just like CDNs and other endpoints with major
traffic volumes.
None of this is local hackery, it's just basic DNS.
Lukas
I've just taken a squiz at an S3-based website we have, and via the S3
URL it is a CNAME with a 60-second TTL pointing at a set of A records
with 5-second TTLs.
Any one dig returns the CNAME and a single IP address:
dig our-domain.s3-website-ap-southeast-2.amazonaws.com.
our-domain.s3-website-ap-southeast-2.amazonaws.com. 14 IN CNAME s3-website-ap-southeast-2.amazonaws.com.
s3-website-ap-southeast-2.amazonaws.com. 5 IN A 52.95.134.145
If the query is repeated, the returned IP address changes roughly
every five seconds.
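For anyone wanting to watch the TTLs tick down themselves, a small sketch that pulls record type, TTL, owner name and data out of dig's answer section. The function name answer_ttls is made up for illustration, and it assumes dig is run with +noall +answer so that only answer records are printed.

```shell
# Read dig answer-section lines (name TTL class type rdata) on stdin and
# print "type ttl name rdata" for A, AAAA and CNAME records.
answer_ttls() {
  awk '
    /^;/ || NF < 5 { next }   # skip comment lines and anything too short
    $3 == "IN" && ($4 == "A" || $4 == "AAAA" || $4 == "CNAME") {
      print $4, $2, $1, $5
    }
  '
}
```

Usage: `dig +noall +answer our-domain.s3-website-ap-southeast-2.amazonaws.com. | answer_ttls`. Wrapping that in a loop with `sleep 1` should show the A record's TTL counting down and the address changing when it expires.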
What's interesting is the name attached to the A records, which does
not include "our-domain". It seems to be a record pointing to ALL S3
websites in the region. And all of the addresses I saw reverse-resolve
to that one name. So there is definitely some under-the-bonnet magic
discrimination going on.
In Route53 the picture is very different, with the published website
host name (think "our-domain.com.au") resolving to four IP addresses
that are all returned in the response to a single dig query. There is
an A-ALIAS (a non-standard AWS record type) that points to a CloudFront
distribution that has the relevant S3 bucket as its origin.
Using the CNAME bypasses the CloudFront distribution unless steps are
taken to forbid direct access to the bucket. It would be usual to use
(and enforce) access via CloudFront, if for no other reason than to
provide for HTTPS access.
Regards, K.
Hello,
AWS is doing Geo-based load balancing and spitting things out,
and networks with eyeballs are doing their own things for traffic
management and trying to do shortest paths to things – and responsible
operators want to minimize the non-desirable and non-deterministic
behaviors.
You can't use DNS to get "all" service IP's of a service like S3 or a
CDN for traffic engineering purposes. That will not work, ever (for
services of such scale).
The hackery is assuming you can build a list of service IP's by querying DNS.
There are a lot of reasons why someone may want this… particularly
to manage *other* people geo-basing their transport, but is this a
local hack or is this a feature of one of the major auth-DNS packages?
If it's local hackery, trying to manage for it becomes a thankless activity.
CDNs and huge services work like this, and they use the standardized
tools like DNS that they have at their disposal.
Building lists of service IP's from DNS is what the "local-hackery" here is.
Toby explained the proper way to get the IP ranges. It's not via DNS,
it never was.
Lukas
Hi Deepak.
Amazon documents the IPs for their public and private cloud services: https://docs.aws.amazon.com/general/latest/gr/aws-ip-ranges.html
(I know this because Batfish uses these in its reachability analysis, for example, “Make sure all outgoing flows to S3 are permitted by the firewall”.)
Thanks,
Dan
Maybe Deepak means:
“When I ask for an S3 endpoint I get 1 answer, which is 1 of a set of N. Why would
the ‘loadbalancer’ send me all N?”
(I don’t know an AWS S3 URL to test this out with; an example from Deepak would be handy.)
also, just for grins:
$ while true; do dig +short s3.amazonaws.com @ns-63.awsdns-07.com. >> /tmp/aws; sleep 1; done
after a time:
$ wc -l /tmp/aws
17787 /tmp/aws
and:
$ sort -n /tmp/aws | uniq -c | sort -rn | wc -l
6457
Some of the results appear ~11 times; most probably appear only once.
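Rather than eyeballing the uniq -c output, the spread can be summarised as "how many addresses were seen N times". A rough sketch follows; the function name count_histogram is made up, and it assumes one address per line, which is what dig +short was producing here.

```shell
# Read one address per line on stdin and print "occurrences addresses":
# for each occurrence count, the number of distinct addresses seen that
# many times, in ascending order of occurrence count.
count_histogram() {
  sort | uniq -c |            # per-address occurrence counts
    awk '{ print $1 }' |      # keep only the count column
    sort -n | uniq -c |       # tally how often each count occurs
    awk '{ print $2, $1 }'    # reorder to "occurrences addresses"
}
```

Usage: `count_histogram < /tmp/aws`. The first output line should be the count for addresses seen once, and the last line shows the most-repeated addresses.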