I'm attempting to find out information on the SEO implications of testing ipv6 out.
A couple of concerns that come to mind are:
1) www.domain.com and ipv6.domain.com are serving the exact same content.
Typical SEO standards are to only serve good content from a single domain so information isn't watered down and so that the larger search engines won't penalize. So a big concern is having search results take a hit because content is duplicated through two different domains, even though one domain is ipv4 only and the other is ipv6 only.
2) Not running ipv6 natively, or using 6to4.
This (potentially) increases hop count and will put content on a slower GRE tunnel and add some additional time for page load times.
3) ??? Any others that I haven't thought of ???
So basically I'd love to set up some sites for ipv6.domain.com via 6to4 as a phase one, and at some point in the near future implement ipv6 natively inside the datacenter, but I'm somewhat concerned about damaging SEO reputation in the process.
Thoughts?
-wil
If you are so concerned about SEO, just dual-stack your site. It works
well for me.
William
If you're worried about SEO, go with native IPv6 and then deploy AAAAs for WWW.domain.foo.
It's been working just fine for www.he.net for years.
Owen
Why is native IPv6 needed? I'd have thought a tunnel would be fine, too.
Regards, K.
I believe the concern is that the higher latency of a tunnel would impact SEO rankings.
So why does
www A 127.0.0.1
www AAAA ::1
Preclude a tunnel? I can't get native here, so my IPv6 is tunneled through HE (thanks, HE), but that doesn't change the dual DNS entries.
(Note used loopback as an example)
Tom
True, but you live with what you can get access to.
Tom
So far the consensus is to run dual stack natively.
While this definitely is the way things should be set up in the end, I can see some valid reasons to run ipv4 and ipv6 on separate domains for a while before final configuration. For example, if I'm in an area with poor ipv6 connectivity I'd like to be given the option of explicitly going to an ipv4 site vs the ipv6 version.
I'd also like to not damage SEO in the process though. 
-wil
I would be getting ipv6 connectivity, adding an unknown AAAA record
such as ipv6 or www6; but not www, and do as many comparative ipv4 vs
ipv6 traceroutes from as many route servers as possible. Then you will
have the data you need to actually make an informed decision rather
than just guessing how it will behave. Remove the temp record and add
a real quad for www only if you liked what you saw.
I assume the name servers are also available over ipv6 including glue?
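As a rough complement to the traceroute comparison above (not a replacement for it), here is a minimal Python sketch that times a TCP connect to the same host over IPv4 and over IPv6. The function name `connect_times` and the example hostname `www6.example.com` are made up for illustration. It measures handshake time only, not the hop-by-hop path, and a single sample from one vantage point says little on its own.

```python
import socket
import time


def connect_times(host, port=80, timeout=5.0):
    """Return TCP connect time in seconds per address family, or None if unavailable."""
    results = {}
    for label, family in (("IPv4", socket.AF_INET), ("IPv6", socket.AF_INET6)):
        try:
            # Resolve the name for this address family only.
            infos = socket.getaddrinfo(host, port, family, socket.SOCK_STREAM)
        except socket.gaierror:
            results[label] = None  # no address of this family for the name
            continue
        sockaddr = infos[0][4]
        start = time.perf_counter()
        try:
            with socket.socket(family, socket.SOCK_STREAM) as s:
                s.settimeout(timeout)
                s.connect(sockaddr)
            results[label] = time.perf_counter() - start
        except OSError:
            results[label] = None  # unreachable, refused, or timed out
    return results


if __name__ == "__main__":
    # Hypothetical test name; substitute the temporary record you published.
    print(connect_times("www6.example.com"))
```

Running it repeatedly, from several locations, gives numbers you can put next to the traceroute output.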
Why do you even need a AAAA record to do that? Just do a traceroute to the v6 address. The temporary AAAA record seems to do nothing useful in your proposed procedure.
Easiest hack to test site usability: Modify your hosts file. Don't even publish the record in DNS until you're ready. Then there's no SEO implications. 
If you're going to expose the site via a separate hostname (v6.bobdole.com), create a v6.robots.txt file that tells Google not to index v6.bobdole.com. Use an .htaccess rule to rewrite requests for robots.txt based on the host header, so v4 requests get the v4.robots.txt, and v6 requests get the v6.robots.txt, which tells Google not to index things.
Nathan
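For anyone not running Apache, the same idea (choose the robots.txt body from the Host header) can be sketched as a small Python WSGI app. This is not the .htaccess rule described above, just an illustration of the technique. The names `ROBOTS_BY_HOST`, `robots_for_host`, and `app` are invented here, and the hostname is the example one from this message. Note that `Disallow: /` asks well-behaved crawlers not to fetch the site; it is a request, not an enforcement mechanism.

```python
# Per-host robots.txt bodies. Hosts not listed get the permissive default.
ROBOTS_BY_HOST = {
    "v6.bobdole.com": "User-agent: *\nDisallow: /\n",  # keep crawlers off the test name
}
DEFAULT_ROBOTS = "User-agent: *\nDisallow:\n"  # empty Disallow permits everything


def robots_for_host(host_header):
    """Pick the robots.txt body for a Host header value such as 'name:port'."""
    # Strip an optional port and normalize case. IPv6 literals like "[::1]:80"
    # are not handled specially and simply fall through to the default.
    host = host_header.partition(":")[0].strip().lower()
    return ROBOTS_BY_HOST.get(host, DEFAULT_ROBOTS)


def app(environ, start_response):
    """Minimal WSGI app that serves only /robots.txt."""
    if environ.get("PATH_INFO") != "/robots.txt":
        start_response("404 Not Found", [("Content-Type", "text/plain")])
        return [b"not found\n"]
    body = robots_for_host(environ.get("HTTP_HOST", "")).encode("ascii")
    start_response(
        "200 OK",
        [("Content-Type", "text/plain"), ("Content-Length", str(len(body)))],
    )
    return [body]


if __name__ == "__main__":
    from wsgiref.simple_server import make_server

    # Port 8000 is an arbitrary choice for local testing.
    make_server("", 8000, app).serve_forever()
```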
You could go direct to the v6 addy, but using your hosts file for a dns
record isn't going to work for the remote route servers I suggest testing
from. Using a temp AAAA doesn't hurt, or lose you anything, and is
technically a more accurate test; ultimately I leave it to your discretion.
He was worried about the latency of tunnels creating penalties for SEO
purposes, but, otherwise, yes, that works too.
Since he stated a desire to avoid tunnels as an initial area of concern,
I went with his original statement.
Owen
Well, hard to tunnel to a loopback address, but, using a better example:
www IN A 192.0.2.50
IN AAAA 2001:db8::2:50
Would not preclude a tunnel at all. The issue is that he seemed concerned
with additional latency from a tunnel resulting in SEO penalties, so, I suggested
native as a resolution to that concern.
Owen
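To check from a client whether a name like the `www` example above actually resolves over both families, a minimal Python sketch follows. The helper names `address_records` and `is_dual_stacked` are invented for illustration. It uses the system resolver, so it also picks up hosts-file entries and is only an approximation of what is published as A and AAAA records.

```python
import socket


def address_records(hostname):
    """Return the IPv4 ("A") and IPv6 ("AAAA") addresses the system resolver gives."""
    found = {}
    for label, family in (("A", socket.AF_INET), ("AAAA", socket.AF_INET6)):
        try:
            infos = socket.getaddrinfo(hostname, None, family, socket.SOCK_STREAM)
        except socket.gaierror:
            infos = []  # no address of this family for the name
        # Each entry's sockaddr starts with the textual address.
        found[label] = sorted({info[4][0] for info in infos})
    return found


def is_dual_stacked(hostname):
    """True when the name resolves to at least one IPv4 and one IPv6 address."""
    records = address_records(hostname)
    return bool(records["A"]) and bool(records["AAAA"])


if __name__ == "__main__":
    # www.he.net is the dual-stacked site mentioned earlier in this thread.
    print(address_records("www.he.net"))
```

Whether the IPv6 address is reached natively or through a tunnel makes no difference to this check; it only looks at what the name resolves to.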
In a message written on Mon, Mar 28, 2011 at 03:18:30PM -0700, Wil Schultz wrote:
I'm attempting to find out information on the SEO implications of testing ipv6 out.
I don't run a web site where SEO is a top priority, so I don't track
such things.
Quite simply, who's crawling on IPv6? That is, will any of the
search engines even notice?
The only crawling I have seen over IPv6 has come from Google - but I have only seen that on IPv6-only sites, not dual-stack sites:
2001:4860:4801:1302:0:6006:1300:b075 - - [28/Mar/2011:21:54:12 -0400] "GET /p/OWJjZD HTTP/1.1" 200 3790 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
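If you want to look for the same thing in your own logs, here is a minimal Python sketch that picks out requests that arrived from an IPv6 source address and carry a crawler marker in the User-Agent. It assumes the combined log format shown in the line above. The names `LOG_RE` and `ipv6_crawler_hits` are invented here, and since User-Agent strings can be forged, treat matches as hints rather than proof.

```python
import ipaddress
import re

# Combined log format: addr ident user [time] "request" status bytes "referer" "agent"
LOG_RE = re.compile(
    r'^(?P<addr>\S+) \S+ \S+ \[(?P<ts>[^\]]+)\] "(?P<req>[^"]*)" '
    r'(?P<status>\d{3}) \S+ "(?P<ref>[^"]*)" "(?P<ua>[^"]*)"'
)


def ipv6_crawler_hits(lines, ua_marker="Googlebot"):
    """Return (address, request) pairs for IPv6 clients whose User-Agent contains ua_marker."""
    hits = []
    for line in lines:
        match = LOG_RE.match(line)
        if not match:
            continue  # not in the expected log format
        try:
            addr = ipaddress.ip_address(match.group("addr"))
        except ValueError:
            continue  # first field was a hostname or garbage, not an address
        if addr.version == 6 and ua_marker in match.group("ua"):
            hits.append((match.group("addr"), match.group("req")))
    return hits


if __name__ == "__main__":
    import sys

    # Usage: python this_script.py < access.log
    for address, request in ipv6_crawler_hits(sys.stdin):
        print(address, request)
```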
The real name for SEO is Search-Engine manipulation. And the moment you
indicate "typical SEO standards", the search engine developers have likely
already become aware of the existence of the problem/tactic and fiddled
with knobs plenty of times since then....
Sometimes search engines penalize what they see to be duplicate content in
the indexes. Spammers sometimes try to include the same content in many
domains or steal content from other sites to enhance page rank. Big search
engines offer some method of canonicalization or selection of a preferred
domain through sitemaps. Use the tools provided by your search engine to
tell them ipv6.domain.com is just domain.com.
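One such canonicalization tool is the rel="canonical" link element, which the major search engines document as a hint about the preferred URL. That specific mechanism is my example, not something prescribed above. A minimal Python sketch of generating the tag for pages served under the test name follows. The function name `canonical_link` and the default host `www.domain.com` are placeholders.

```python
from html import escape
from urllib.parse import urlsplit, urlunsplit


def canonical_link(url, canonical_host="www.domain.com"):
    """Build a <link rel="canonical"> tag pointing at the same page on the preferred host."""
    parts = urlsplit(url)
    # Keep scheme, path, and query; swap the host and drop any fragment.
    canonical = urlunsplit((parts.scheme, canonical_host, parts.path, parts.query, ""))
    return '<link rel="canonical" href="%s">' % escape(canonical, quote=True)


if __name__ == "__main__":
    print(canonical_link("http://ipv6.domain.com/page?id=1"))
```

The tag would go in the head of each page served from ipv6.domain.com. Search engines treat it as a hint, so it does not guarantee how duplicates are handled.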
If IPv4 and IPv6 are combined in one index, there is a risk that the IPv4
pages could get penalized and only the IPv6 pages show at the top (or
vice-versa). You could use robots.txt to block access to one of the sites
for just the robots that penalize, or use rel=nofollow, if that is even
necessary. I for one am completely unconvinced that major search engines
are currently penalizing in this scenario solely because a site was
duplicated to an "ipv6" subdomain.
Keep in mind there is a search engine using this practice for their own
domain. Who knows... in the future they may be penalizing sites that _don't_
have an IPv6 subdomain or v6 dual-stacking (assuming they are not penalizing
that / rewarding IPv6 connected sites already). In this case attempting to
put old SEO tactics first may hurt visitor experience more than help.
ipv6.domain.com available over IPv6 and domain.com available over IPv4 are
not really "different" domains; I expect search engines may keep IPv4 and
IPv6 indexes separate. At least for a time... since there are IPv4-only
nodes who would not be able to access IPv6 hyperlinks in a search results
page.
There has been a discussion of this in v6ops, around
http://tools.ietf.org/html/draft-ietf-v6ops-v6-aaaa-whitelisting-implications
"IPv6 AAAA DNS Whitelisting Implications", Jason Livingood, 22-Feb-11
and
http://tools.ietf.org/html/draft-ietf-v6ops-happy-eyeballs
"Happy Eyeballs: Trending Towards Success with Dual-Stack Hosts", Dan
Wing, Andrew Yourtchenko, 14-Mar-11
In that context, you might review http://www.ietf.org/proceedings/80/slides/v6ops-12.pdf
Where you find a name ipv6.example.com, such as ipv6.google.com and www.v6.facebook.com, it is generally a place where the service is testing the IPv6 configuration prior to listing both the A and the AAAA record under the same name. The up side of giving them the same name is that the same content is viewable using IPv4 and IPv6; being IP-agnostic is a good thing. Unfortunately, at least right now, there is a side-effect. The side-effect is that a temporary network problem (routing loop etc) on one technology can be fixed by using the other, and the browsers don't necessarily fall back as one would wish. This works negatively against IPv6 deployment and customer satisfaction; it is not unusual for tech support people to respond to such questions with "turn off IPv6 and you won't have that problem".
Hence, content providers often separate the names to ensure that people only get the IPv6 experience if they expect it. And Google among others whitelists people for IPv6 DNS service based on their measurements of the client's path to google - if a bad experience is likely, they try to prevent it by not offering IPv6 names.
In general, I don't see a lot of difference between A and AAAA accesses, but I have had glitches when there was a network glitch. On one occasion, there was an IPv6 routing loop en route to www.ietf.org, but not one on the IPv4 path. The net result was a huge delay - it took nearly two minutes to download a page. The amusing part of that was that the same routing loop got in the way of reporting the issue to HE; I wound up sending an email rather than filing a case. Once it was fixed, matters returned to normal.
Do we know which spiders run on IPv6?
After all an IPv6 only site may not be indexed at all...
Twitter said:
http://twitter.com/#!/look4ipv6/status/24639157611528193
At least you would have a better page-rank in www.example.com than in ipv6.example.com with the same content.
.as
I haven't published any hostnames, so I don't have the entire picture... with that being said, here's what I have seen. The only big search engine I've seen at this point is Google. Here are the user agents I've seen to date:
Googlebot-Image/1.0
Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)
DoCoMo/2.0 N905i(c100;TB;W24H16) (compatible; Googlebot-Mobile/2.1; +http://www.google.com/bot.html)
SAMSUNG-SGH-E250/1.0 Profile/MIDP-2.0 Configuration/CLDC-1.1 UP.Browser/6.2.3.3.c.1.101 (GUI) MMP/2.0 (compatible; Googlebot-Mobile/2.1; +http://www.google.com/bot.html)