Extra latency at ATT exchange for UVerse

Srikanth_Sundaresan1 · November 11, 2010, 8:39pm

Can anyone explain why ATT's UVerse adds significant delay to packets
compared to their ADSL service?

For example, pinging 8.8.8.8 from an ADSL gateway shows a latency of
~10ms. From a UVerse gateway, it's about 40ms. Of the extra 30ms,
about 10ms can be explained by the fact that the UVerse last hop is
interleaved. ADSL seems to have Fastpath enabled more often than not
(at least in my city).

The extra 20ms is more interesting. By pinging each hop obtained by
tracerouting to 8.8.8.8, the extra latency seems to be added on the
exchange between ATT and Google. It's not just for 8.8.8.8. The same
holds for other hosts too. ATT seems to add 20ms when it hands off a
(UVerse) packet at an exchange.

Thanks,
Srikanth

Richard_A_Steenbegen · November 11, 2010, 9:19pm

First off, this thread is useless without actual traceroutes.

Whenever you see the latency change significantly at the boundary between
networks, the two most obvious things to look for are congestion, and an
asymmetric reverse path.

Congestion is usually pretty easy to spot, if you're seeing it with high
latency you'll usually find that latency to be pretty jittery (as tcp
windows probe for more capacity, then back off), and you'll see the
associated packet loss starting at the link in question.

Asymmetric reverse paths are responsible for a lot of other issues too.
Traceroute measures the round-trip latency but only shows you the path
in a single direction, leaving the entire return trip completely
invisible. There is no guarantee that the packet will come back to you
the same way that you sent it, so what you may be seeing is the traffic
returning via a different exit between networks. The best way to
troubleshoot something like this is to get a copy of a traceroute in the
opposite direction. For more information, see:

http://www.nanog.org/meetings/nanog47/presentations/Sunday/RAS_Traceroute_N47_Sun.pdf

One other thing to keep in mind is that a company like Google may be
more interested in keeping their servers located somewhere with ample
(and cheap) space and power, than they are with ensuring close proximity
to an Internet interconnection point. For example, Google is well known
for building a datacenter in The Dalles, Oregon, which is a significant
distance away from ANY network interconnection. From Chicago, directly
connected to Google, 8.8.8.8 is actually located an RTT of 12ms away:

1 core1-2-2-0.ord.net.google.com (206.223.119.21) 1.509 ms 1.769 ms 1.409 ms
2 72.14.236.176 (72.14.236.176) 1.677 ms 1.579 ms 1.878 ms
3 72.14.232.141 (72.14.232.141) 12.555 ms
    209.85.241.22 (209.85.241.22) 12.150 ms 12.013 ms
4 209.85.241.37 (209.85.241.37) 11.974 ms
    209.85.241.35 (209.85.241.35) 12.591 ms
    209.85.241.37 (209.85.241.37) 12.125 ms
5 209.85.240.49 (209.85.240.49) 12.944 ms
    72.14.239.189 (72.14.239.189) 21.509 ms
    209.85.240.45 (209.85.240.45) 25.000 ms
6 google-public-dns-a.google.com (8.8.8.8) 12.890 ms 12.487 ms 12.770 ms

This would put the fiber distance at around 500+ miles, i.e. this
datacenter could actually be in Kansas City, MO for all you know. Without
the original traceroute to verify your assumptions about where the
interconnection point between networks is, it's entirely possible that
you could be seeing something like this too.
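
The distance estimate above can be sanity-checked with a little arithmetic. The following is a minimal sketch, assuming light travels through fiber at the vacuum speed divided by a refractive index of about 1.468 (an assumed typical value, not something stated in this thread), and ignoring router processing, queuing, and indirect fiber routes. It therefore gives only an upper bound on the one-way path length; the function name is made up for illustration.

```python
# Rough upper bound on one-way fiber distance implied by a round-trip time.
C_KM_PER_S = 299_792.458   # speed of light in vacuum
FIBER_INDEX = 1.468        # ASSUMPTION: typical refractive index of fiber
KM_PER_MILE = 1.609344


def max_fiber_miles(rtt_ms: float) -> float:
    """Upper bound on one-way fiber path length (miles) for an RTT in ms."""
    one_way_s = (rtt_ms / 1000.0) / 2.0          # half the round trip
    km = one_way_s * C_KM_PER_S / FIBER_INDEX    # distance light covers in fiber
    return km / KM_PER_MILE


if __name__ == "__main__":
    # 12 ms is the RTT to 8.8.8.8 in the Chicago traceroute above.
    print(f"{max_fiber_miles(12.0):.0f} miles at most")
```

For a 12ms RTT this bound comes out at roughly 760 miles. Real paths lose time to equipment and rarely follow a straight line, so a figure of "500+ miles" is consistent with it.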

Dan_White · November 11, 2010, 10:06pm

You've probably been moved to a new DSLAM, using different DSL (VDSL)
technology, which will probably have some effect on latency.

Do a Google search for 'DSL Interleaved' for some discussion of the topic.
We always do interleaved for video customers for improved reliability.

Srikanth_Sundaresan1 · November 11, 2010, 10:11pm

Here are the traceroutes (without the first 3 hops)

From ADSL:

traceroute to 8.8.8.8 (8.8.8.8), 30 hops max, 40 byte packets

4 12.81.16.32 30.196 ms 32.292 ms 35.161 ms
5 12.81.16.25 37.774 ms 40.627 ms 44.209 ms
6 74.175.192.78 48.008 ms 50.841 ms 53.946 ms
7 12.122.140.186 59.278 ms 61.510 ms 61.824 ms
8 12.123.22.129 61.111 ms 59.803 ms 59.382 ms
9 12.88.97.6 116.059 ms 115.757 ms 116.331 ms
10 72.14.233.54 59.856 ms 60.354 ms 61.088 ms
11 72.14.232.213 61.312 ms 78.592 ms 209.85.254.243 60.396 ms
12 209.85.253.137 105.800 ms 100.558 ms 209.85.253.141 96.095 ms
13 8.8.8.8 96.571 ms 98.721 ms 98.514 ms

From UVerse:

4 76.201.204.10 24.020 ms 24.321 ms 24.250 ms
5 76.201.208.22 25.754 ms 25.701 ms 25.633 ms
6 76.201.208.8 25.558 ms 25.230 ms *
7 70.159.177.248 24.910 ms 22.452 ms 23.436 ms
8 12.81.16.2 24.478 ms 24.420 ms 24.514 ms
9 12.81.16.21 128.798 ms 127.685 ms 126.821 ms
10 74.175.192.90 22.999 ms 21.932 ms 23.057 ms
11 12.122.140.186 24.397 ms 12.122.141.186 24.647 ms 24.594 ms
12 12.123.22.5 32.763 ms 12.123.22.129 22.016 ms 12.123.22.5 26.850 ms
13 * * *
14 72.14.233.54 40.287 ms 72.14.233.56 40.716 ms 40.660 ms
15 209.85.254.241 41.964 ms 41.909 ms 41.842 ms
16 209.85.253.137 51.698 ms 209.85.253.133 44.534 ms 209.85.253.145 39.621 ms
17 8.8.8.8 41.278 ms 42.124 ms 42.718 ms

Both the homes are in the same city. The entry point to Google is the
same: 72.14.233.54 (from whois).

From ADSL, latency to that google router is about 10ms:

rtt min/avg/max/mdev = 9.461/13.137/59.856/7.841 ms

From UVerse, it's about 40ms:

rtt min/avg/max/mdev = 38.923/44.503/70.535/7.162 ms

There isn't enough jitter to justify this difference. And it's not
just to Google. I tested to another server (where ATT hands off to
Qwest), and it's the same. It can't be congestion/location, because if
it were, the ADSL gateway should see it too. Reverse path effects,
perhaps.

- Srikanth
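
The congestion test discussed in this thread (jittery latency together with packet loss) can be written down as a quick check. This is only a sketch: the function names and both thresholds are hypothetical choices for illustration, and "jitter" is taken here to mean the population standard deviation of the reply times.

```python
import math
from typing import Optional, Sequence


def probe_stats(rtts_ms: Sequence[Optional[float]]) -> dict:
    """Summarize a set of probes; None marks a probe that got no reply."""
    if not rtts_ms:
        raise ValueError("need at least one probe")
    replies = [r for r in rtts_ms if r is not None]
    loss_pct = 100.0 * (len(rtts_ms) - len(replies)) / len(rtts_ms)
    if not replies:
        return {"loss_pct": loss_pct, "avg_ms": None, "jitter_ms": None}
    avg = sum(replies) / len(replies)
    # Population standard deviation of the RTTs, used here as "jitter".
    jitter = math.sqrt(sum((r - avg) ** 2 for r in replies) / len(replies))
    return {"loss_pct": loss_pct, "avg_ms": avg, "jitter_ms": jitter}


def looks_congested(stats: dict,
                    max_jitter_ms: float = 5.0,   # HYPOTHETICAL threshold
                    max_loss_pct: float = 1.0) -> bool:  # HYPOTHETICAL threshold
    """Flag a hop only when jitter AND loss are both above the thresholds."""
    if stats["jitter_ms"] is None:
        return False  # no replies at all: could just be ICMP filtering
    return (stats["jitter_ms"] > max_jitter_ms
            and stats["loss_pct"] > max_loss_pct)


if __name__ == "__main__":
    # The three final-hop samples from the UVerse traceroute above.
    stats = probe_stats([41.278, 42.124, 42.718])
    print(stats, looks_congested(stats))
```

Three samples are far too few to conclude anything; a longer ping run against the same hop would be the sensible input.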

Richard_A_Steenbegen · November 12, 2010, 1:56am

> Here are the traceroutes (without the first 3 hops)

(Note: NANOG is not really the right place to troubleshoot everyone's
home connectivity, I'm mostly just posting this as an educational
example of how to do inter-network troubleshooting... though in
retrospect this may not be the world's best example :P).

> From ADSL:
> traceroute to 8.8.8.8 (8.8.8.8), 30 hops max, 40 byte packets
>
> 4 12.81.16.32 30.196 ms 32.292 ms 35.161 ms
> 5 12.81.16.25 37.774 ms 40.627 ms 44.209 ms
> 6 74.175.192.78 48.008 ms 50.841 ms 53.946 ms
> 7 12.122.140.186 59.278 ms 61.510 ms 61.824 ms
> 8 12.123.22.129 61.111 ms 59.803 ms 59.382 ms
> 9 12.88.97.6 116.059 ms 115.757 ms 116.331 ms
> 10 72.14.233.54 59.856 ms 60.354 ms 61.088 ms
> 11 72.14.232.213 61.312 ms 78.592 ms 209.85.254.243 60.396 ms
> 12 209.85.253.137 105.800 ms 100.558 ms 209.85.253.141 96.095 ms
> 13 8.8.8.8 96.571 ms 98.721 ms 98.514 ms
>
> From UVerse:
>
> 4 76.201.204.10 24.020 ms 24.321 ms 24.250 ms
> 5 76.201.208.22 25.754 ms 25.701 ms 25.633 ms
> 6 76.201.208.8 25.558 ms 25.230 ms *
> 7 70.159.177.248 24.910 ms 22.452 ms 23.436 ms
> 8 12.81.16.2 24.478 ms 24.420 ms 24.514 ms
> 9 12.81.16.21 128.798 ms 127.685 ms 126.821 ms
> 10 74.175.192.90 22.999 ms 21.932 ms 23.057 ms
> 11 12.122.140.186 24.397 ms 12.122.141.186 24.647 ms 24.594 ms
> 12 12.123.22.5 32.763 ms 12.123.22.129 22.016 ms 12.123.22.5 26.850 ms
> 13 * * *
> 14 72.14.233.54 40.287 ms 72.14.233.56 40.716 ms 40.660 ms
> 15 209.85.254.241 41.964 ms 41.909 ms 41.842 ms
> 16 209.85.253.137 51.698 ms 209.85.253.133 44.534 ms 209.85.253.145 39.621 ms
> 17 8.8.8.8 41.278 ms 42.124 ms 42.718 ms
>
> Both the homes are in the same city. The entry point to Google is the
> same: 72.14.233.54 (from whois).

Actually the entry point to Google is probably the hop before that,
12.88.97.6. In all likelihood this is the /30 between the two networks,
where .5 is the AT&T side and .6 is the Google side. The IP space of the
demarc point belongs to AT&T of course, but this is what you'd expect in
a provider->customer relationship. In an ordinary network you would
be able to confirm this with DNS and/or some traceroutes to the routers,
but both AT&T and Google have intentionally obfuscated the hell out of
their networks from the outside world (no dns, blocking traceroutes
directly to router IPs, etc), so that won't help you much. There is also
no Google looking glass (at least that I can find), nor do they support
record-route, so you're probably SOL on the reverse path too.
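
To make the /30 reasoning concrete, here is a small sketch using Python's standard ipaddress module. The /30 prefix length is the guess made above ("in all likelihood"), not something confirmed by either network, and the helper name is made up for illustration.

```python
import ipaddress
from typing import List


def p2p_hosts(addr: str, prefix: int = 30) -> List[str]:
    """Return the usable host addresses of the subnet containing addr.

    ASSUMPTION: the link is numbered out of a /30 unless told otherwise.
    """
    iface = ipaddress.ip_interface(f"{addr}/{prefix}")
    return [str(h) for h in iface.network.hosts()]


if __name__ == "__main__":
    # 12.88.97.6 is hop 9 of the ADSL traceroute quoted above.
    print(ipaddress.ip_interface("12.88.97.6/30").network)
    print(p2p_hosts("12.88.97.6"))
```

Under that assumption the subnet is 12.88.97.4/30 and its only two usable addresses are 12.88.97.5 and 12.88.97.6, which is where the ".5 and .6" pairing comes from.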

> From ADSL, latency to that google router is about 10ms:
> rtt min/avg/max/mdev = 9.461/13.137/59.856/7.841 ms
>
> From UVerse, it's about 40ms.
> rtt min/avg/max/mdev = 38.923/44.503/70.535/7.162 ms
>
> There isn't enough jitter to justify this difference. And it's not
> just to Google. I tested to another server (where ATT hands off to
> Qwest), and it's the same. It can't be congestion/location, because if
> it were, the ADSL gateway should see it too. Reverse path effects,
> perhaps.

Well we can start by eliminating the possibility that the 8.8.8.8 node
you're hitting is a significant distance away once you hit Google's
network. What little bit of DNS AT&T does have working shows that this
is coming out of Atlanta, which could also be confirmed with a few
traceroutes from route-server.ip.att.net. From there, it's trivial to
find a network with a looking glass and direct Google connectivity in
Atlanta, and match up the exact same path:

2 72.14.233.54 (72.14.233.54) 0.944 ms 0.902 ms
    72.14.233.56 (72.14.233.56) 0.720 ms
3 209.85.254.241 (209.85.254.241) 1.005 ms
    209.85.254.243 (209.85.254.243) 16.214 ms
    72.14.232.215 (72.14.232.215) 1.264 ms
4 209.85.253.141 (209.85.253.141) 1.797 ms
    209.85.253.133 (209.85.253.133) 1.937 ms
    209.85.253.137 (209.85.253.137) 1.408 ms
5 google-public-dns-a.google.com (8.8.8.8) 1.413 ms 1.539 ms 1.481 ms

Honestly I've got to question the measurement that you're taking above,
since in your first (DSL) traceroute it looks like you're actually
seeing higher latency than you are on the second (Uverse) path. Without
being able to actually repeat the traceroute multiple times and verify
that the reading was accurate it's obviously hard to say for certain,
but your numbers look VERY consistent, showing a clear progression with
very little jitter from ~30ms at the first visible hop, to ~60ms at the
Google handoff. If there were really a measurement artifact, you would
expect at least a healthy percentage of those numbers to be
significantly different.

As for the ~17ms jump between Uverse and Google in the second
traceroute, I can't tell for certain without full IPs, but my gut says
that the reverse path might be going back via Ashburn once it hits the
Google side. Remember AT&T is actually composed of classic AT&T,
SBC/AS7132, and Bellsouth/AS6389, each with their own unique routing
policies. The latency jump would be a near perfect fit for there still
being some direct AS7132 peering sessions up, but only in Ashburn and
not Atlanta.
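
The ~17ms figure can be reproduced from the Uverse traceroute quoted above. The sketch below is one rough way to do it, not a standard tool: it takes the lowest RTT at each hop, then replaces it with the lowest value seen at that hop or any later one, so that a router which is merely slow to answer ICMP (hop 9 here) does not register as a jump. The function names are made up for illustration.

```python
import re
from typing import List, Tuple

# Uverse traceroute from this thread (hop 16's wrapped line rejoined).
UVERSE_TRACE = """\
4 76.201.204.10 24.020 ms 24.321 ms 24.250 ms
5 76.201.208.22 25.754 ms 25.701 ms 25.633 ms
6 76.201.208.8 25.558 ms 25.230 ms *
7 70.159.177.248 24.910 ms 22.452 ms 23.436 ms
8 12.81.16.2 24.478 ms 24.420 ms 24.514 ms
9 12.81.16.21 128.798 ms 127.685 ms 126.821 ms
10 74.175.192.90 22.999 ms 21.932 ms 23.057 ms
11 12.122.140.186 24.397 ms 12.122.141.186 24.647 ms 24.594 ms
12 12.123.22.5 32.763 ms 12.123.22.129 22.016 ms 12.123.22.5 26.850 ms
13 * * *
14 72.14.233.54 40.287 ms 72.14.233.56 40.716 ms 40.660 ms
15 209.85.254.241 41.964 ms 41.909 ms 41.842 ms
16 209.85.253.137 51.698 ms 209.85.253.133 44.534 ms 209.85.253.145 39.621 ms
17 8.8.8.8 41.278 ms 42.124 ms 42.718 ms
"""

RTT_RE = re.compile(r"(\d+\.\d+) ms")


def min_rtt_per_hop(trace: str) -> List[Tuple[int, float]]:
    """Return (hop number, lowest RTT) for every hop that answered."""
    hops = []
    for line in trace.splitlines():
        parts = line.split()
        rtts = [float(m) for m in RTT_RE.findall(line)]
        if parts and parts[0].isdigit() and rtts:
            hops.append((int(parts[0]), min(rtts)))
    return hops


def largest_persistent_jump(hops: List[Tuple[int, float]]) -> Tuple[int, int, float]:
    """Find the biggest RTT increase that later hops do not undo."""
    floors = []
    best = float("inf")
    for hop, rtt in reversed(hops):      # walk back from the destination
        best = min(best, rtt)            # lowest RTT at this hop or later
        floors.append((hop, best))
    floors.reverse()
    jumps = [(floors[i][0], floors[i + 1][0], floors[i + 1][1] - floors[i][1])
             for i in range(len(floors) - 1)]
    return max(jumps, key=lambda j: j[2])


if __name__ == "__main__":
    frm, to, delta = largest_persistent_jump(min_rtt_per_hop(UVERSE_TRACE))
    print(f"largest persistent jump: hop {frm} -> hop {to}, {delta:.1f} ms")
```

On this data the largest persistent increase falls between hop 12 and hop 14, across the silent hop 13, which matches the handoff discussed above.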

If nothing else, this illustrates one key point of troubleshooting with
traceroute. The actual output of the traceroute is often worthless
without knowing the source and destination IPs that were being tested,
so *ALWAYS* provide those along with your traceroutes if you want to
ever have any hope of having your problem solved.

Paul_WALL · November 12, 2010, 2:04am

The U-Verse infrastructure is a bit of a mess when you get closer to
the end subscriber. There will be a few more L3 hops as your packets
egress the metro area towards what was the legacy BellSouth IP network
(BRIB).

The first few hops will be the U-Verse "LIO" (Intermediate Office),
which serves as your first layer 3 hop. After that, you'll end up in the
U-Verse VHO (Video Hub Office), which is where all the IPTV gear and
U-Verse IP aggregation occurs. You'll hop through a few more devices
within the VHO until you end up on a legacy BellSouth IP backbone
device (AS6389). From there you'll then route to the AT&T CBB (AS7018)
and on to an AT&T MIS (IP transit) router where Google is a customer.

The legacy BellSouth ADSL product doesn't have to go through as many
hoops to reach an actual IP network. One thing to keep in mind is that
the BellSouth U-Verse customers are numbered out of classic SBC
(AS7132) IP address space, which is advertised to the Internet
originating from AS7132. I wonder if some of that return traffic is
routing into AS7132 or AS7018 at a sub-optimal location rather than
directly back to that MIS connection in Atlanta.

Another note regarding the latency: you can probably attribute some of
that to the Alcatel DSLAM you terminate on. They're known for setting
a static interleaving value on all customers, regardless of line
conditions. Customers should really reach out and ask for this to be a
configurable option, just like AT&T offered it for its legacy ADSL
broadband subscribers.

Drive Slow, but not due to Alcatel interleaving
Paul Wall

William_Pitcock · November 12, 2010, 2:22am

U-Verse is actually the name of two entirely different services - VDSL
and FTTP. This is a typical symptom of stupidity on behalf of marketing
people.

The VDSL service uses interleaving, but since they use actual fibre in
my neighbourhood (I have an ONT on the side of my house and everything)
I can't really tell you what impact the interleaving has.

Friends of mine on VDSL say it's about an additional 20ms penalty or so.
Perhaps it's the interleaving?

If you log into your RG, it will tell you if you are on VDSL or are
connected to an ONT. I think your case is that you are on VDSL
and very close to an IX as far as AT&T's network is concerned.

William