Slashdot: Providers Ignoring DNS TTL?

Oh well, I tried to stay quiet :-) The PPLB problem probably isn't quite as
simple as "if you have PPLB you can't do anycast." I'd imagine that you have
to have some substantial difference in the paths that the PPLB follows,
yes? Like links to differing ISPs, or perhaps extremely diverse links
inside the same ISP. Correct?

>> Or don't. No one here cares if you do. Reality trumps lab tests.
>
> "Reality" for the last ten years has been that no one did either PPLB
> or TCP DNS. That reality is changing. It'll probably start to change
> faster, sooner. Then, users will start to notice the problems.

People have been using TCP applications on anycast for at least a
decade, as I mentioned before. Since DNS responses tend to be very
short-lived TCP sessions, it seems to me that if it works for other
applications (e.g. HTTP), it should work for DNS.

I don't know of any HTTP servers that do anycast. But their failure to
take account of PPLB doesn't change anything. IF they are anycasting under
false assumptions, they'll have problems, too.

Perhaps you should read RFC 1546, which prescribes how to do TCP anycast.
Then note that TCP anycast requires OS support which is not implemented in
any Unix-like system (or any system) that I know of. So, instead of
following this prescription, the (DNS) anycast promoters have relied on an
assumption of unique and slowly changing paths to eliminate the
possibility that "two successive TCP segments sent to the anycast peer
might be delivered to completely different hosts." But PPLB makes that
"paths change slowly" assumption false, because it can use different paths
on sequential packets.

From RFC 1546 (page 5)

How UDP and TCP Use Anycasting

   It is important to remember that anycasting is a stateless service.
   An internetwork has no obligation to deliver two successive packets
   sent to the same anycast address to the same host.

   Because UDP is stateless and anycasting is a stateless service, UDP
   can treat anycast addresses like regular IP addresses. A UDP
   datagram sent to an anycast address is just like a unicast UDP
   datagram from the perspective of UDP and its application. A UDP
   datagram from an anycast address is like a datagram from a unicast
   address. Furthermore, a datagram from an anycast address to an
   anycast address can be treated by UDP as just like a unicast datagram
   (although the application semantics of such a datagram are a bit
   unclear).

   TCP's use of anycasting is less straightforward because TCP is
   stateful. It is hard to envision how one would maintain TCP state
   with an anycast peer when two successive TCP segments sent to the
   anycast peer might be delivered to completely different hosts.

   The solution to this problem is to only permit anycast addresses as
   the remote address of a TCP SYN segment (without the ACK bit set). A
   TCP can then initiate a connection to an anycast address. When the
   SYN-ACK is sent back by the host that received the anycast segment,
   the initiating TCP should replace the anycast address of its peer,
   with the address of the host returning the SYN-ACK. (The initiating
   TCP can recognize the connection for which the SYN-ACK is destined by
   treating the anycast address as a wildcard address, which matches any
   incoming SYN-ACK segment with the correct destination port and
   address and source port, provided the SYN-ACK's full address,
   including source address, does not match another connection and the
   sequence numbers in the SYN-ACK are correct.) This approach ensures
   that a TCP, after receiving the SYN-ACK is always communicating with
   only one host.
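The RFC's prescription can be sketched in a few lines. This is a toy illustration (not a real TCP stack, and all addresses and ports are made up) of the rule that a connection initiated to an anycast address locks onto whichever unicast host answers the SYN-ACK:

```python
# Toy model of RFC 1546's rule: a connection opened to an anycast address
# is re-pointed at whichever host answers the SYN-ACK, so every later
# segment goes to exactly one host. Addresses and ports are hypothetical.

ANYCAST = "192.0.2.1"  # hypothetical anycast service address

class Connection:
    def __init__(self, local_port, remote_port):
        self.local_port = local_port
        self.remote_port = remote_port
        self.peer = ANYCAST  # acts as a wildcard until the SYN-ACK arrives

    def on_syn_ack(self, src_addr, src_port):
        # RFC 1546: match any incoming SYN-ACK with the correct ports while
        # the peer is still the anycast wildcard, then lock onto that host.
        if self.peer == ANYCAST and src_port == self.remote_port:
            self.peer = src_addr  # all further segments go here only

conn = Connection(local_port=54321, remote_port=53)
conn.on_syn_ack("198.51.100.7", 53)  # unicast address of the responding host
print(conn.peer)  # the connection now has a single, unicast peer
```

Note that this rewrite has to happen inside the TCP implementation itself, which is exactly the OS support that doesn't exist.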

> People have been using TCP applications on anycast for at least a
> decade, as I mentioned before. Since DNS responses tend to be very
> short-lived TCP sessions, it seems to me that if it works for other
> applications (e.g. HTTP), it should work for DNS.

It's funny how I give you TWO conditions, and you ignore one of them. I'll
try to use little tiny baby words:

Well, I can set up "conditions" where anything you try to make work does not work.

    TCP Anycast does NOT work with PPLB (Per - packet - load - balancing)
    Say it slowly several times.

How about I don't say it at all.

Here, say this several times slowly: "If you use a standard phone cable between your NIC and the wall, it won't work." So why do you keep trying not to use anycast, since I have arbitrarily decided that when you do not use anycast you must use a phone cable in your NIC?

What? You don't want to use phone cables in your NIC? Strange, I don't want to use PPLB with my anycast setup, but you seem to think that is a condition of anycast. Which is about as intelligent as forcing you to use a phone cable in your NIC. (Actually, I bet many people here would think forcing you to use a phone cable in your NIC would be intelligent....)

Isn't it interesting how sane^Wexperienced engineers can figure out networking basics like not using _per_packet_ load balancing on an application which might use TCP. If you study hard, maybe someday you will be able to figure these things out too. :-)

Been happening for many years. How do you think the original Boardwatch / Keynote speed tests were gamed? If you have any real experience on the Internet, you are well acquainted with anycast web servers.

Okie, I give up. You clearly have no idea what you are actually talking about, so talk away, no one is listening. I started talking to you 'cause I was having a bad day and it's fun to feed the troll. It's not fun any more, and I'm sure others are tired of the thread.

One last comment (although I doubt you will understand): Reality trumps... well, you.

Remember that anycast configuration does not always require upper-layer
applications to specifically support an "anycast feature set." It can be
done in a setup similar to those currently used for stateless/DNS, where
it depends on how you want to route your packets to the anycast listener
address.

Just make sure the routing between the anycasting nodes and the requesting
node can actually deliver a clear picture, and it shouldn't be much of an
issue for the majority :-)

-J

For anybody who's confused by this thread, this is a quick explanation, after which I'm really hoping the thread will die:

The "PPLB" Dean mentions is "per packet load balancing" in which you have two or more circuits, and packets to the same destination alternate which circuit they go down. In every case in which I've seen this used, it's been to combine multiple circuits taking the same path between the same pair of routers, to in effect create a bigger circuit. In theory, PPLB could also be used to split traffic between circuits going to different routers, perhaps even in different places. I've never seen anybody actually use the latter setup, and it seems to be universally regarded as something that would break things. I suppose it's possible that somebody's using it somewhere, probably with "interesting" results. It's the latter, theoretically possible, setup that Dean is talking about.
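The distinction can be sketched in a few lines of Python (circuit names are made up; this is an illustration of the dispatch logic, not of any router): per-packet balancing alternates circuits for every packet of a flow, while per-flow balancing hashes the conversation so one session always uses one circuit.

```python
from itertools import cycle

# Toy dispatch logic: per-packet load balancing alternates circuits for
# every packet, while per-flow balancing hashes the flow identifiers so
# a given conversation always lands on the same circuit.

packets = [("10.0.0.1", "203.0.113.5", i) for i in range(6)]  # one flow, 6 packets
circuits = ["circuit-A", "circuit-B"]  # hypothetical parallel circuits

rr = cycle(circuits)
per_packet = [next(rr) for _ in packets]  # alternates A, B, A, B, ...

def per_flow(src, dst):
    # hash() is stable within one run, which is all a router needs
    return circuits[hash((src, dst)) % len(circuits)]

per_flow_choice = {per_flow(src, dst) for src, dst, _ in packets}
print(per_packet)       # one flow's packets split across both circuits
print(per_flow_choice)  # a single circuit: the flow stays together
```

With identical circuits between the same pair of routers, the per-packet split is mostly harmless; the trouble in what follows comes from the circuits taking different paths.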

Anycast is a technique in which two or more servers, generally in different locations, announce the same address space. Those sending traffic into a network via one POP or exchange point will have their traffic go to the server close to that entry point, while those sending traffic into a network via another POP or exchange point will have their traffic go to the server close to that point. To an outside network, it looks the same as regular peering -- you see the same route at each peering point and can hand off traffic. The only difference is that the packets may not have to travel as far once they enter the other network.

So, just as a fun theoretical exercise, let's examine what happens in the PPLB to multiple locations scenario that Dean imagines:

Let's say somebody is in the Midwest, and has T1s to Network A and Network B. And let's say that their network administrator read on the NANOG list that per packet load balancing was the trendy thing to do, so they turn on per packet load balancing between the two T1s. Now they want to send some packets to a unicast host on Network C, somewhere in California.

They start with UDP DNS queries, each consisting of a single packet. Half go via network A, which peers with Network C in California. Responses come back with a 40 ms RTT. The other half go through network B, which has its closest peering point with Network C in Virginia. The packets go to Virginia and then to California, and the replies come back 80 ms later. Everything works fine.

Then they try to set up a more persistent connection, and again half their packets are taking the 40 ms path while the others are taking the 80 ms path. Now things get interesting, because the packets are arriving out of order. Some applications may do ok with this, since they'll take the sequence numbers and reorder the packets, with some buffering and processing delay. But remember, the latency amounts here are numbers I just made up, and there's no reason why it couldn't be 40 ms vs. 1 second in some parts of the world. In either case, I suppose it's possible that you'd get an HTTP connection to sort of work, and an ssh session might just seem mildly painful. But good luck getting a VOIP call or anything of the sort to function over such a connection.
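Using the made-up 40 ms / 80 ms latencies above, a few lines of arithmetic show the reordering (a toy timeline, assuming packets sent 10 ms apart and strict per-packet alternation):

```python
# Toy timeline: packets sent 10 ms apart, alternating between a 40 ms
# path and an 80 ms path (the made-up latencies from the example above).

SEND_INTERVAL = 10   # ms between successive packets
PATH_DELAY = [40, 80]  # one-way delays, alternating per packet (PPLB)

arrivals = []
for seq in range(6):
    sent = seq * SEND_INTERVAL
    delay = PATH_DELAY[seq % 2]  # per-packet alternation between paths
    arrivals.append((sent + delay, seq))

# Sort by arrival time to see the order the receiver observes.
arrival_order = [seq for _, seq in sorted(arrivals)]
print(arrival_order)  # -> [0, 2, 4, 1, 3, 5]: persistently out of order
```

Every odd-numbered packet arrives behind two later even-numbered ones, so the receiver has to buffer and resequence constantly rather than occasionally.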

Dean is correct that this setup would fall apart even further when anycast is thrown into the mix. In the anycast example, Network A hands off the packets to Network C in California, where they get sunk into a local server. Network B hands off the packets to Network C in Virginia, where they get sunk into a local server. Each server only sees half the packets, and half the retransmits, and is probably never going to get enough of the connection to put it all back together in a way that works.
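The anycast-plus-PPLB failure mode reduces to a one-liner (a toy sketch, with hypothetical site names): with strict per-packet alternation, each anycast instance receives only every other segment of the "same" connection.

```python
# Toy sketch: per-packet alternation between networks A and B delivers
# the segments of one connection to two different anycast instances of
# the same address. Site names are hypothetical.

servers = {"california": [], "virginia": []}
for seq in range(10):
    dest = "california" if seq % 2 == 0 else "virginia"  # PPLB alternation
    servers[dest].append(seq)

print(servers["california"])  # [0, 2, 4, 6, 8] -- half a TCP stream
print(servers["virginia"])    # [1, 3, 5, 7, 9] -- the other half
```

Neither host has the state to complete the handshake or reassemble the stream, which is precisely the case RFC 1546's SYN-ACK rule was designed to prevent.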

So, there are a couple of different conclusions that could be drawn from this. The conclusion I come to is that there are enough problems doing per packet load balancing on non-identical paths that nobody would actually do it. I'm made more comfortable in this conclusion by having been through this discussion several times without finding anybody who claims to actually do that sort of per packet load balancing. I, therefore, declare the PPLB thing to be a non-issue.

It may also be valid to declare that PPLB over non-identical paths is important to allow people to use every last bit of bandwidth they're paying for, and that we shouldn't make their already painful predicament worse. But that's an argument I continue to be skeptical of.

Ok, gotcha, and your point seems valid, except AIUI the previous post was
concerning providers who are actually overriding the TTL, e.g. your zone
has a 5-minute TTL, but the provider caches it and sets the TTL to 10 days.

I think this thread forked quite early :-)

Steve

Per Packet Load Balancing is not TCP friendly. (this discussion is orthogonal to DNS)
PPLB leads to packet reordering.

Quite a few empirical and theoretical papers have been published (in peer reviewed fora and elsewhere)
that discuss the negative consequences of packet reordering. A Google search finds many references.

On the downside for those attempting to maximize use of their circuits:
packet reordering can lead to unnecessary retransmissions (squandering capacity).

On the downside for users:
packet reordering can lead to lower performance.
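One concrete mechanism behind the spurious retransmissions (a sketch of standard cumulative-ACK behavior, not of any particular implementation): a segment reordered behind three later ones generates three duplicate ACKs, which a conventional TCP sender interprets as loss and answers with an unnecessary fast retransmit.

```python
# Toy cumulative-ACK receiver: segment 1 arrives "late", behind segments
# 2, 3 and 4, producing three duplicate ACKs for sequence number 1.

received_order = [0, 2, 3, 4, 1]  # segment 1 reordered behind 2, 3, 4

expected = 0       # next in-order sequence number
buffer = set()     # out-of-order segments held by the receiver
acks = []
for seq in received_order:
    if seq == expected:
        expected += 1
        while expected in buffer:      # drain any now-contiguous segments
            buffer.discard(expected)
            expected += 1
    else:
        buffer.add(seq)                # out of order: ACK doesn't advance
    acks.append(expected)              # cumulative ACK sent for this arrival

print(acks)                 # -> [1, 1, 1, 1, 5]
dup_acks = acks.count(1) - 1
print(dup_acks >= 3)        # three dup ACKs: enough to trigger fast retransmit
```

Nothing was lost, yet the sender retransmits segment 1 and (depending on the variant) halves its congestion window, which is where the squandered capacity and lower performance come from.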

Macroscopically:
There is some movement (finally) towards providing the consumer
with higher speed access to the Internet. (e.g. FIOS 30Mbps and other FTTH and vDSL services).
Consumer adoption of such services would result in an upsurge of traffic:
the need for larger backbones, enhanced server farms, more acolytes to service all of it.
Adoption (and consequent resurgence of the Internet industry) will fail
if consumers do not actually obtain improved performance from their new higher speed connections.

PPLB only benefits those who are milking the last available profits out of a decaying industry.
It is not a forward looking approach.

> Been happening for many years. How do you think the original
> Boardwatch / Keynote speed tests were gamed? If you have any real
> experience on the Internet, you are well acquainted with anycast web
> servers.

Gaming speed tests sounds pretty rare. It doesn't appear that Akamai does
this, but maybe I'm wrong. But it would depend on having unique paths.
And it violates RFC 1546, as previously explained.

> Okie, I give up. You clearly have no idea what you are actually
> talking about, so talk away, no one is listening.

Yes, _You_ clearly aren't listening. Which is the problem.

Your canard of "it's been done (in an archaic environment)" has no bearing
on anything. The whole point is that the environment is changing, and so
hacks that used to be done, hacks that even RFC 1546 anticipated and
warned against, won't continue to work in the future.

I'm reminded of the arguments in the late 80's about threading: People
(like you) said there are no multithreading operating systems, and
multiprocessor systems existed only in labs. So designing threadsafe
libraries or writing multithreading capable languages was a total waste of
time. And they showed as evidence all the programs written from 1975 to
1985.

> One last comment (although I doubt you will understand): Reality
> trumps... well, you.

Reality trumps, alright. But you won't understand that. "Past performance
is no guarantee of future performance." Let me guess: you're one of those
people who won't be concerned about global warming until they need waders
to walk around Manhattan at high tide. Then you'll go, "Gee, where'd all
this water come from? Why didn't someone say that the ice caps were
melting?"

Well, PPLB isn't the end of the world. But PPLB is coming, and the smart
people will be prepared for it. The dumb people, well, they're dumb.
What can be expected from dumb people?

[ snip ]

> Well, PPLB isn't the end of the world. But PPLB is coming, and the smart
> people will be prepared for it. The dumb people, well, they're dumb.
> What can be expected from dumb people?

With the proliferation of high-speed circuits and the continuing trend of
lower bandwidth costs, PPLB is becoming obsolete for many of the people
who implemented it in the past. What is becoming more common is a non-PPLB
based setup, such as flow- or destination-based load balancing and so on,
to group high-capacity circuits (i.e. when gig-e's aren't enough and
10GbE is too much of a capex for an immediate upgrade).

-J

> Been happening for many years. How do you think the original
> Boardwatch / Keynote speed tests were gamed? If you have any real
> experience on the Internet, you are well acquainted with anycast web
> servers.

Let me, let me, let me! It involved, err... locating Linux boxes with IP
addresses identical to those given to Boardwatch/Keynote, in locations
very close to the probes. Finally, for an extra "speed up," it involved
modifying kernels to spew packets back as fast as possible, since the only
thing those boxes did was trick Keynote.

Alex

* haesu@towardex.com (James) [Sat 23 Apr 2005, 23:10 CEST]:

> With the proliferation of high-speed circuits and the continuing trend of
> lower bandwidth costs, PPLB is becoming obsolete for many of the people
> who implemented it in the past. What is becoming more common is a non-PPLB
> based setup, such as flow- or destination-based load balancing and so on,
> to group high-capacity circuits (i.e. when gig-e's aren't enough and
> 10GbE is too much of a capex for an immediate upgrade).

Exactly. Apparently it's little bother for router vendors to reuse the
algorithms they wrote to properly support 802.3ad (link aggregation; the
spec demands that `conversations' be kept on the same wire) for load
balancing over multiple IP paths.
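A minimal sketch of that kind of conversation hashing (hypothetical link names; real implementations compute the hash in hardware over various header fields): every packet of one session maps deterministically to one member link, while different sessions still spread across the bundle.

```python
import zlib

# Sketch of 802.3ad-style "conversation" hashing: hash the flow
# identifiers so every packet of one session uses the same member link,
# while different sessions spread across the bundle. Link names are
# hypothetical.

links = ["ge-0/0/0", "ge-0/0/1", "ge-0/0/2", "ge-0/0/3"]

def pick_link(src_ip, dst_ip, src_port, dst_port):
    key = f"{src_ip}|{dst_ip}|{src_port}|{dst_port}".encode()
    return links[zlib.crc32(key) % len(links)]  # deterministic per flow

# All packets of one session hash identically -> no reordering:
flow = ("10.1.1.1", "10.2.2.2", 43210, 80)
assert len({pick_link(*flow) for _ in range(100)}) == 1

# Different sessions still spread across the member links:
choices = {pick_link("10.1.1.1", "10.2.2.2", port, 80)
           for port in range(2000, 2100)}
print(choices)  # more than one member link in use
```

The trade-off is granularity: a single elephant flow can never exceed one member link's capacity, which is the one thing PPLB could offer.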

Regards,

  -- Niels.

> I'm reminded of the arguments in the late 80's about threading: People
> (like you) said there are no multithreading operating systems, and
> multiprocessor systems existed only in labs. So designing threadsafe
> libraries or writing multithreading capable languages was a total waste of
> time. And they showed as evidence all the programs written from 1975 to
> 1985.

Odd, seeing as IBM's OS/360 supported multithreading in the mid-60s (well, OK,
only the MVT variant did it really well - MFT had some restrictions, and PCP
was basically a program loader on steroids), as did Multics, early Unix, and
the various PDP-8/11 and DEC-10/20 operating systems - and most of them
supported multiprocessor systems before 1970.

What you're actually talking about is the "I don't have to worry about *THAT*"
syndrome that's always been the bane of program portability. Those of us who
were around at the time remember all too well "Not all the world's a VAX" when
programs that ran fine under BSD on a VAX would bomb out under SunOS 3.2 -
because the VAX allowed dereferencing a NULL pointer and SunOS didn't.

And anyhow, you're looking at it totally backwards - things like system libraries
didn't support multithreading well at first because nobody was *interested* in
doing it. The support did happen once there was an actual demand for it.
Remember that there's a *cost* to supporting multithreading - you have to drag
along all this ugly locking code and stuff like that. It's really hard to
justify putting in code that slows down the 95% of the applications that are
single-threaded for the 5% that are multi-threaded, and even harder to justify
putting the support in the library "just in case somebody wants to use it in
the future".

> Well, PPLB isn't the end of the world. But PPLB is coming, and the smart
> people will be prepared for it. The dumb people, well, they're dumb.
> What can be expected from dumb people?

What you seem to be missing is that the *really* smart people will be prepared
for it when it actually gets here - and will take advantage of its lack of
arrival in the meantime.

> Well, PPLB isn't the end of the world. But PPLB is coming, and the smart
> people will be prepared for it. The dumb people, well, they're dumb.
> What can be expected from dumb people?

There are a variety of things that don't like PPLB, notably IPsec. One
problem is that if packet lengths aren't constant, you can get out-of-order
delivery, and some protocols don't deal with that very well - nor does the
load really get as perfectly balanced as proponents like to think. (There's
also the problem that some popular small routers implement PPLB by burning
too many of the CPU cycles they don't have enough of, so you've got to
consider the tradeoff between buying more network connections and buying a
bigger router.)

The fact that a variety of things (like PMTU Discovery) don't like it when
people block all ICMP doesn't stop it from happening. Similarly for a number
of other less-than-perfect-ideas-deployed-anyhow in the past. Why should we
expect PPLB to be any different?

> Well, PPLB isn't the end of the world. But PPLB is coming, and the smart
> people will be prepared for it. The dumb people, well, they're dumb.
> What can be expected from dumb people?

> What you seem to be missing is that the *really* smart people will be
> prepared for it when it actually gets here - and will take advantage
> of its lack of arrival in the meantime.

I agree with another poster in this thread - I think we will see *less*
PPLB in the future, not more. Mostly because other, better methods of
bundling several parallel links are more easily available now than they
were a few years ago.

Example: At my previous employer we sometimes used PPLB on two or four
2 Mbps links to give the customer 4 or 8 Mbps available *for one session*.
We could have used multilink PPP - but that required more expensive routers.
These parallel links always ran from one PE router to one CPE router.

At my current employer we would either use DSL equipment (which bundles
the necessary links at a level below, and invisible to, IP), or Ethernet
over SDH (using GFP etc.). In both cases the bundling of several parallel
links is invisible to IP, and there is no issue of packet reordering.

Steinar Haug, Nethelp consulting, sthaug@nethelp.no

And, the even more important extension of your last comment: they'll be
prepared to move on to something else when it comes, and they can no
longer take advantage of its absence.

Cheers,
-- jra

Jay:

In your note below you speak of 'moving on to something else' when PPLB comes.

PPLB destabilizes TCP. It elicits erroneous retransmissions, squanders capacity and lowers performance.

You are suggesting that we replace TCP in all the computers in the world?

Bob

> In your note below you speak of 'moving on to something else' when
> PPLB comes.
>
> PPLB destabilizes TCP. It elicits erroneous retransmissions,
> squanders capacity and lowers performance.

I would actually dispute this. I agree that PPLB will *occasionally*
lead to out-of-order packets, which will lead to lower TCP performance
*when it happens*. To many customers this is acceptable as long as PPLB
gives them improved performance *most of the time*. And this is what we
saw very clearly at my previous employer - PPLB worked very well, and
gave clearly increased performance, *most of the time*.

As mentioned in another message, I don't really believe PPLB is coming.
Instead I believe PPLB is something which is probably being *less* used
now than a few years ago, since other link bundling methods are more
easily available now (than they were a few years ago) - and these link
bundling methods occur at a layer below TCP, and are invisible to TCP
(no packet reordering problems).

Steinar Haug, Nethelp consulting, sthaug@nethelp.no

Steinar:

There is a large body of work from competent and well-known researchers
that asserts the claim. I certainly lack standing to question their results.

Empirically, download speeds to the home are nearly cut in half (18 Mbps)
from sources that are subject to packet reordering along the path.

More to the point, however, I note that Jay is the author of RFC 2100.
I think he's just having a little bit of fun. My apologies for belaboring
the performance issue.

Bob