Slashdot: Providers Ignoring DNS TTL?

In your note below you speak of 'moving on to something else' when
PPLB comes.

No, I actually don't.

PPLB destabilizes TCP. It elicits erroneous retransmissions, squanders
capacity and lowers performance.

You are suggesting that we replace TCP in all the computers in the
world?

Nope. I was observing generically that if people who take advantage of
a technology window to supply a supportive technology (video capture
cards for PC's) are smart, the *really* smart people are those who are
prepared to move along to something else when the mainstream catches up
(DV/Firewire) and their product is no longer necessary.

Cheers,
-- jra

I'm trying to sort out the various claims here, since I think right now this is a case of people talking past each other, and arguing completely different points.

First of all, let's ditch the term "PPLB." The usual alternative to per packet load balancing (what's been being talked about here) is per prefix load balancing, which would also be "PPLB." The abbreviation is therefore more confusing than anything else.

Now, onto the argument that's going on here.

Dean says "per packet load balancing is coming," and then goes on to assume it's going to be used in such a way that it will cause packets to route through widely divergent paths.

Several others have responded that that would cause packet reordering and break TCP.

Robert says that even used correctly (on identical circuits between the same set of routers), per packet load balancing can cause packet reordering.

Steinar says that when used correctly, per packet load balancing causes packet reordering only rarely, and speeds things up enough when it doesn't that the slowdowns caused by occasional packet reordering may be worth putting up with.

Robert says well known researchers say that packet reordering is bad.

So, as far as I can tell, everybody except perhaps Dean agrees that:

- Used incorrectly (on divergent paths), per packet load balancing can
   cause packet reordering.

- Used correctly (on non-diverging paths), packet reordering doesn't
   happen often.

- Packet reordering is bad, and should be avoided.

I'm less clear on Dean's position, but I think it's something along the lines of:

"Per packet load balancing over divergent paths is coming, by fiat from marketing departments even if engineers don't like it, and anything that doesn't play well with it needs fixing." While Dean focuses on anycast, that would presumably extend to TCP and to anything jitter sensitive, such as streaming audio or video.

Anything that's being missed here, or does this sum it all up?

-Steve

I think that's a fair summary.

So agreeing for a second with Dean that indeed this behaviour would appear to be
prohibited or at least inconsistent with the RFCs, the fact is anycast is widely
deployed and is proven to be stable.

Perhaps a solution to this is to look at what would be the best consistent view
and to write an RFC to clarify this and obsolete the old ones that produce the
inconsistency. I'm not sure what that would look like but that would appear to
be a way to eliminate the theoretical problem.

Steve

I'm pleased that you noticed... but possibly less so if it means you
*didn't read the clarification I posted on what I actually meant*. :-}

Cheers,
-- jra

Date: Sat, 23 Apr 2005 16:13:22 -0400 (EDT)
From: Dean Anderson

And it violates RFC 1546, as previously explained.

Who cares? You've railed against SMTP+AUTH because it's not a
"standard". Why do you give a rat's rump about 1546?

Well, PPLB isn't the end of the world. But PPLB is coming, and the smart
people will be prepared for it. The dumb people, well, they're dumb.

Perhaps PPLB becomes more common. Time for SACK, lest traditional TCP
do bad things.

As for anycast, there's a fair chance people building anycast clusters
will work around PPLB. Maybe they'll build topologies to avoid
problems. Maybe they'll have behind-the-scenes unicast intelligence to
deal with TCP session transfer.

I'll leave it at that. This thread is getting old, and 1xRTT latency
makes SSH uncomfortable.

What can be expected from dumb people?

Frequent NANOG posting.

Eddy

Date: Sun, 24 Apr 2005 02:00:48 -0400
From: Valdis.Kletnieks@vt.edu

What you seem to be missing is that the *really* smart people will be prepared
for it when it actually gets here - and will take advantage of its lack of
arrival in the meantime.

Nahhhh.... the code in my lab and the work-in-progress protocol dev
printout to my right exist because I was bored and had nothing better to
do, and Minesweeper bores me. Fortunately, I have discovered posting to
NANOG as a worthy alternative.

Networking can be hard. Let's just say all problems are insurmountable
and go home. Let someone else solve the hard stuff... it's worked great
for spam.

Eddy

> Steinar:
>
> There is a large body of work from competent and well known researchers
> that assert the claim. I certainly lack standing to question their
> results.
>
> Empirically, download speeds to home are nearly cut in half (18Mbps)
> from sources that are subjected to packet reordering along the path.

I'm trying to sort out the various claims here, since I think right now
this is a case of people talking past each other, and arguing completely
different points.

First of all, let's ditch the term "PPLB." The usual alternative to per
packet load balancing (what's been being talked about here) is per prefix
load balancing, which would also be "PPLB." The abbreviation is therefore
more confusing than anything else.

Err. No, that would be worse. "Per prefix" load balancing is an artifact
of the Cisco route cache. The route engine (ie the route table) isn't
queried for every packet. Instead the route in the route cache is used.
One doesn't configure "per prefix" load balancing. One configures load
balancing, which adds multiple routes into the route table. The route
cache then causes only one of these routes to be used. On cisco, to
enable PPLB, you turn off the route cache. On Juniper, you configure it
to put multiple routes in the route table. It's actually more likely to
happen on Junipers, because unless you configure additional policies, you
get load balancing on divergent links as well as non-divergent links. On
Cisco, the route cache is controlled on a per-interface basis.

Now, onto the argument that's going on here.

Dean says "per packet load balancing is coming," and then goes on to
assume it's going to be used in such a way that it will cause packets to
route through widely divergent paths.

Almost a correct statement of my position. But if its used in any
case--anywhere--even on internal links, in any condition other than
between exactly two routers, it will cause a problem with Vixie's TCP
anycast.

We aren't seeing a lot of problems yet because few people are using PPLB,
and fewer are using TCP DNS, and so few are using Vixie's TCP anycast.

If we change any of our terms in this discussion, it should be to change our
use of "TCP anycast". The behavior described by Vixie for DNS TCP
"anycast" is unapproved** and non-standard. It does not conform to RFC
1546 TCP Anycast. RFC 1546 outlines changes to the TCP stack to support
TCP anycast. To avoid confusion, I propose we call it "vixie-cast". And
I also want to emphasize that RFC 1546 TCP anycast doesn't have any
problems with PPLB. And of course UDP anycast also has no problem. It is
only "Vixie-cast TCP" that has a problem.

** One of the criticisms of DNS Root Anycast is that Vixie recommended
deployment of anycast to root server operators without first discussing
this on the DNSOP WG. Dr. Bernstein first brought it to the WG's
attention in 2002. And one of the recent discoveries is that a good number
of DNS root server operators are using it, and had assumed that Vixie was
the IETF liaison communicating with the IETF. The IETF has never approved
of this, and in fact, Vixie's version of TCP anycast is not what is
described in RFC 1546. Vixie's version really shouldn't be called anycast,
since it creates confusion with RFC 1546 recommendations. Indeed, if RFC
1546 recommendations for TCP anycast were widely implemented in TCP
stacks, then there wouldn't be any problem. The problem is caused by
violating RFC 1546.

So, as far as I can tell, everybody except perhaps Dean agrees that:

- Used incorrectly (on divergent paths), per packet load balancing can
   cause packet reordering.

I agree that used on divergent paths, a problem occurs with Vixie-cast.
Additionally, packet reordering can happen, which can cause problems with
TCP stacks that aren't able to handle insertions. This reordering problem
is symptomatic of a badly implemented TCP stack. Other things can cause
reordering, besides PPLB.

- Used correctly (on non-diverging paths), packet reordering doesn't
   happen often.

- Packet reordering is bad, and should be avoided.

I'm less clear on Dean's position, but I think it's something along the
lines of:

"Per packet load balancing over divergent paths is coming, by fiat from
marketing departments even if engineers don't like it, and anything that
doesn't play well with it needs fixing." While Dean focuses on anycast,
that would presumably extend to TCP and to anything jitter sensitive, such
as streaming audio or video.

No, that isn't quite it. (It may indeed come by fiat of marketing, but I
don't know that and don't assert it.) I assert that only
"vixie-cast" has a problem. Streaming video and audio is thought to be
reorder sensitive, but this is the same problem as with TCP reordering,
and has a similar solution. If the jitter buffer can do inserts
efficiently there is no problem. And, of course, the out-of-order packets
must arrive before they are to be played. Some jitter buffers can do this
inserting well, others can't.
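
The jitter buffer point can be made concrete with a small sketch. This is
only an illustration under stated assumptions, not any real player's
implementation: JitterBuffer, its depth parameter and the play() helper
are hypothetical names, and playout is modelled by counting buffered
packets rather than by wall-clock time. A heap gives cheap ordered
inserts, and a packet is counted as late only if it arrives after its
slot has already been played.

```python
import heapq


class JitterBuffer:
    """Toy jitter buffer: ordered inserts, playout once `depth` is exceeded."""

    def __init__(self, depth):
        self.depth = depth      # packets held back before playout starts
        self._heap = []         # min-heap ordered by sequence number
        self._next_seq = 0      # lowest sequence number still playable
        self.late = []          # packets that missed their playout slot

    def push(self, seq):
        if seq < self._next_seq:
            self.late.append(seq)       # arrived after it should have played
        else:
            heapq.heappush(self._heap, seq)

    def pop_ready(self):
        played = []
        while len(self._heap) > self.depth:
            seq = heapq.heappop(self._heap)
            self._next_seq = seq + 1
            played.append(seq)
        return played

    def drain(self):
        return [heapq.heappop(self._heap) for _ in range(len(self._heap))]


def play(arrivals, depth):
    """Feed sequence numbers in arrival order; return (played, late)."""
    buf = JitterBuffer(depth)
    played = []
    for seq in arrivals:
        buf.push(seq)
        played.extend(buf.pop_ready())
    played.extend(buf.drain())
    return played, buf.late


reordered = [0, 2, 1, 3, 5, 4, 6, 7]
deep_played, deep_late = play(reordered, depth=2)
shallow_played, shallow_late = play(reordered, depth=0)
```

With a depth of two packets the reordered arrivals play back in order;
with no buffering the swapped packets miss their slot, which matches the
condition that out-of-order packets must arrive before they are due.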

Indeed, TCP stacks that properly implement RFC 1546 TCP anycast
recommendations will work without problem. The problem is that
"vixie-cast" doesn't conform to RFC 1546.

Also, the reorder problem only causes a performance problem and never
causes incorrect operation. Vixie-cast causes incorrect operation.

    --Dean

So agreeing for a second with Dean that indeed this behaviour would appear to be
prohibited or at least inconsistent with the RFCs, the fact is anycast is widely
deployed and is proven to be stable.

"vixie-cast" is deployed on around 60 or so root DNS servers. (don't know
the exact number) That covers a wide spread of root DNS servers, but I
wouldn't call that 'widely deployed'. I haven't been able to find any
users of HTTP anycast/'vixie-cast' that Patrick Gilmore referred to. There
are also very few TCP DNS queries to the roots, so it isn't widely used at
present, and hasn't been widely used in the past. I don't think it can be
claimed that "vixie-cast" has been proven to be stable. ISC's assertions
of stability at a 2002 NANOG are what probably brought it to Dr.
Bernstein's attention. Those assertions of stability are what's being
challenged. You cannot assume them true.

Perhaps a solution to this is to look at what would be the best consistent view
and to write an RFC to clarify this and obsolete the old ones that produce the
inconsistency. I'm not sure what that would look like but that would appear to
be a way to eliminate the theoretical problem.

Another solution is to urge OS vendors to implement RFC 1546 TCP anycast.
In order to use RFC 1546 TCP anycast, it is necessary to implement changes
in all clients that might access TCP anycast servers (as well as in the
servers). This would probably require a long time frame, but still good to
encourage. It might be easier to require this for IPv6---though I don't
know that it isn't already required for IPv6.

Another solution is not to do Vixie-cast. This may require clarification
to DNS RFCs to specify that TCP queries will not be made to root DNS
servers. It was previously thought that DNSSEC would require TCP, but
this isn't the case in the latest round of RFC drafts. I can't think of
anything else in the pipe that might require TCP DNS queries to root
servers. Non-root servers usually don't need to do anycast, and aren't
required to do TCP. So one could do anycast without TCP, if one wanted.
But one ought to know that anycast'ing DNS precludes TCP DNS.

The "vixie-cast" HTTP doesn't *seem* to be widely in use, and there are
numerous other solutions for HTTP. So simply recommending a halt to that
would seem to be low-impact.

    --Dean

we do this 'http anycast' as part of another service... it seems to work
well enough.

> Date: Sat, 23 Apr 2005 16:13:22 -0400 (EDT)
> From: Dean Anderson

> And it violates RFC 1546, as previously explained.

Who cares? You've railed against SMTP+AUTH because it's not a
"standard". Why do you give a rat's rump about 1546?

Actually, objections to standards in both cases is a consistent position
to have.

But for the record, you misrepresent my SMTP AUTH claims:

I've noted about SMTP AUTH
  that it isn't required (as wrongly claimed),
  that it isn't supported in most mail clients (as wrongly claimed),
  that the MS version isn't draft compliant (as wrongly claimed),
  that it isn't scalable (as wrongly claimed),
  that it doesn't stop spam (as wrongly claimed),
  that it costs more money to operate,
  that it isn't wanted by paying customers.

There's probably more. But this discussion isn't about SMTP AUTH.
Speaking of vindication, do you remember when people (you among them, I
think) told us that if we just did POP-before-SMTP there would be no more
spam? And isn't it strange that open relay abuse dropped to nothing in
2003 just after the open relay blacklists mostly shut? And then only
started back up (lamely) about mid-March of this year as SORBS started
scanning again? And that only open relay blacklists scanned for open
relays? And that only "scanned-by-open-relay-blacklist" relays were
abused? And how many people today would refuse an offer from spammers to
label their spam with an X-spam header (or whatever the IEMCC header was)?
And weren't you among the people who said that the ECPA didn't apply to
email? That anti-trust didn't apply to the internet? That blacklists
weren't subject to laws or courts? That certain blacklists didn't help
spammers? Nevermind, this isn't the discussion for that.

> Well, PPLB isn't the end of the world. But PPLB is coming, and the smart
> people will be prepared for it. The dumb people, well, they're dumb.

As for anycast, there's a fair chance people building anycast clusters
will work around PPLB. Maybe they'll build topologies to avoid
problems. Maybe they'll have behind-the-scenes unicast intelligence to
deal with TCP session transfer.

You really haven't been paying attention: There's no chance of that at
all: It isn't possible to build "vixie-cast" clusters that work around
PPLB. There are no topologies which include diverse paths that avoid
problems.

> What can be expected from dumb people?

Frequent NANOG posting.

There are other symptoms. Like being wrong a lot, or being completely
unable to correctly state someone else's position.

    --Dean

> First of all, let's ditch the term "PPLB." The usual alternative to per
> packet load balancing (what's been being talked about here) is per prefix
> load balancing, which would also be "PPLB." The abbreviation is therefore
> more confusing than anything else.

Err. No, that would be worse. "Per prefix" load balancing is an artifact
of the Cisco route cache. The route engine (ie the route table) isn't
queried for every packet. Instead the route in the route cache is used.
One doesn't configure "per prefix" load balancing. One configures load
balancing, which adds multiple routes into the route table.

Modern Cisco routers do not use a "route cache", they use a fully
populated forwarding table. And load balancing is automatic if you have
several equal cost routes.

The route
cache then causes only one of these routes to be used. On cisco, to
enable PPLB, you turn off the route cache.

Many modern Cisco routers can perform per-packet load balancing without
doing process switching (but this needs to be explicitly configured).

On Juniper, you configure it
to put multiple routes in the route table. It's actually more likely to
happen on Junipers, because unless you configure additional policies, you
get load balancing on divergent links as well as non-divergent links. On

Modern Juniper routers cannot do per-packet load balancing *at all*. It
is correct that the configuration statement says "per-packet", however
it is really per-flow (and this is well documented). See for instance
the description of Internet Processor II ASIC load balancing at

I'm afraid your statements show a certain lack of knowledge about modern
router architectures.

Steinar Haug, Nethelp consulting, sthaug@nethelp.no

[ snip ]

Err. No, that would be worse. "Per prefix" load balancing is an artifact
of the Cisco route cache. The route engine (ie the route table) isn't
queried for every packet.
Instead the route in the route cache is used.

[ snip ]

On Juniper, you configure it
to put multiple routes in the route table. It's actually more likely to
happen on Junipers,

What?

-J

> > First of all, let's ditch the term "PPLB." The usual alternative to per
> > packet load balancing (what's been being talked about here) is per prefix
> > load balancing, which would also be "PPLB." The abbreviation is therefore
> > more confusing than anything else.
>
> Err. No, that would be worse. "Per prefix" load balancing is an artifact
> of the Cisco route cache. The route engine (ie the route table) isn't
> queried for every packet. Instead the route in the route cache is used.
> One doesn't configure "per prefix" load balancing. One configures load
> balancing, which adds multiple routes into the route table.

Modern Cisco routers do not use a "route cache",

You'll need to define what you mean by "modern" with respect to cisco.
This statement seems to be incorrect.

they use a fully populated forwarding table. And load balancing is
automatic if you have several equal cost routes.

This sounds very much like the Juniper description for the Internet
Processor ASIC behavior. I'd say that's worse.

> The route cache then causes only one of these routes to be used. On
> cisco, to enable PPLB, you turn off the route cache.

Many modern Cisco routers can perform per-packet load balancing without
doing process switching (but this needs to be explicitly configured).

Well, 7500 and 7200 have interface processors that can route packets using
the route cache without interrupting the main processor. So, if you don't
consider 7500's and 7200s to be "modern", this feature above doesn't seem
like a big deal: They could do that before. It was called CEF and DCEF.

> On Juniper, you configure it to put multiple routes in the route
> table. It's actually more likely to happen on Junipers, because unless
> you configure additional policies, you get load balancing on divergent
> links as well as non-divergent links. On

Modern Juniper routers cannot do per-packet load balancing *at all*. It
is correct that the configuration statement says "per-packet", however
it is really per-flow (and this is well documented). See for instance
the description of Internet Processor II ASIC load balancing at

http://www.juniper.net/techpubs/software/junos/junos70/swconfig70-policy/html/policy-actions-config11.html#1020787

I don't have Junipers, so I'm just going by what the manual says. And your
link says:

"On routing platforms with an Internet Processor ASIC, when per-packet
load balancing is configured, traffic between routers with multiple paths
is spread using the hash algorithm across the available interfaces. The
forwarding table balances the traffic headed to a destination,
transmitting it in round-robin fashion among the multiple next hops (up to
a maximum of eight equal-cost load-balanced paths). The traffic is
                                                    ^^^^^^^^^^^^^^^^
load-balanced on a per-packet basis."
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

"On routing platforms with the Internet Processor II ASIC, when per-packet
load balancing is configured, traffic between routers with multiple paths
is divided into individual traffic flows (up to a maximum of 16 equal-cost
load-balanced paths). Packets for each individual flow are kept on a
                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
single interface."
^^^^^^^^^^^^^^^^

I would guess that since both processors are described, they are both
still supported, and that probably means that both are widely used.

But, I should qualify that doing PPLB on diverse paths is more likely to
happen on the Internet Processor ASIC. It would seem like the Internet
Processor II ASIC has an architecture more like the cisco 7500s, and only
allows per flow load balancing.

So, I guess it depends on how many people might still be using the
Internet Processor ASIC platforms. And of course, whether this behavior
might be disabled in some future release for the 'II ASIC or if some new
platform might have PPLB on diverse paths.
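
The difference between the two behaviours quoted from the manual can be
sketched in a few lines. This is a rough illustration only, assuming a
simple round-robin selector for the per-packet case and a hash of the
5-tuple for the per-flow case; the interface names, the packet dictionary
layout and the choice of SHA-256 are assumptions made for the example and
say nothing about what either ASIC actually does internally.

```python
import hashlib
from itertools import cycle

NEXT_HOPS = ["ge-0/0/0", "ge-0/0/1"]  # hypothetical equal-cost interfaces


def make_per_packet_balancer(next_hops):
    """Round-robin: each packet takes the next link, regardless of flow."""
    rotation = cycle(next_hops)

    def choose(packet):
        return next(rotation)

    return choose


def make_per_flow_balancer(next_hops):
    """Hash the 5-tuple so every packet of one flow stays on one link."""
    fields = ("src", "dst", "proto", "sport", "dport")

    def choose(packet):
        key = "|".join(str(packet[f]) for f in fields)
        digest = hashlib.sha256(key.encode()).digest()
        return next_hops[int.from_bytes(digest[:4], "big") % len(next_hops)]

    return choose


def links_used(choose, packets):
    """Return the set of links a sequence of packets was spread over."""
    return {choose(p) for p in packets}


# Six packets belonging to a single TCP flow.
flow = [
    {"src": "192.0.2.1", "dst": "198.51.100.7", "proto": 6,
     "sport": 40000, "dport": 53, "seq": i}
    for i in range(6)
]

per_packet_links = links_used(make_per_packet_balancer(NEXT_HOPS), flow)
per_flow_links = links_used(make_per_flow_balancer(NEXT_HOPS), flow)
```

The round-robin selector spreads the one flow over both links, which is
where reordering (or delivery to different anycast hosts) can come from;
the hashed selector keeps the flow on a single link.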

I'm afraid your statements show a certain lack of knowledge about modern
router architectures.

I'm afraid your statements show a certain lack of knowledge about what's
being used in datacenters to route packets. And perhaps some arrogance
about what's "modern". I'd still call cisco 7500 and 7200 series routers
"modern", and they have route caches. I don't know that much about GSRs,
but they didn't seem to get much traction. I can't say whether they have a
route cache or not. And the multi-rack monster router that cisco just
announced a while back doesn't seem to be too popular either. I don't know
its architecture, either. As I look around datacenters, the 7500 and
7200's and to some lesser extent Junipers are the workhorses doing most of
the routing. And that basic technology will stay around in the enterprise
for many more years to come.

But again, note that RFC 1546 gives a rule about internetwork architecture:

   An internetwork has no obligation to deliver two successive packets
   sent to the same anycast address to the same host.

Whether it used to be impossible to utilize this rule, and whether anyone
actually presently uses this rule is irrelevant to the question of what
rules one needs to follow when building anycast systems. RFC 1546 gives
some rules to follow, and they are violated at the peril of the
internetwork.

And "vixie-cast" violates this rule. It imposes the new rule that "an
internetwork must deliver two successive packets sent to the same anycast
address to the same host." And no one has thought much about the
implications of that rule.

Assurances that no one can do PPLB on diverse paths offer no defense to
having violated the design principles given for anycast in RFC 1546.
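
The failure mode being argued about here can be sketched as a toy
simulation. It is deliberately simplified and makes several assumptions:
two anycast instances with independent connection tables, a three-packet
session, and a path_for_packet function (a hypothetical name) that
decides which instance each packet reaches. Whether real networks ever
deliver successive packets of one session to different instances is
exactly what is disputed in this thread; the sketch only shows what
follows if they do.

```python
class AnycastInstance:
    """One server behind the shared anycast address, with its own TCP state."""

    def __init__(self, name):
        self.name = name
        self.connections = set()

    def receive(self, flow, flags):
        if flags == "SYN":
            self.connections.add(flow)
            return "SYN-ACK"
        if flow in self.connections:
            return "ACK"
        # A segment for a connection this host has never seen is reset.
        return "RST"


def run_session(path_for_packet):
    """Send SYN, ACK, DATA; path_for_packet(i) names the receiving instance."""
    instances = {"A": AnycastInstance("A"), "B": AnycastInstance("B")}
    flow = ("192.0.2.1", 40000, "198.51.100.53", 53)  # example addresses
    replies = []
    for i, flags in enumerate(["SYN", "ACK", "DATA"]):
        replies.append(instances[path_for_packet(i)].receive(flow, flags))
    return replies


# Every packet reaches the same instance (per-flow or single-path routing).
stable_replies = run_session(lambda i: "A")

# Successive packets alternate between instances (per-packet, divergent paths).
alternating_replies = run_session(lambda i: "AB"[i % 2])
```

When all packets land on one instance the session proceeds; when they
alternate, the second instance has no state for the connection and
answers with a reset.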

It is also objectionable to call something "TCP anycast" that isn't TCP
anycast according to RFC 1546.

    --Dean

The questions of what various routers do now or did in the past is
irrelevant. So, to wrap it up:

RFC 1546 gives this rule about internetwork architecture on page 5:

   An internetwork has no obligation to deliver two successive packets
   sent to the same anycast address to the same host.

Whether it used to be impossible to utilize this rule, and whether anyone
actually presently uses this rule is irrelevant to the question of what
rules one needs to follow when building anycast systems. RFC 1546 gives
some rules to follow, and they are violated at the peril of the
internetwork.

TCP "vixie-cast" violates this rule. It imposes the new rule that "an
internetwork MUST deliver two successive packets sent to the SAME anycast
address to the SAME host." And no one has thought much about the
implications of that rule, (other than the original architects of RFC
1546). Sure, it sort of happens most of the time with current routers and
current configurations, but load balancing over diverse paths isn't
limited to being slow and per-flow. There are no IETF rules that require
that behavior. Implementors of networks and routers are free to use the
RFC 1546 design rule.

Assurances that, typically, no one can "deliver two successive packets
sent to the same anycast IP address to different hosts" are no defense
for TCP "vixie-cast" having violated the design principles given for
anycast in RFC 1546.

It is also objectionable to call something "TCP anycast" that isn't TCP
anycast according to RFC 1546.

    --Dean

[ snip ]

> Err. No, that would be worse. "Per prefix" load balancing is an artifact
> of the Cisco route cache. The route engine (ie the route table) isn't
> queried for every packet.
> Instead the route in the route cache is used.
[ snip ]
> On Juniper, you configure it
> to put multiple routes in the route table. It's actually more likely to
> happen on Junipers,

What?

This wasn't clear. See section on junipers in my message to sthaug that I
just sent.

> Modern Cisco routers do not use a "route cache",

You'll need to define what you mean by "modern" with respect to cisco.
This statement seems to be incorrect.

For someone with so little clue, you sure do manage to talk a lot. :P I'm
not under any delusion that anything I say here will help you understand
anything, but if it helps some other soul then it may be worth it.

Historically, routers were entirely CPU driven beasts. The most basic and
fully functional type of routing lookup (and also the slowest) was known
as process switching, where the destination address was fully evaluated
against the "master routing table" (RIB) on the CPU. The data structure
used to hold this RIB, one of the easiest algorithms used to implement
"longest prefix match" lookups, is known as a PATRICIA (Practical
Algorithm To Retrieve Information Coded in Alphanumeric) tree.
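
For readers who have not seen a longest prefix match written down, here
is a minimal sketch. Note the swap: it uses a plain, uncompressed binary
trie rather than a true PATRICIA tree (which also collapses one-child
chains), it handles IPv4 only, and BinaryTrieRib and the next-hop labels
are made-up names. It is meant to show the lookup idea, not how any
router stores its RIB.

```python
import ipaddress


class TrieNode:
    __slots__ = ("children", "next_hop")

    def __init__(self):
        self.children = [None, None]   # child for bit 0 and bit 1
        self.next_hop = None           # set only where a prefix ends


class BinaryTrieRib:
    """Uncompressed binary trie keyed on IPv4 address bits."""

    def __init__(self):
        self.root = TrieNode()

    def insert(self, prefix, next_hop):
        net = ipaddress.ip_network(prefix)
        bits = int(net.network_address)
        node = self.root
        for i in range(net.prefixlen):
            bit = (bits >> (31 - i)) & 1
            if node.children[bit] is None:
                node.children[bit] = TrieNode()
            node = node.children[bit]
        node.next_hop = next_hop

    def lookup(self, address):
        """Walk bit by bit, remembering the deepest prefix seen so far."""
        bits = int(ipaddress.ip_address(address))
        node = self.root
        best = node.next_hop
        for i in range(32):
            node = node.children[(bits >> (31 - i)) & 1]
            if node is None:
                break
            if node.next_hop is not None:
                best = node.next_hop
        return best


rib = BinaryTrieRib()
rib.insert("0.0.0.0/0", "default")
rib.insert("10.0.0.0/8", "hop-a")
rib.insert("10.1.0.0/16", "hop-b")
```

The number of nodes visited depends on how deep the matching branch is,
which is the variable-cost property of tree-based RIB lookups.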

Unfortunately, these full RIB lookups were (and still are) a relatively
slow process. In Cisco speak, a "route cache" is any of a variety of
mechanisms that are used to cache routing lookups, using a mechanism which
is "faster" than a full RIB lookup. Some of these mechanisms include the
"fast cache" (where the most popular destinations are stored in a smaller
cache after the first packet is looked up via process switching), "flow
cache" (where individual layer 4 flows are stored), and "cef" or Cisco
Express Forwarding.
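
The demand-populated idea behind the "fast cache" can be sketched as
follows. This is a toy model under stated assumptions, not Cisco's code:
DemandRouteCache and make_slow_lookup are hypothetical names, the slow
path is stood in for by a function that alternates between two
equal-cost links, and the cache is keyed on destination address only.

```python
from itertools import cycle


def make_slow_lookup(equal_cost_next_hops):
    """Stand-in for a full RIB lookup that alternates equal-cost routes."""
    rotation = cycle(equal_cost_next_hops)

    def slow_lookup(dst):
        return next(rotation)

    return slow_lookup


class DemandRouteCache:
    """Cache filled on first use; later packets skip the slow path."""

    def __init__(self, slow_lookup):
        self._slow_lookup = slow_lookup
        self._cache = {}
        self.slow_path_hits = 0

    def forward(self, dst):
        if dst not in self._cache:
            self.slow_path_hits += 1            # first packet: slow path
            self._cache[dst] = self._slow_lookup(dst)
        return self._cache[dst]                 # later packets: cached entry


cache = DemandRouteCache(make_slow_lookup(["link-a", "link-b"]))
first_dst = [cache.forward("198.51.100.7") for _ in range(3)]
second_dst = [cache.forward("203.0.113.9") for _ in range(3)]
```

Only the first packet to each destination pays for the slow lookup, and
because the cached answer is reused, all traffic to one destination sticks
to one of the equal-cost links in this model.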

On modern Cisco routers, CEF is the only thing you will find used to
actually do routing lookups. However, CEF is more of a brand name than a
detailed description of how the routing lookup occurs, and the
implementation varies greatly from platform to platform. The only thing
that CEF really describes across all varieties (other than the fact that
you will soon be experiencing its other meaning, the Customer Enragement
Feature :P) is that the route cache will use a "pre-populated" FIB
(forwarding information base).

A FIB is a data structure which exists solely to do routing lookups, and
is created from a normal RIB. A classic software implementation of a FIB
uses a multi-bit trie (mtrie) to fully map the destination next-hops of
the entire address space. This consumes a little bit of memory (several
megabytes), but gives you a data structure which is fully "pre-populated"
and delivers consistent results. This means that there is no mode where
the first lookup can take longer than the rest, the memory usage does not
increase as you look up more destinations, and the number of memory
accesses is roughly the same for all addresses (vs a patricia tree where
the number of accesses can vary greatly depending on the tree depth).
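
A pre-populated multi-bit trie can be sketched like this. The 8-8-8-8
stride, the MtrieNode class and the build_fib/fib_lookup helpers are
assumptions chosen for readability; real implementations pick their own
strides and memory layouts. The point is that every slot is filled when
the FIB is built from the RIB, so a lookup never falls back to a slower
path and touches at most four nodes for IPv4.

```python
import ipaddress

STRIDE = 8                 # bits consumed per level (assumed 8-8-8-8 layout)
FANOUT = 1 << STRIDE


class MtrieNode:
    def __init__(self, inherited_hop):
        # Each slot is either a next hop (leaf) or a child MtrieNode.
        self.slots = [inherited_hop] * FANOUT


def _insert(node, addr, plen, hop, depth):
    shift = 32 - STRIDE * (depth + 1)
    index = (addr >> shift) & (FANOUT - 1)
    remaining = plen - STRIDE * depth
    if remaining <= STRIDE:
        # Prefix ends in this level: expand it over every slot it covers.
        # Shorter prefixes are inserted first, so no child nodes exist here.
        for i in range(index, index + (1 << (STRIDE - remaining))):
            node.slots[i] = hop
    else:
        child = node.slots[index]
        if not isinstance(child, MtrieNode):
            child = MtrieNode(child)   # new level inherits the covering hop
            node.slots[index] = child
        _insert(child, addr, plen, hop, depth + 1)


def build_fib(routes):
    """Build the FIB from a RIB given as {prefix: next_hop}."""
    root = MtrieNode(None)
    by_length = sorted(
        routes.items(), key=lambda r: ipaddress.ip_network(r[0]).prefixlen
    )
    for prefix, hop in by_length:
        net = ipaddress.ip_network(prefix)
        _insert(root, int(net.network_address), net.prefixlen, hop, 0)
    return root


def fib_lookup(root, address):
    addr = int(ipaddress.ip_address(address))
    node = root
    for depth in range(32 // STRIDE):
        shift = 32 - STRIDE * (depth + 1)
        slot = node.slots[(addr >> shift) & (FANOUT - 1)]
        if not isinstance(slot, MtrieNode):
            return slot
        node = slot
    return None


fib = build_fib({
    "0.0.0.0/0": "default",
    "10.0.0.0/8": "hop-a",
    "10.1.0.0/16": "hop-b",
    "10.1.2.128/25": "hop-c",
})
```

Memory is traded for predictability: each node costs 256 slots whether or
not they differ, but the lookup cost is bounded and the same for the first
packet as for the millionth.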

All of this is historical of course, as "modern" routers do all of their
packet lookups in hardware using designs which look nothing like any of
this. While Cisco does call everything "ip route-cache", in modern routers
the commands are just there for historical compatibility.

> they use a fully populated forwarding table. And load balancing is
> automatic if you have several equal cost routes.

This sounds very much like the Juniper description for the Internet
Processor ASIC behavior. I'd say that's worse.

Load balancing has nothing to do with CEF or a FIB per se; the FIB is just
a good spot to slap multiple next-hops. Juniper's implementation uses a
"pre-populated FIB", the same as everyone else, they just do it using a
tree primitive on an ASIC and controlled by a CPU based switch board.

> > The route cache then causes only one of these routes to be used. On
> > cisco, to enable PPLB, you turn off the route cache.
>
> Many modern Cisco routers can perform per-packet load balancing without
> doing process switching (but this needs to be explicitly configured).

Well, 7500 and 7200 have interface processors that can route packets using
the route cache without interrupting the main processor. So, if you don't
consider 7500's and 7200s to be "modern", this feature above doesn't seem
like a big deal: They could do that before. It was called CEF and DCEF.

7200 most certainly does not have interface processors. 7500 does have
processors on the VIPs that do forwarding lookups in a distributed
fashion, but the same procedure for software forwarding applies; there just
happen to be a few more CPUs floating around. DCEF is just CEF plus
copying the FIB structure to the VIPs. And no I don't think any sane
person would consider 7500s or 7200s to be "modern", even though you can
still make use of them.

> > On Juniper, you configure it to put multiple routes in the route
> > table. It's actually more likely to happen on Junipers, because unless
> > you configure additional policies, you get load balancing on divergent
> > links as well as non-divergent links. On
>
> Modern Juniper routers cannot do per-packet load balancing *at all*. It
> is correct that the configuration statement says "per-packet", however
> it is really per-flow (and this is well documented). See for instance
> the description of Internet Processor II ASIC load balancing at
>
> http://www.juniper.net/techpubs/software/junos/junos70/swconfig70-policy/html/policy-actions-config11.html#1020787

I don't have Junipers, so I'm just going by what the manual says. And your
link says:

...

I would guess that since both processors are described, they are both
still supported, and that probably means that both are widely used.

The original poster is entirely correct. The original Internet Processor
is still supported, but it is about as far from "widely used" as you can
get. If you're still using it you have bigger problems. The IP2 is only
capable of doing per-flow load balancing, which is probably a good thing.

But, I should qualify that doing PPLB on diverse paths is more likely to
happen on the Internet Processor ASIC. It would seem like the Internet
Processor II ASIC has an architecture more like the cisco 7500s, and only
allows per flow load balancing.

The IP vs IP2 has no architecture that is or isn't more like the 7500s.
The IP2 is just a newer version of the ASIC on the switch fabric cards.
Classic Juniper architecture (everything pre M320/T series) actually uses
all centralized routing (and other forwarding/filtering) lookups.

> I'm afraid your statements show a certain lack of knowledge about modern
> router architectures.

I'm afraid your statements show a certain lack of knowledge about what's
being used in datacenters to route packets. And perhaps some arrogance
about what's "modern". I'd still call cisco 7500 and 7200 series routers
"modern", and they have route caches. I don't know that much about GSRs,
but they didn't seem to get much traction. I can't say whether they have a
route cache or not. And the multi-rack monster router that cisco just
announced a while back doesn't seem to be too popular either. I don't know
its architecture, either. As I look around datacenters, the 7500 and
7200's and to some lesser extent Junipers are the workhorses doing most of
the routing. And that basic technology will stay around in the enterprise
for many more years to come.

What ever happened to the days when people who didn't understand something
would sit down, shut up, and listen to the people who did? You my friend
don't even have a basic grasp on the old hardware (which is most assuredly
not modern), let alone a firm grasp, let alone any idea how modern (you
know, that stuff made in the last 5+ years) works. Instead of spouting off
about things which you don't understand, you might do well to listen and
learn. Well, that is if you actually want to learn of course. Maybe you
just want to talk. :)

And "vixie-cast" violates this rule. It imposes the new rule that "an
internetwork must deliver successive packets sent to the same anycast
address to the same host." And no one has thought much about the
implications of that rule.

Assurances that no one can do PPLB on diverse paths offer no defense to
having violated the design principles given for anycast in RFC 1546.

It is also objectionable to call something "TCP anycast" that isn't TCP
anycast according to RFC 1546.

I've successfully managed to tune out this thread until now, so I don't
know what babble I've missed so far, but let me sum it up for you in
no-nonsense terms:

Nothing says that you can't have out of order packets on the Internet
until you are blue in the face. However, it tends to do very nasty
things to the TCP algorithm, which makes it perform poorly. If you are
fine with poorly performing TCP then you go right ahead and re-order your
packets, but I know that I'm not fine with this, nor are my customers or
the vast majority of other Internet users. Therefore, if vendors want to
design a product that end-runs the problem by maintaining packet ordering
when they load balance, good for them.
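To make the "nasty things" concrete, here is a minimal sketch of how reordering alone can trip TCP's fast retransmit. It assumes a simplified receiver that sends one cumulative ACK per arriving segment (no SACK, no delayed ACKs) and the usual threshold of three duplicate ACKs. The function names are hypothetical, and segments are numbered 0, 1, 2, ... rather than by byte offset.

```python
def cumulative_acks(arrival_order):
    """Return the cumulative ACK sent after each arriving segment.

    The ACK value is the next segment number the receiver still needs.
    """
    received = set()
    next_expected = 0
    acks = []
    for seq in arrival_order:
        received.add(seq)
        # Advance past every segment that is now contiguous.
        while next_expected in received:
            next_expected += 1
        acks.append(next_expected)
    return acks


def spurious_fast_retransmits(arrival_order, dupack_threshold=3):
    """Count fast retransmits a sender would fire for this arrival order.

    Nothing is lost in this model, so every retransmit counted is wasted.
    """
    retransmits = 0
    last_ack = None
    dup_count = 0
    for ack in cumulative_acks(arrival_order):
        if ack == last_ack:
            dup_count += 1
            if dup_count == dupack_threshold:
                retransmits += 1
        else:
            last_ack = ack
            dup_count = 0
    return retransmits


in_order = [0, 1, 2, 3, 4, 5]
mild_reorder = [0, 2, 1, 3, 4, 5]   # segment 1 overtaken by one segment
deep_reorder = [0, 2, 3, 4, 1, 5]   # segment 1 overtaken by three segments
```

In this model a segment that is overtaken by one or two others costs nothing, but one that is overtaken by three or more makes the sender retransmit (and, in a real stack, cut its congestion window) even though no packet was lost.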

> > Err. No, that would be worse. "Per prefix" load balancing is an
> > artifact of the Cisco route cache. The route engine (ie the route
> > table) isn't queried for every packet. Instead the route in the
> > route cache is used.
> > One doesn't configure "per prefix" load balancing. One configures
> > load balancing, which adds multiple routes into the route table.
>
> Modern Cisco routers do not use a "route cache",

You'll need to define what you mean by "modern" with respect
to cisco.
This statement seems to be incorrect.

the statement is largely correct -- at least from an operational standpoint.

it is true that IOS still has 'route-cache'-based forwarding and
'flow'-based forwarding schemes (ip route-cache, ip route-cache flow), BUT
given we're talking about internet routing here, you would definitely want to
be using CEF, which isn't a demand-populated cache method.

the distinction between demand-populated forwarding caches and
prepopulated forwarding tables (FIB) is relatively straight-forward, as are
the reasons why the latter is a "good thing"<tm>. of course, hindsight is a
wonderful thing.
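a rough sketch of that distinction, for anyone following along. this is a toy model and not how IOS implements either scheme: the route table, the class names and the linear scan are all made up for illustration, and a real FIB would use a far faster lookup structure.

```python
import ipaddress

# Hypothetical routing table: prefix -> next hop.
ROUTES = {
    ipaddress.ip_network("0.0.0.0/0"): "hop-default",
    ipaddress.ip_network("10.0.0.0/8"): "hop-a",
    ipaddress.ip_network("10.1.0.0/16"): "hop-b",
}


def longest_prefix_match(dst):
    """Full routing-table lookup (the expensive 'slow path' in this model)."""
    addr = ipaddress.ip_address(dst)
    best = max((net for net in ROUTES if addr in net),
               key=lambda net: net.prefixlen)
    return ROUTES[best]


class DemandPopulatedCache:
    """Fast-switching style: an entry exists only after traffic has hit it."""

    def __init__(self):
        self.cache = {}
        self.slow_path_lookups = 0

    def forward(self, dst):
        if dst not in self.cache:
            # First packet to a new destination is punted to the slow path.
            self.slow_path_lookups += 1
            self.cache[dst] = longest_prefix_match(dst)
        return self.cache[dst]


class PrepopulatedTable:
    """CEF style: the whole table is built from the RIB before any traffic."""

    def __init__(self):
        # Longest prefixes first, so the first hit is the best match.
        self.entries = sorted(ROUTES.items(),
                              key=lambda item: item[0].prefixlen,
                              reverse=True)

    def forward(self, dst):
        addr = ipaddress.ip_address(dst)
        for net, next_hop in self.entries:
            if addr in net:
                return next_hop
        raise LookupError(f"no route to {dst}")
```

the point of the toy: with the demand-populated cache, every previously unseen destination costs a slow-path lookup, so traffic to many distinct destinations (which is what internet routing looks like) keeps hitting the slow path. the prepopulated table never does.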

> they use a fully populated forwarding table. And load balancing is
> automatic if you have several equal cost routes.

This sounds very much like the Juniper description for the
Internet Processor ASIC behavior. I'd say that's worse.

umm, no, i'd say it "isn't worse".
i can't speak for how J does it (or what methods they may use for
loadbalancing across distributed forwarding hardware and/or multiple
switch-fabric(s)), but in the case of C, the default (per-prefix)
loadbalancing provides deterministic loadbalancing which won't reorder
packets within the same src/dst tuple (tuple could be L3 or L3+L4-based).
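a small sketch of why hashing on the tuple is deterministic while per-packet balancing is not. this assumes two equal-cost links and uses CRC32 purely as a stand-in hash; real platforms use their own hash functions, and the names below are hypothetical.

```python
import zlib
from itertools import count

LINKS = ["link0", "link1"]  # two equal-cost paths (assumed)


def per_flow_link(src, dst, sport=None, dport=None):
    """Pick a link from a hash of the flow tuple (L3, or L3+L4 if ports given).

    Every packet of the same flow hashes to the same link, so packets
    within a flow cannot overtake each other on parallel links.
    """
    key = f"{src}|{dst}|{sport}|{dport}".encode()
    return LINKS[zlib.crc32(key) % len(LINKS)]


class PerPacketBalancer:
    """Round-robin over the links, ignoring which flow a packet belongs to."""

    def __init__(self):
        self._counter = count()

    def link(self, src, dst, sport=None, dport=None):
        return LINKS[next(self._counter) % len(LINKS)]


flow = ("192.0.2.10", "198.51.100.20", 40000, 80)
per_flow_choices = {per_flow_link(*flow) for _ in range(100)}

balancer = PerPacketBalancer()
per_packet_choices = [balancer.link(*flow) for _ in range(4)]
```

here `per_flow_choices` ends up holding a single link, while `per_packet_choices` alternates between both links for the very same flow, which is where reordering can creep in if the links differ.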

> Many modern Cisco routers can perform per-packet load balancing
> without doing process switching (but this needs to be
> explicitly configured).

Well, 7500 and 7200 have interface processors that can route
packets using the route cache without interrupting the main
processor. So, if you don't consider 7500's and 7200s to be
"modern", this feature above doesn't seem like a big deal:
They could do that before. It was called CEF and DCEF.

umm, what you're saying is largely orthogonal to what Steinar is saying.
distributed versus centralized forwarding is a different topic of
discussion.

you seem familiar with the methods commonly used to gain per-packet
loadbalancing from about 6 years ago. CEF can provide the same
functionality but without 'process-switching'.

I'm afraid your statements show a certain lack of knowledge
about what's being used in datacenters to route packets. And
perhaps some arrogance about what's "modern". I'd still call
cisco 7500 and 7200 series routers "modern", and they have
route caches.

"best practice" would be to use CEF for pre-populated Forwarding Tables
rather than 'fast-switching' methods which use demand-based population
methods.

cheers,

lincoln.

Date: Sat, 30 Apr 2005 00:57:46 -0400 (EDT)
From: Dean Anderson

But for the record, you misrepresent my SMTP AUTH claims:

Someone needs to put down the crackpipe. At least do a Google search or
three to find out what I really say before putting words in my mouth.
e.g., I specifically cited laws and cases that appear to apply to
blacklists... now you claim I stated DNSBLs are exempt? Someone needs
to put down the crackpipe.

See threads like "ORBS (Re: Scanning)" on NANOG, or "Port 25 Email
blocked by ISP" on cobalt-users. You might find major discrepancies
between what you claim I say and what I really say. Someone needs to
put down the crackpipe.

You object to SMTP+AUTH because it isn't standard:

http://www.merit.edu/mail/archives/nanog/199-11/msg00263.html
http://www.merit.edu/mail/archives/nanog/199-11/msg00289.html

You complain that SMTP+AUTH "doesn't scale"... yet viewing open relay
logfiles for abusers scales?! Someone needs to put down the crackpipe.

Yet you cite RFC1546 as the One True Anycast. Is RFC1546 a standard?
What does its first paragraph say, again?

You really haven't been paying attention: There's no chance of that at
all: It isn't possible to build "vixie-cast" clusters that work around
PPLB. There are no topologies which include diverse paths that avoid
problems.

http://www.merit.edu/mail/archives/nanog/msg07220.html

Read what I said. Did I say "vixie-cast" clusters? Did I specify a
particular topology, or suggest choosing topologies that work? Even
when the thread is in plain sight for all to reference, you fail to cite
correctly. Someone needs to put down the crackpipe.

You claim PPLB over widely diverging paths will become increasingly
common. If that actually happened, guess what would happen to unicast
TCP? Guess what would happen to many UDP-based protocols over unicast?

If you believe that PPLB problems are "vixiecast"-specific, I have a
challenge for you: Connect two routers in series with multiple links.
Run PPLB between them, using different latency/jitter/packetloss over
each link. Do this for your production traffic.
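For those without a spare pair of routers, a back-of-the-envelope sketch of that experiment. It assumes strict round-robin across the links, a fixed one-way latency per link, no jitter or loss, and packets sent at a constant interval; the function name and the numbers are made up for illustration.

```python
def arrival_order(num_packets, link_latency_ms, send_interval_ms=1.0):
    """Return packet sequence numbers in the order the far router sees them.

    Packet i is sent at i * send_interval_ms on link (i mod number_of_links)
    and arrives after that link's one-way latency.
    """
    arrivals = []
    for seq in range(num_packets):
        link = seq % len(link_latency_ms)
        arrive_at = seq * send_interval_ms + link_latency_ms[link]
        arrivals.append((arrive_at, seq))
    arrivals.sort()  # earliest arrival first; ties broken by sequence number
    return [seq for _, seq in arrivals]


matched_links = arrival_order(8, [10.0, 10.0])     # identical circuits
mismatched_links = arrival_order(8, [10.0, 15.0])  # 5 ms latency difference
```

With identical links the packets arrive in the order they were sent. With a latency gap larger than the send interval, packets on the faster link overtake those on the slower one, and this happens to plain unicast traffic with no anycast anywhere in the picture.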

> > What can be expected from dumb people?
>
> Frequent NANOG posting.

There are other symptoms. Like being wrong a lot, or being completely
unable to correctly state someone else's position.

You've done a fantastic job of demonstrating both. As much as I'd love
to have another protracted flamefest with you, NANOG is hardly the
place, and I'm putting more priority on real work. Maybe one day my
income will be proportional to how many characters I stuff in NANOG
readers' mailboxes, but right now it's based on providing services.

If a sane discussion of anycast is on-topic[1], I'll join. Barring
that, I'm done posting to this thread.

[1] Moderators? Is that operational enough, or too far in the
    "research" realm?

Eddy

You agreed with me on something? I must have missed that at the time. I'm
*sure* I would have made a note of that.

    --Dean

I would say there are laws that apply to blacklists too; just not the same
laws that you cite. But IANAL, and neither are you.