multihoming without BGP

Of course, the downsides of using the interface-default hack are:

1) it does not guarantee shortest path for the packets (unless someone
has hacked together an lbnamed version that talks to gated and sees
which interface has a shorter path to customer <x> based on number of
AS hops before it answers the DNS query).

Shorter AS paths are a silly way to choose a path for a connection. *IF*
BGP carried all the end-to-end bandwidth and delay stuff that EIGRP does,
it might be possible to make this decision intelligently. But only if all
nets described by a routing element were internally homogeneous -- that is,
only one exit gateway rather than different exit gateways in each region.
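For concreteness, the "bandwidth and delay stuff" EIGRP carries boils down to a composite metric along these lines (a sketch using EIGRP's default K-values, K1=K3=1 and the rest zero; BGP carries nothing comparable):

```python
# Sketch of the EIGRP-style composite metric (default K-values only).
# The slowest hop's bandwidth dominates, and link delays add up.
def eigrp_metric(min_bandwidth_kbps, total_delay_usec):
    """Classic EIGRP composite metric with K1=K3=1, K2=K4=K5=0."""
    bw_term = 10**7 // min_bandwidth_kbps   # scaled inverse of slowest link
    delay_term = total_delay_usec // 10     # EIGRP counts delay in tens of usec
    return 256 * (bw_term + delay_term)
```

A T1 path (1544 kbps, 20 ms of delay) comes out to the textbook value 2169856; an AS-path length collapses all of that into a single hop count.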

None of these requirements hold in the case of BGP. While BGP is a fine
way to route packets, it's a horrid way to select paths for connections.

The right answer, as I said when I first described this to the NANOG list
a while back, is in an upcoming product. "ifdefault" is the free part of
the idea and it's something I hope to see system vendors supporting.

2) It uses a separate address for each interface (not important for a
single box, but a room full of boxes, say, 50 of them, 3-way homed at
a single site... hmm, that's 100 extra addresses you didn't want to
use). I suspect that upstream providers will not be thrilled to hand
out more address space if they discover it is being put to such
inefficient use.

I don't think so. If you are using PA space, then the fact that you might
have to burn 3X as much PA space isn't going to bother any particular P in
the P=3 in your supposition. Listen up folks -- if you can't get routable
PI space, you have to make do with what you CAN get.

But there's no guarantee that you need separate addresses per home page.
If you don't count Lynx or Mosaic as part of your target audience, then you
can depend on the "Host:" keyword sent in queries by *all* modern browsers.

But if you do need to support old Lynx and Mosaic, you can assign all 100
PA's as virtual interfaces on a single "ifdefault" machine.
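The Host:-based scheme amounts to a table lookup at request time; here is a minimal sketch (site names and document roots are invented for illustration):

```python
# Sketch: name-based virtual hosting keyed on the Host: header, so many
# sites can share one IP address. Names and paths here are made up.
def pick_docroot(request_headers, vhosts, default_root):
    """Return the document root for the Host: a client sent.

    Ancient clients (early Lynx/Mosaic) send no Host: header at all,
    so they fall through to a default site.
    """
    host = request_headers.get("host", "").split(":")[0].lower()
    return vhosts.get(host, default_root)

vhosts = {
    "www.example.com": "/www/example",
    "www.example.net": "/www/example-net",
}

print(pick_docroot({"host": "www.example.net"}, vhosts, "/www/default"))
print(pick_docroot({}, vhosts, "/www/default"))  # no Host: -> default site
```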

Remember that the machine with the "ifdefault" hack just runs a squid cache
in accelerator mode. Your web servers are all highly custom and probably
very fragile; you should leave them alone. The "ifdefault" box is just a
front end -- it becomes, or adds a hop before, your exit gateway.

3) I have not looked at the code, but if it is on a per-interface
basis, based on the addresses in the packets, that would seem to
suggest that it might not like BSDI 3.0's virtual host scheme (adding
IP addresses to the loopback port and then proxy-arping them onto the
wire). If this is correct, that would mean you would have to use a
different physical machine for each customer. Of course, on this
point I'm purely speculating.

Indeed you are, sir! The interfaces that matter are the uplink ones, not
the downlink ones. A SYN packet comes in on some interface, and what
"ifdefault" is trying to do is make sure your SYN-ACK goes out to the exit
gateway that's reachable via that same interface. The local end of the
TCP connection is bound to a local socket; we're just trying to get the
"remote" end of each TCP connection bound to a reasonable upstream gateway
rather than having to use a single system-wide default or run full BGP.
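The behavior just described can be sketched as a per-interface default table (a user-space illustration, not the actual kernel code; interface and gateway names are invented):

```python
# Sketch of the "ifdefault" idea: remember which interface a SYN arrived
# on, and answer via the default gateway reachable through that same
# interface instead of one system-wide default route. Interface and
# gateway names here are made up; the real hack lives in the kernel.
PER_INTERFACE_DEFAULT = {
    "de0": "192.0.2.1",     # uplink to provider A
    "de1": "198.51.100.1",  # uplink to provider B
    "de2": "203.0.113.1",   # uplink to provider C
}

def reply_gateway(arrival_interface, system_default="192.0.2.1"):
    """Next hop for the SYN-ACK (and the rest of the connection),
    given the interface the initial SYN came in on."""
    return PER_INTERFACE_DEFAULT.get(arrival_interface, system_default)
```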

4) It puts the onus for fail-over on the DNS server, which means one
is going to be using very short TTLs.

People who multihome do that anyway.

5) Unless (#1), (#4) implies that fail-over will be manual. Is your
Emacs ready to rock and roll on 50 zone files?

No, it isn't, but there are four or five packages in the /contrib subdir of
BIND that can robohack your zone files to this end.
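The robohacking those packages do amounts to rewriting A records and letting the short TTLs carry the change; a toy sketch (addresses and record layout invented, not taken from any particular contrib tool):

```python
import re

# Sketch of DNS-driven failover: rewrite A records that point at a dead
# address so they point at a surviving one. With very short TTLs, pushing
# the rewritten zone IS the failover. Addresses here are made up.
def fail_over(zone_text, dead_addr, live_addr):
    """Replace every 'IN A <dead_addr>' record with the live address."""
    pattern = r"(\bIN\s+A\s+)" + re.escape(dead_addr) + r"\b"
    return re.sub(pattern, lambda m: m.group(1) + live_addr, zone_text)

zone = "www  300  IN A 192.0.2.10\nftp  300  IN A 192.0.2.11\n"
print(fail_over(zone, "192.0.2.10", "198.51.100.10"))
```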

I admire Paul's hack; it is spiffy for what it is, but I would hardly
promulgate it as an advised way to multihome without using BGP.

But, but... I *DID* it. I didn't just write the code (actually I didn't
write much of the code, Ted Lemon wrote most of it) -- I ran this stuff on
a high volume pornography site for three months and the credit card
transaction dollar-o-meter was never as busy or as steady, before or since.

I know it sounds hacky. But so did Ethernet's exponential backoff. The
thing that makes this hack work is counterintuitive, but the success is
measurable.
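For the record, the other "hacky but it works" example looks like this (a sketch of 802.3's truncated binary exponential backoff, not any particular driver's code):

```python
import random

# Sketch of Ethernet's truncated binary exponential backoff: after the
# n-th collision, wait a random number of slot times drawn uniformly
# from [0, 2^min(n, 10) - 1]. The truncation at 10 is per 802.3.
def backoff_slots(attempt, rng=random.randrange):
    """Slot times to wait after the attempt-th collision."""
    exponent = min(attempt, 10)
    return rng(2 ** exponent)
```

Nothing about drawing a random wait sounds principled, yet it is exactly what lets a shared medium recover from collisions.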

That's two folks who have come out today and said "well that's no damn good"
without trying it. I'm surprised; NANOG members usually have a more
positive attitude.

Would it make you feel better if I loaded the code before pointing out
apparent shortcomings?

                                        ---Rob

In general, I agree with what you have been saying these last days, but try
to refrain from me-too-isms. But ...

While BGP is a fine way to route packets, it's a horrid way to select
paths for connections.

Either this statement is confused, I am, or both. BGP is one way to get
data into forwarding tables so that forwarding engines can route packets.
As you go on to knock BGP for how it makes path decisions, the above
sentence becomes indigestible.

But anyway, the underlying problem is that BGP concentrates on policy, while
good IGPs concentrate on efficient use of paths. An underlying assumption
may have been that ASx can/should not know the internals of ASy.

There have been proposals for BGP modifications and for other EGPs which
address the need for more path optimization in EGPs. But this is NANOG, not
IDR.

That's two folks who have come out today and said "well that's no damn
good" without trying it. I'm surprised; NANOG members usually have a more
positive attitude.

Do you subscribe to a different NANOG list than I?

randy

Paul A Vixie wrote:

But there's no guarantee that you need separate addresses per home page.
If you don't count Lynx or Mosaic as part of your target audience, then you
can depend on the "Host:" keyword sent in queries by *all* modern browsers.

I had to jump in here with a correction and clarification. We
are using Netscape Enterprise Server 2.0. The 'software virtual
server' feature doesn't work with any version of Netscape Navigator
up to and including Communicator Preview 5. The server sees an
HTTP 1.0 request, not an HTTP 1.1 request. In my logs I do see some
webcrawlers sending HTTP 1.1, so it isn't completely a server issue.
Netscape Enterprise 3.0 is coming out RSN but I don't know that it will
be any better. If it is I'll speak up here.

The result is we are, on Solaris 2.5.1, burning an IP address for each
website on the host system rather than using CNAME and software virtual
servers. It also results in a server process running for each website.

As for Lynx, one cannot cavalierly dismiss it. A current Lynx has
to be supported for at least the blind/visually impaired users.
I know one personally.

This does not belong on NANOG, but to set the facts straight: if Netscape
Enterprise 2.0 supports Host: based virtual hosts but only for hosts that
make HTTP/1.1 requests that is lame. Versions of Navigator have sent
Host: headers for some time; so has MSIE. Last I knew, AOL's proxies
didn't yet because they were too lazy. The vast majority of clients do
send Host: headers even though they aren't HTTP/1.1. HTTP/1.1 clients
_have_ to send the Host: header, but HTTP/1.0 clients may and most do.
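The rule just stated fits in a few lines (a sketch of the check, not any particular server's code):

```python
# Sketch of the Host: header rule: an HTTP/1.1 request without Host: must
# be rejected; an HTTP/1.0 request may omit it, though most 1.0 clients
# send it anyway.
def host_header_ok(http_version, headers):
    """True if the request's Host: header situation is acceptable."""
    has_host = any(name.lower() == "host" for name in headers)
    return has_host or http_version != "HTTP/1.1"
```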

I do not advocate using them yet for most websites, but that is another
issue.