Cache-as-cache-can

To cut a long story short, I was just wondering if people could share
their feelings regarding commercial Web cache solutions, in terms of the
good, the bad and the ugly.

At the moment I'm left with two 'goods', Network Appliance's NetCache and
Inktomi's Traffic Server, and would appreciate some input to sort them out.

-Michael

In article <19981116200208.A27675@magnet.at>,
  Michael Haba <m.haba@magnet.at> wrote;

} To cut a long story short, I was just wondering if people could share
} their feelings regarding commercial Web cache solutions, in terms of the
} good, the bad and the ugly.
}
} At the moment I'm left with two 'goods', Network Appliance's NetCache and
} Inktomi's Traffic Server, and would appreciate some input to sort them out.

I don't think there is a major difference in caching, but some
appliances have problems with their routing. NetCache learns its
routes from ICMP redirects or RIPv1, which is quite poor for a
redundant network topology. I don't know CacheFlow very well, but
it seems to speak RIPv1 at most. I believe they should speak OSPF
at least if they are to run in a large ISP. As for Inktomi, there
may not be any routing problem, since it runs on standard Unix
boxes, but it cannot handle as many simultaneous connections as
NetCache or CacheFlow.

We performed a number of tests with nearly all of the vendors'
products. Considering that most of those vendors are represented on this
list, I do not want to get into a p*ssing contest on a public list over who
has the best cache.

We found a number of items important to consider with Web caching.

1) Accuracy. If the served page is not the requested page, your
customers will let you know.

2) Transparency. As long as your customers don't notice it's there,
everything is okay. When a Web cache hangs and black-holes your Web
traffic, that's a bad day. There are some layer 4 switches in the
marketplace (Foundry and Alteon) that redirect Web requests and run port
80 keepalives.
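For what it's worth, the port 80 keepalive those switches run amounts to something like the following sketch (Python for illustration only; the host names and probe logic are my assumptions, not any vendor's implementation):

```python
import socket

def cache_alive(host, port=80, timeout=2.0):
    """Keepalive probe: the cache counts as healthy only if it
    accepts a TCP connection on port 80 within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

def next_hop(cache_host, port=80):
    """Redirect port-80 flows to the cache only while it answers;
    otherwise fall through to normal routing instead of black-holing."""
    if cache_alive(cache_host, port):
        return "redirect-to-cache"
    return "forward-to-origin"
```

A real switch does this per configured cache in its forwarding logic, but the failure behavior is the point: a dead cache gets routed around, not black-holed.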

3) Availability. You don't want a sub-terabyte Web cache subject to a single
disk failure. Layer 4 switches also take care of your Web cache cluster,
keeping flows and content contiguous.

4) Performance. There are a number of factors to consider, including
maximum sessions, access speed and a few others. The bottom line is that
Web performance is a perceived service and is often subjective.

5) Also realize that Web caches do interesting things in switched
networks where 60-75% of your traffic belongs to a single IP address.
Equal-cost IGP metrics avail you nothing when every flow hashes to the
same destination.
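To illustrate point 5 (with made-up addresses, and a CRC standing in for whatever hash a real router uses): per-destination equal-cost load sharing collapses once a transparent cache becomes the destination for most flows:

```python
import zlib
from collections import Counter

def pick_link(dst_ip, n_links):
    """Per-destination load sharing: choose the equal-cost link by
    hashing the destination address only."""
    return zlib.crc32(dst_ip.encode()) % n_links

# Many distinct destinations: flows spread across the equal-cost links.
spread = Counter(pick_link("10.0.%d.1" % i, 2) for i in range(256))

# With a transparent cache, the bulk of the traffic shares one
# destination IP, so it all lands on a single link, whatever the metrics.
cache_flows = Counter(pick_link("192.0.2.10", 2) for _ in range(256))
```

`cache_flows` always has exactly one entry: every flow to the cache picks the same link, which is exactly why equal-cost metrics avail nothing here.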

6) The goal of Web caching is not to reduce line utilization. We found
some cases where line utilization maxed out when we added the Web cache.
Having a 100Mbps box proxying all Web traffic can seriously expand TCP
windows, as opposed to a typical modem, which introduces serialization delay.
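Rough bandwidth-delay arithmetic makes point 6 concrete (the link speeds and RTTs below are illustrative assumptions, not measurements from the post):

```python
def bdp_bytes(bandwidth_bps, rtt_s):
    """Bandwidth-delay product: the TCP window needed to fill the pipe."""
    return bandwidth_bps / 8 * rtt_s

# A 28.8kbps modem path: serialization delay paces the sender, so the
# window TCP can usefully open stays tiny even on a 500ms overseas RTT.
modem = bdp_bytes(28_800, 0.5)          # 1800 bytes

# A 100Mbps proxy answering over a 10ms local path can run a window
# nearly two orders of magnitude larger, and fill the line accordingly.
proxy = bdp_bytes(100_000_000, 0.010)   # 125000 bytes
```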

Our goal was to give customers better international Web performance
while also making the most intelligent use of the international resources
by removing redundant information. Interestingly enough, we have also
seen TCP retransmissions on the core drop from 20% to less than 2%.

-eric

Why use things like this? Use a default route to an HSRP address ...

/Jesper

In article <19981117091400.D9778@skriver.dk>,
  Jesper Skriver <jesper@skriver.dk> wrote;

} Why use things like this? Use a default route to an HSRP address ...

That could be. But they (the appliance vendors) haven't shown
it at this point, AFAIK. They simply show a view of a cache
appliance and an L4 switch sitting between 2 routers. Why?

Eric Dean <edean@gip.net> writes:

} Our goal was to give customers better international Web performance
} while also making the most intelligent use of the international resources
} by removing redundant information. Interestingly enough, we have also
} seen TCP retransmissions on the core drop from 20% to less than 2%.

This point is very, very important in most international ISP situations.
The problem comes, in our experience, from customers with routinely
saturated lines. The effect of "thousands of nibbling mice" (to
paraphrase Sean Doran), meaning lots and lots of concurrent sessions from
dialup or slow LAN users, means that normal TCP backoff doesn't work well.
We can add CAR on the interface to maximize utilization, and save our
router buffers, but the drops and retransmissions are still there.
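Back-of-the-envelope numbers (all assumed for illustration) show why backoff can't rescue such a line: a TCP sender can't go much below one MSS per RTT, and enough concurrent mice at that floor still oversubscribe the customer's line:

```python
def min_rate_bps(mss_bytes, rtt_s):
    """Rough floor on a TCP sender's rate: about one MSS per RTT is
    the least a backed-off (non-timed-out) sender keeps offering."""
    return mss_bytes * 8 / rtt_s

# 500 concurrent dialup sessions, 536-byte MSS, 400ms RTT:
aggregate = 500 * min_rate_bps(536, 0.4)   # 5.36 Mbps offered load

# into, say, a 2Mbps E1 customer line: still ~2.7x oversubscribed,
# so packets keep dropping no matter how far the senders back off.
overload = aggregate / 2_000_000
```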

As expected, packets are lost at our border router facing such customers.
However, for the data coming in from an international source (on an
unsaturated international link), the retransmissions cause double traffic
on the international line. Adding the cache effectively displaces the
source of retransmissions from the (overseas) origin web server to the
(local) cache server.

We see traffic reductions due to both the caching effect, which can
be significant, and due to the displacement in retransmissions. The
customer still sees the same level of packet loss, since his line is
still overloaded, but that traffic is served from the local cache and
thus does not need to come over the expensive international link.

-jem

See http://www.data.com/issue/981107/crisis.html to learn more about
Dacom BORANet

Jesper Skriver wrote:

> In article <19981116200208.A27675@magnet.at>,
> Michael Haba <m.haba@magnet.at> wrote;
>
> } To cut a long story short, I was just wondering if people could share
> } their feelings regarding commercial Web cache solutions, in terms of the
> } good, the bad and the ugly.
> }

> Why use things like this? Use a default route to an HSRP address ...

Because you move your single point of failure back to the cache ethernet
interface. Not a great tragedy if you've got a decent keep-alive from your L4
switch, but you lose all caching. Many of the appliances have dual ethers, but
none (to the best of my knowledge) has implemented a failover. My preference is
for load balancing between two ethers, with failover on fault detection, but
for the moment I'd be really happy with a simple failover.

Note that this does not apply to the platform-based systems (e.g. Inktomi).

daniel rothman

A related issue is correctness. From what I have seen of a product from a
certain large router vendor, it does really stupid things: it does not
allow any persistent connections, and it breaks HTTP/1.1 chunked responses
(well, actually worse: it caches the chunked response, so HTTP/1.0 clients
that get the cached response see garbage) because the implementors are too
(lazy|dumb|rushed|etc.) to read an RFC.
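For the curious, the chunked bug comes down to this: an HTTP/1.1 chunked body carries hex chunk-size lines that an HTTP/1.0 client cannot interpret, so a cache has to decode the framing before storing or serving the entity. A minimal, error-handling-free sketch of that decoding:

```python
def dechunk(body):
    """Decode an HTTP/1.1 chunked transfer-coded body into the plain
    entity an HTTP/1.0 client expects. Trailers are ignored here."""
    out, pos = b"", 0
    while True:
        crlf = body.index(b"\r\n", pos)
        size = int(body[pos:crlf].split(b";")[0], 16)  # chunk-size[;ext]
        if size == 0:
            return out                   # last-chunk reached
        start = crlf + 2
        out += body[start:start + size]  # chunk-data
        pos = start + size + 2           # skip data plus its CRLF

raw = b"5\r\nHello\r\n7\r\n, world\r\n0\r\n\r\n"
assert dechunk(raw) == b"Hello, world"  # what a 1.0 client should get
```

Serving `raw` verbatim to an HTTP/1.0 client, size lines and all, is exactly the garbage described above.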

You need to be _VERY_ careful when evaluating transparent proxies, to see
whether the implementor actually knows anything at all about HTTP.
Unfortunately, this can require that you have a better-than-average
knowledge of HTTP to begin with.

I just really have trouble with "transparent proxies". The concept is bad
(magically messing with traffic that may not even be HTTP, but may simply
be sent over port 80 to get through filters, without the user being able
to do a thing about it), and the implementations that I have seen are bad.
The real long-term solution is to provide better mechanisms by which
clients can automatically use a proxy when they should.