Digex transparent proxying

Cool, Karl. Then you would be denying your customer access
to a market -- a potentially large market -- without telling them.

Even "views" that can only be counted indirectly are good for brand managers
and marketers, hence the popularity of such things as outdoor advertising.

Ironic. Where can they sue?

  Sean.

> Cool, Karl. Then you would be denying your customer access
> to a market -- a potentially large market -- without telling them.

Excuse me?

My customer is the one who would be putting such a thing on the web
server that we host for them.

> So now people don't get to choose who can and cannot see their content?

Excuse me once again?

The person denying their customer access is the one putting these caches in
place and giving their customer no choice to go around them by STEALING
their packet flows.

The proper response to that is for the people who have the right to determine
how, and by whom, their content is viewed, to deny those people access to that
content unless they can determine who is viewing the content, how often it is
being viewed, and that the content being viewed by those people is actually
correct and up-to-date because it is coming directly from their servers.

Let the market sort it out.
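
(For concreteness, a minimal sketch of what "denying access" could look like
at the origin, assuming Python's standard library; the header values are
standard HTTP/1.1, but the server itself is hypothetical, not anyone's
actual setup:)

    # Sketch: an origin that refuses to let any shared cache hold its content,
    # so every view has to come from, and be counted by, the origin itself.
    from http.server import BaseHTTPRequestHandler, HTTPServer

    class NoCacheHandler(BaseHTTPRequestHandler):
        def do_GET(self):
            body = b"fresh content, served direct"
            self.send_response(200)
            # Forbid shared caches from storing or reusing this response.
            self.send_header("Cache-Control",
                             "private, no-cache, no-store, must-revalidate")
            self.send_header("Pragma", "no-cache")   # for HTTP/1.0 caches
            self.send_header("Expires", "0")         # already-expired fallback
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)                   # every view hits this log

    HTTPServer(("", 8080), NoCacheHandler).serve_forever()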

> Even "views" that can only be counted indirectly are good for brand managers
> and marketers, hence the popularity of such things as outdoor advertising.

Not if you can't count them at all! A transparent proxy cache reports
nothing back to the originating site, ergo, those "views" are lost and
never reported, even by inference.

Outdoor signage is in fact reportable, because you can survey the traffic
going past a given point and get the count that way.

> Ironic. Where can they sue?
>
>   Sean.

You, for being stupid beyond words.

Oh, I forgot - being stupid and twisting people's words is now considered
a protected class in the United States.

Go take your Lithium Sean, you forgot your pill this morning.

> The proper response to that is for the people who have the right to determine
> how, and by whom, their content is viewed, to deny those people access to that
> content unless they can determine who is viewing the content, how often it is
> being viewed, and that the content being viewed by those people is actually
> correct and up-to-date because it is coming directly from their servers.

If the web-designer "understands" how caching actually works, then this and
the other issues you raise are not really issues, Karl. HTTP Cache-Control
headers work wonders when actually used. Caching and proxying are out
there and being actively used, transparent or not - it's simply how
it is - and a web designer should guarantee their stats, and the validity and
freshness of their data, by using HTTP headers correctly.
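
(A minimal sketch of the headers in question, using Python's standard
library; the one-hour lifetime is purely illustrative:)

    # Sketch: declaring explicit freshness so caches and proxies know exactly
    # how long a copy stays valid. Header names are standard HTTP/1.1.
    import time
    from email.utils import formatdate

    def freshness_headers(max_age=3600):
        """Headers telling any cache it may reuse this for max_age seconds."""
        now = time.time()
        return {
            "Cache-Control": "public, max-age=%d" % max_age,
            "Expires": formatdate(now + max_age, usegmt=True),
            "Last-Modified": formatdate(now, usegmt=True),
        }

    for name, value in freshness_headers().items():
        print("%s: %s" % (name, value))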

> Not if you can't count them at all! A transparent proxy cache reports
> nothing back to the originating site, ergo, those "views" are lost and
> never reported, even by inference.

Why would you want to rely on the proxy for accuracy - would you bill
advertisers by someone else's accounting methods? No - you would take
steps and measures to ensure that yours were not circumvented by a cache
or proxy. Usually that means you talk to your content provider and make
sure they are parsing your meta tags on the server correctly, so that some
of your content will be dynamic to any cache or proxy that it will
encounter on the way to any end user on the planet.
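
(A sketch of the server-side step being described - meta tags lifted out of
the HTML and promoted to real HTTP headers before the page leaves the server;
the regex and the sample page are hypothetical:)

    # Sketch: promote <meta http-equiv=...> tags to real HTTP headers, since
    # caches look at headers, not HTML. A real server would do this as it
    # serves the page; here it is just a function.
    import re

    META = re.compile(r'<meta\s+http-equiv="([^"]+)"\s+content="([^"]+)"', re.I)

    def headers_from_html(html):
        return {name: value for name, value in META.findall(html)}

    page = '<meta http-equiv="Expires" content="Tue, 01 Dec 1998 00:00:00 GMT">'
    print(headers_from_html(page))
    # -> {'Expires': 'Tue, 01 Dec 1998 00:00:00 GMT'}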

> You, for being stupid beyond words.
>
> Oh, I forgot - being stupid and twisting people's words is now considered
> a protected class in the United States.
>
> Go take your Lithium Sean, you forgot your pill this morning.

C'mon, Karl, can't we all just get along?

> The proper response to that is for the people who have the right to determine
> how, and by whom, their content is viewed, to deny those people access to that
> content unless they can determine who is viewing the content, how often it is
> being viewed, and that the content being viewed by those people is actually
> correct and up-to-date because it is coming directly from their servers.

> If the web-designer "understands" how caching actually works, then this and
> the other issues you raise are not really issues, Karl. HTTP Cache-Control
> headers work wonders when actually used. Caching and proxying are out
> there and being actively used, transparent or not - it's simply how
> it is - and a web designer should guarantee their stats, and the validity and
> freshness of their data, by using HTTP headers correctly.

And as soon as people doing advertising actually do this, then the proxy
becomes less useful, leading proxy owners to ignore the headers so that their
multi-thousand-dollar investments in these things are not wasted and
actually HURT performance (performance for the FIRST fetch through a proxy
is SLOWER - it HAS TO BE, since the proxy must first get the data before it
can pass it on).

> Not if you can't count them at all! A transparent proxy cache reports
> nothing back to the originating site, ergo, those "views" are lost and
> never reported, even by inference.

> Why would you want to rely on the proxy for accuracy - would you bill
> advertisers by someone else's accounting methods? No - you would take
> steps and measures to ensure that yours were not circumvented by a cache
> or proxy. Usually that means you talk to your content provider and make
> sure they are parsing your meta tags on the server correctly, so that some
> of your content will be dynamic to any cache or proxy that it will
> encounter on the way to any end user on the planet.

And how do you guarantee that the proxy server is parsing the tags and not
ignoring them?

See, that's the problem.

Proxies are fine WHERE CUSTOMERS HAVE AGREED TO THEIR USE.

STEALING someone's packet flow to force it through a proxy is NOT fine.

> And as soon as people doing advertising actually do this, then the proxy
> becomes less useful, leading proxy owners to ignore the headers so that their
> multi-thousand-dollar investments in these things are not wasted and
> actually HURT performance (performance for the FIRST fetch through a proxy
> is SLOWER - it HAS TO BE, since the proxy must first get the data before it
> can pass it on).

If a proxy owner ignores Expires headers then, as I said, he/she/it had
better understand what is going on and what he/she/it is doing - they are
potentially causing harm to their end users. Data on most caches is fed
through as it comes in - in other words, as soon as it has the item it
writes it to disk and serves it to the end user. There will more than
likely be *some* added latency to the transaction (but we're talking
milliseconds for normal transactions). However, subsequent fetches from
the cache for that data will be considerably quicker in quite a few
instances.
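
(A sketch of that "feed as it comes in" behaviour - each chunk goes to the
client the moment it arrives from the origin, with a copy teed to disk for
the next requester; the names and chunk size are made up:)

    # Sketch: stream origin -> client chunk by chunk, teeing each chunk into
    # the cache file, so the end user waits per-chunk rather than per-object.
    CHUNK = 8192

    def relay(origin, client, cache_file):
        """Copy origin to client, writing a cache copy in parallel."""
        while True:
            chunk = origin.read(CHUNK)
            if not chunk:
                break
            client.write(chunk)       # first requester sees data immediately
            cache_file.write(chunk)   # later requesters are served from here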

> And how do you guarantee that the proxy server is parsing the tags and not
> ignoring them?

Hold on, hit the brakes there Karl ma boy - I DID NOT SAY that the proxy
server will parse the tags - most do NOT parse the tags - that would slow
them down to a crawl while they waste valuable resources parsing HTML.
What I said is to make sure that your content provider (who is serving
your - the site designer's - site) parses the HTML on the SERVER, so that
the cache/proxy will see it as an appropriate HTTP header - then you have
no problem.

> See, that's the problem.

Nope see above...

> Proxies are fine WHERE CUSTOMERS HAVE AGREED TO THEIR USE.

Yup, you have to agree to use a proxy (it requires you to set it in your
browser); a transparent cache is another story - and IMHO TRANSPARENT
caches have their place closer to the end user - they can be a problem if
placed too far up the ladder.

> STEALING someone's packet flow to force it through a proxy is NOT fine.

huh? NEUMAN! you got me there!!!

Karl Denninger writes:

> (performance for the FIRST fetch through a proxy
> is SLOWER - it HAS TO BE, since the proxy must first get the data before it
> can pass it on).

It doesn't have to -- think cut-thru switching. I've no idea if
any/enough of the proxies do this.

> And as soon as people doing advertising actually do this, then the proxy
> becomes less useful, leading proxy owners to ignore the headers so that their
> multi-thousand-dollar investments in these things are not wasted and
> actually HURT performance (performance for the FIRST fetch through a proxy
> is SLOWER - it HAS TO BE, since the proxy must first get the data before it
> can pass it on).

> If a proxy owner ignores Expires headers then, as I said, he/she/it had
> better understand what is going on and what he/she/it is doing - they are
> potentially causing harm to their end users.

A proxy owner who doesn't ignore them is going to see less and less impact
as time goes on, because less and less web content is both static and not
frequently updated.

> And how do you guarantee that the proxy server is parsing the tags and not
> ignoring them?

> Hold on, hit the brakes there Karl ma boy - I DID NOT SAY that the proxy
> server will parse the tags - most do NOT parse the tags - that would slow
> them down to a crawl while they waste valuable resources parsing HTML.
> What I said is to make sure that your content provider (who is serving
> your - the site designer's - site) parses the HTML on the SERVER, so that
> the cache/proxy will see it as an appropriate HTTP header - then you have
> no problem.

My language was imprecise.

What I was referring to was the headers indicating whether or not content
can be cached, and also what the expiration time is.

Proxy caches are only useful to the extent that a reasonably significant
amount of traffic IS referenced more than once, is not LOCALLY cached in the
browser (ie: the model is two or more people accessing the same thing BEFORE
the timeout happens) AND the content has the proper expiration policy
correctly set.

That is becoming less and less true over time.
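
(Karl's condition, as back-of-the-envelope arithmetic with made-up numbers -
a cached copy only pays off when a second request lands inside the object's
freshness window:)

    # Hypothetical figures: 10 requests/hour behind the cache, and an object
    # whose Expires/max-age window is half an hour.
    requests_per_hour = 10
    lifetime_hours = 0.5

    in_window = requests_per_hour * lifetime_hours   # requests per window: 5
    hits = max(in_window - 1, 0)                     # the first is always a miss
    print("hit ratio ~ %.0f%%" % (100 * hits / in_window))   # ~80% here;
    # with one request per window (or content marked uncacheable), it is 0%.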

> Proxies are fine WHERE CUSTOMERS HAVE AGREED TO THEIR USE.

> Yup, you have to agree to use a proxy (it requires you to set it in your
> browser); a transparent cache is another story - and IMHO TRANSPARENT
> caches have their place closer to the end user - they can be a problem if
> placed too far up the ladder.

I have no problem with a proxy where the user has agreed to its use.

I have a major problem with a provider hijacking a stream of data and
reprocessing it, then delivering what the client *thinks* is a direct
communication session.

In fact, I'm not even sure that such an act, absent consent, is legal.

> Proxies are fine WHERE CUSTOMERS HAVE AGREED TO THEIR USE.
>
> STEALING someone's packet flow to force it through a proxy is NOT fine.

I think this is the heart of Karl's argument. (Karl, feel free to correct
me if I'm wrong.) The rest of the rant about how transparent caches, proxy
servers, etc. work and other opinions about how the Internet and web content
will look in the future is ... not my concern at present.

But the original topic is of great concern to me. Is there one person on
this list - even someone from DIGEX - who can give me one reason why
altering the destination of a packet a customer paid you to deliver,
without that customer's consent or foreknowledge, is in any way morally or
ethically permissible? Hell, for that matter, is it even legal?

I know that when my downstreams pay me for transit and give me a packet, I
do my damnedest to get that packet TO THE DESTINATION. If I can give my
customers better service through proxy or caching or any other method, I
will definitely OFFER it to them. (We are currently looking into
transparent and other caching techniques, but have not begun such an
offering as of yet.) However, I will not shirk my responsibility to
deliver packets where the customer (rightfully) expects them to go without
the customer's permission. I find it repugnant that one of my peers has
done so. I would be interested in how others feel about it - without all
the discussion about whether caching is any use or not.

> Karl Denninger (karl@MCS.Net)| MCSNet - Serving Chicagoland and Wisconsin

TTFN,
patrick

>(performance for the FIRST fetch through a proxy is SLOWER - it HAS TO BE,
>since the proxy must first get the data before it can pass it on).

> It doesn't have to -- think cut-thru switching. I've no idea if
> any/enough of the proxies do this.

Yes, some do.

And I wanted to give one thing Sean said more air time so folks won't just
gloss over it:

An intercepting proxy which runs a modern TCP stack and which
avoids the "herds of mice" problem by aggregating multiple
parallel connections into single ones, and which is well-located
to avoid frequent fifo tail-drop at the last hop, has a benefit
to the ISP that outweighs the cache hit:miss ratio.

That is, a cache which imposes decent long-haul TCP behaviour
reduces the number of packets which are delivered all the way
from the web server to the terminal server but tail-dropped there
rather than being delivered to the end user.

This is rather important, both because the stacks used in last-mile devices
(such as the Microsoft Program Loader) are not very modern, and because HTTP
persistent connections end up not being very helpful until they are aggregated
across multiple last-mile devices.
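
(A sketch of the aggregation idea, using Python's standard http.client; the
origin host and paths are made up. The point is that many short client
connections share one persistent, well-behaved upstream TCP flow instead of
a herd of mice:)

    # Sketch: funnel several client requests over a single persistent
    # connection to the origin, so the long-haul path sees one TCP flow.
    import http.client

    upstream = http.client.HTTPConnection("origin.example.net")  # one TCP flow

    def fetch_for_client(path):
        """Serve one client request over the shared upstream connection."""
        upstream.request("GET", path)
        resp = upstream.getresponse()
        return resp.read()   # read fully so the connection can be reused

    for path in ("/a.html", "/b.gif", "/c.gif"):   # three clients, one flow
        fetch_for_client(path)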

And in these senses (cut through with segment reblocking; reasonable/modern
TCP handling; more use of persistent HTTP), someone with tcpdump actually
would be able to tell that my particular transparent caching box was in use.
I'll leave the challenge to Karl open, though, since I meant specifically
"be able to tell from the client or server end" not "be able to tell using
tcpdump on a LAN wire upstream of the caching box".
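
(In the spirit of that challenge, a client-end probe sketch: fetch a URL on
a server you control that returns a fresh serial number on every hit, and
see whether the second fetch replays the first. The URL is hypothetical,
and the Via/Age checks only catch caches honest enough to announce
themselves:)

    # Sketch: detect a cache in the path from the client end.
    import urllib.request

    URL = "http://myserver.example.com/serial"   # origin sends a new value each hit

    def probe():
        with urllib.request.urlopen(URL) as r:
            return r.read(), r.headers.get("Via"), r.headers.get("Age")

    (serial1, via, age), (serial2, _, _) = probe(), probe()
    if serial1 == serial2 or via or age:
        print("a cache appears to be in the path (Via=%s, Age=%s)" % (via, age))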

Thank you Patrick.

> Proxies are fine WHERE CUSTOMERS HAVE AGREED TO THEIR USE.
>
> STEALING someone's packet flow to force it through a proxy is NOT fine.

> I think this is the heart of Karl's argument. (Karl, feel free to correct
> me if I'm wrong.) The rest of the rant about how transparent caches, proxy
> servers, etc. work and other opinions about how the Internet and web content
> will look in the future is ... not my concern at present.

Proxies not only intercept and redirect packets, they replace packets with
older ones, rather than allowing a fresh packet to come through. There are
many circumstances where this is unacceptable.

Most contracts imply raw packet streams, unless specified otherwise.
Filtering a raw packet stream is technically a breach of contract. If done
to us, it will cause us to switch upstream providers, make us renumber our
hosts, and cause us much grief/anxiety/emotional harm/lost business, which
we will be glad to bill back to the upstream provider, in court if need be,
at inflated values if we can get away with it. <grin> If our upstream
provider is not the one directly doing it, then *they* can forward our bill,
tagging on their own expenses, to their upstream provider, and so on. By
the time this little shit-ball hits the one doing the filtering, they may
decide that sipping umbrella-drinks on the beach, or collecting welfare,
may be a better business model to pursue.

> But the original topic is of great concern to me. Is there one person on
> this list - even someone from DIGEX - who can give me one reason why
> altering the destination of a packet a customer paid you to deliver,
> without that customer's consent or foreknowledge, is in any way morally or
> ethically permissible? Hell, for that matter, is it even legal?

It can be considered simple contract breach (see above, I was not being
facetious) with appropriate penalties for "willful failure to perform", aka
fraud, possibly wire fraud under the right circumstances. There's a whole
range of civil and criminal law specifically designed to extract pounds of
flesh out of such perpetrators.

> I know that when my downstreams pay me for transit and give me a packet, I
> do my damnedest to get that packet TO THE DESTINATION. If I can give my
> customers better service through proxy or caching or any other method, I
> will definitely OFFER it to them. (We are currently looking into
> transparent and other caching techniques, but have not begun such an
> offering as of yet.) However, I will not shirk my responsibility to
> deliver packets where the customer (rightfully) expects them to go without
> the customer's permission. I find it repugnant that one of my peers has
> done so. I would be interested in how others feel about it - without all
> the discussion about whether caching is any use or not.

Agreed, I would offer such a value-added service, but not at the expense of
a raw data-feed.

Now, now, Karl...

Happens I agree with you, but that doesn't make Sean a moron.

Cheers,
-- jra

> And in these senses (cut through with segment reblocking; reasonable/modern
> TCP handling; more use of persistent HTTP), someone with tcpdump actually
> would be able to tell that my particular transparent caching box was in use.
> I'll leave the challenge to Karl open, though, since I meant specifically
> "be able to tell from the client or server end" not "be able to tell using
> tcpdump on a LAN wire upstream of the caching box".
> --
> Paul Vixie
> La Honda, CA         "Many NANOG members have been around
> <paul@vix.com>        longer than most." --Jim Fleming
> pacbell!vixie!paul   (An H.323 GateKeeper for the IPv8 Network)

And more and more per-IP subscription services will need pages like
http://www.iop.org/cgi-bin/checkup for their customers to debug all these
transparent proxies that may or may not be in the path.
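
(A sketch of what such a checkup page might do - a CGI script that echoes
back the peer address and proxy-related request headers so a user can see
whether something rewrote or relayed the request; the environment variables
are standard CGI, and nothing here is taken from the actual iop.org script:)

    # Sketch: minimal CGI "checkup" page for spotting proxies in the path.
    import os

    print("Content-Type: text/plain")
    print()
    print("You connected from:", os.environ.get("REMOTE_ADDR", "unknown"))
    print("Via header:", os.environ.get("HTTP_VIA", "none"))
    print("X-Forwarded-For:", os.environ.get("HTTP_X_FORWARDED_FOR", "none"))
    # If REMOTE_ADDR is not your own address, or Via / X-Forwarded-For are
    # set, some proxy (transparent or otherwise) touched the request.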

Hank Nussbacher