Traffic Engineering (fwd)

Perhaps you could explain to me how you can find the
shortest path between A and B using ping times, traceroute
hop counts, and AS_PATHS observed at C, assuming that
traffic between A and B is not exchanged through C?

You're not trying to find it between A and B. A connects to B, but B has
every intention of redirecting to C1, C2, or C3, etc.

If B can communicate with some measurement tool on each Cx, they can all report
back on their "connectivity" to A, and B can then redirect A to the "best" Cx
node. If the results are inconclusive, round-robin it. If B had knowledge
of BGP and ASes, maybe it wouldn't have to decide on raw distance but could
instead search for the Cx nearest to B in the AS_PATH. The premise is that
it may be better to pick a Cx near me in my own upstream than to drive the
traffic through some overloaded exchange point.
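
A minimal sketch of that redirection step, assuming each Cx runs a small
measurement agent that B can query; the agent URL, its JSON reply, and the
hostnames are all invented for illustration:

  # Sketch only: B asks each replica's (hypothetical) measurement agent for
  # its RTT to client A and redirects to the best one, falling back to
  # round robin when the results are inconclusive.
  import itertools
  import json
  import urllib.request

  REPLICAS = ["c1.example.net", "c2.example.net", "c3.example.net"]
  _fallback = itertools.cycle(REPLICAS)              # round-robin fallback

  def probe(replica, client_ip, timeout=2.0):
      url = f"http://{replica}/measure?target={client_ip}"   # assumed agent API
      try:
          with urllib.request.urlopen(url, timeout=timeout) as resp:
              return json.load(resp).get("rtt_ms")   # None if the agent had no answer
      except (OSError, ValueError):
          return None                                # unreachable, blocked, garbled...

  def pick_replica(client_ip):
      reports = {r: probe(r, client_ip) for r in REPLICAS}
      usable = {r: rtt for r, rtt in reports.items() if rtt is not None}
      if not usable:
          return next(_fallback)               # inconclusive: just rotate through them
      return min(usable, key=usable.get)       # otherwise the "best" Cx wins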

Granted, ping tends to get dropped on the floor at overloaded points, and
traceroute is a) probably not much better and b) probably blocked in some
(most) cases, but are there any better alternatives? Anyone know of some
reasonably available methods for measuring end-to-end "performance" which
are almost universally implemented?

Kind of making do with what we have.


Eric Germann <ekgermann@cctec.com> writes:

>
>Perhaps you could explain to me how you can find the
>shortest path between A and B using ping times, traceroute
>hop counts, and AS_PATHS observed at C, assuming that
>traffic between A and B is not exchanged through C?
>

>You're not trying to find it between A and B. A connects to B, but B has
>every intention of redirecting to C1, C2, or C3, etc.

The originally proposed idea was that search engines would
present a list of hits sorted by the proximity to the
client that made the query.

What you are doing is rehashing the IBM scheme.

Unfortunately the IBM scheme requires

  (1) an initial connection to a server

  (2) each server must know the location of its
      replicated data

  (3) each server must be able to determine the
      proximity of its replicated data to the
      client in reasonable time and without
      placing too much additional load on the
      server, the client, or the replicating sites

Point (1) means that for short transactions, preserving
locality is a net loss. If it is more work to redirect
than to serve directly, there is no point in not serving
directly.
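
A rough back-of-the-envelope of that trade-off (every number below is an
assumption, not a measurement):

  # Illustrative numbers only: for a short transaction, does a redirect pay off?
  rtt_origin, rtt_replica = 0.120, 0.030   # assumed round-trip times, in seconds
  per_fetch_rtts = 2                       # ~1 RTT handshake + 1 RTT request/response

  direct   = per_fetch_rtts * rtt_origin                                  # 0.24 s
  redirect = per_fetch_rtts * rtt_origin + per_fetch_rtts * rtt_replica   # 0.30 s
  print(f"direct {direct:.2f}s vs redirect {redirect:.2f}s")

  # The redirected client pays the full cost of talking to the origin anyway,
  # so the short transaction comes out behind; only a large transfer can
  # amortize the extra hop.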

Point (2) means that copies of the data that may be nearer
the client may not be known about. This is also a scaling
problem.

Point (3) is a difficult engineering challenge, not least
because determining proximity (assuming proximity is a
function of bandwidth, delay, and infrastructure) without
inducing traffic load is hard.

What is really needed is a scheme such that

  * the client learns the identity of the datum
    desired, and its ultimate, authoritative source,
    through user interaction (typing in a URL) or
    through a search database

  * the client or its proxy asks increasingly distant
    infrastructure whether replicas are available; Van
    Jacobson suggests this is a good application for
    multicast, and I'm inclined to agree somewhat
    (a rough sketch follows below)

  * the client or its proxy retrieves the replicated
    datum if a replica exists and is found within a
    reasonable time or scope of multicast search,

    or

    the client or its proxy turns to the ultimate
    source for a copy

  * should the client or its proxy hear a query for
    this same datum, it will offer it up for
    retrieval to other clients

    that is, everything in your client or proxy's
    local cache may be served up to nearby clients

Note that the squid caching system emulates this scheme to
a large degree.
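
For concreteness, a rough sketch of the "ask increasingly distant
infrastructure" step from the list above. The multicast group, port, and
the HAVE?/HIT query format are invented for illustration; squid's ICP is
the real-world analogue but differs in detail:

  # Sketch of an expanding-ring search for a nearby replica.
  import socket

  GROUP, PORT = "239.1.2.3", 4827           # assumed administratively scoped group

  def find_replica(datum_id, scopes=(1, 4, 16, 32), timeout=1.0):
      """Ask increasingly distant caches whether they hold datum_id."""
      for ttl in scopes:                    # widen the ring on each pass
          sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
          sock.setsockopt(socket.IPPROTO_IP, socket.IP_MULTICAST_TTL, ttl)
          sock.settimeout(timeout)
          sock.sendto(f"HAVE? {datum_id}".encode(), (GROUP, PORT))
          try:
              reply, (addr, _) = sock.recvfrom(512)
              if reply.startswith(b"HIT"):
                  return addr               # a nearby cache answered first
          except socket.timeout:
              continue                      # nobody in this scope; widen and retry
          finally:
              sock.close()
      return None                           # fall back to the authoritative source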

The difficulty is now reduced to constructing a spanning
tree for all multicast queries and determining whether
something found by the client or its proxy to be
topologically nearby is a good choice for serving up the
data. These problems seem much more tractable than the
ones the IBM scheme needs to solve, and a great deal of
theoretical work has gone into this sort of model already.
  

>Anyone know of some reasonably available methods for
>measuring end-to-end "performance" which are almost
>universally implemented?

No. See the CAIDA or IPPM efforts for why not.

  Sean.

P.S.: Curtis Villamizar had another interesting approach
      which involved pushing content far afield to
      machines with the same network-layer (IP)
      addresses, relying upon closest-exit routing to
      connect one to the topologically-closest replication
      machine. Unfortunately, while this could be really
      cool for NSPs to offload stuff towards peering
      points (public or private), it also has some poor
      scaling properties and is uncomfortably reliant
      upon the stability of routing.

      If he's done any more thinking about the idea,
      I'd love to hear about it though.