Links on the blink - reprise

The jury is still very much out on whether layering IP on
top of frame relay with frame relay switches providing the
principal long-haul switching fabric and routers making
decisions about which PVCs to direct IP datagrams into is
a reasonable, scalable approach.

There are several advantages to the approach, viz.
greater port density, especially in terms of
ports-per-dollar on most FR devices than on high-end
Ciscos; the (perhaps safe) bet that a medium-size user and
channel of routers might have more of an influence over
the development and engineering of an FR vendor's
Internet-oriented products than over Cisco; that an
organization with expenditure limits on bandwidth and
relatively long minimum term commitment lengths on
cross-country circuits could do better traffic management,
particularly in the presence of multiple parallel DS1s,
than a similar setup using either inverse multiplexors or
load balancing; and the fact that a mesh along these
lines can readily coexist with a migration to
clear-channel DS3s or other technologies, or away from a
melting-down cross-country Ethernet-over-ATM or
FR-over-ATM backbone.

As well, there is the probability that the users of the
model you're asking about have lots and lots of
points-of-presence used in an effort to reduce backhaul
costs, which get quite substantial as the numbers of
customers grow. Also, having lots of POPs interconnected
at varying bandwidths can make designing a topology that
is friendly to routers a bit more difficult in some ways.

The model generally also very neatly hides the underlying
topology from the IP layer, rendering traceroute
essentially useless as a means of diagnosing weird time
asymmetries and the like. There may be 13 physical hops
between A and B on a network using this model, but
traceroute might reveal only two. This has some obvious
advantages to the provider keen on hiding the design of
its network from "outsiders", or even from its own

Key to the decision is the gamble that the switching
ability of FR switches will compare with the switching
ability of high-end routers in the same environment. This
is not a gamble I am comfortable with, particularly as I
have very strong ideas of how well an SSE-equipped 7000
can switch packets in a backbone environment, and so far
have only early impressions about one variety of popular
FR switch in a similar environment.

Also key to the decision is a bet on how well the
FR-switch-based networks will scale beyond multiple DS1s
and DS3s, especially in comparison to routers.

There are substantial disadvantages, too.

In order to take advantage of the greater
port-density(-per-dollar) on FR switches right now, the
end user has to use FR, which is not always practical
or desirable.

The monitoring and management systems of all FR switching
products with which I am familiar are not
router-geek-friendly. PVCs can do strange things, as can
underlying circuits, which one really wants the
IP-switching layer to recognize and adapt to; at present
there is effectively no practical means to do this
except when a PVC fails outright.

There is a serious management nightmare in maintaining
anything approaching a full mesh in a large network, and
it worsens if one pushes the routing decisions right out
to the edges of a given network. Maintaining lots of PVCs
has many of the well-known disadvantages of maintaining
lots of standard iBGP neighbours or any other system in
which configuration length, complexity or both approach or
equal N**2.
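
To put rough numbers on the N**2 point, here is a minimal
sketch. It assumes one bidirectional PVC per pair of routers
and one piece of configuration at each end of every PVC; the
function names are made up for illustration and do not come
from any vendor's tooling.

```python
def full_mesh_pvcs(n):
    """PVCs needed to fully mesh n routers, one per router pair."""
    return n * (n - 1) // 2


def endpoint_configs(n):
    """Per-end configuration entries: each PVC is configured at both ends."""
    return n * (n - 1)


if __name__ == "__main__":
    # Illustrative network sizes only, not measurements of any real backbone.
    for n in (5, 10, 20, 50, 100):
        print(f"{n:4d} routers: {full_mesh_pvcs(n):5d} PVCs, "
              f"{endpoint_configs(n):5d} endpoint entries")
```

Under those assumptions, doubling the number of routers
roughly quadruples the number of PVCs to maintain; the same
arithmetic applies to a full mesh of iBGP sessions.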

There are a number of other deeply technical questions
about the model which remain the subject of occasionally
heated speculation, but the bottom line is that nobody
really knows how well it will work as this type of network
grows really big, or even as heavily trafficked as
SprintLink and InternetMCI.

Finally, pushing the envelope of technology is hard
work at all times; I believe a pragmatic approach is
to push only n things at any given time where n is
ideally the smallest possible number. Adding frame
relay switching technology into the mix of things that
backbone NSPs have to deal with already strikes
me as asking for trouble.

PSI and UUNET have apparently been following this strategy with their
backbones for some time. Netcom has also come on board.

Your information may be better than mine, but I gathered
that UUNET has just recently migrated over to this model
from the previous MFS-supplied technologies they had been using.

Frankly, neither Netcom nor PSI does sufficient traffic
to learn anything interesting about how the model behaves
under stressful conditions.

Alternet's experience over the next few months will be enlightening.