routing between provider edge and CPE routers

Hi,

I apologize if this has been asked before. I work for an ISP that
started very small (hundreds of T1 and 56k customers) and has grown very
large in the last few years (thousands of T1 customers, as well as DS3
customers and OC3 customers).

We currently use an IGP to route between our distribution routers and
the CPE routers we manage. This has historically worked very well. We
have recently begun running into scalability issues, however. We have
some distribution routers that have over 1000 T1 interfaces on them.
This is causing some problems with stability in that edge IGP. Does any
other service provider use an IGP all the way to the customer for
non-BGP customers, or are we the only one? I have a feeling we may be.

If you do use an IGP, have you had any of the scalability issues we have
had? How did you fix them?

If you use statics/BGP to CPE routers, have you had any issues doing
that? In particular, I'm wondering about the thousands of lines of
configuration needed to make static routes work.

Thanks in advance for your advice.

Mike Bernico

So, if customers bounce, your IGP churns away? And customers have access to
your IGP data (provided they break into the CPE, which is trivial, eh?)

My recommendation would be for you to:

   o redistribute directly connected interfaces via a strict
     filter into BGP and use iBGP to carry it around the local
     AS

    or

   o use passive interfaces in IGPs to do the same

Avoid having to run a topology computation every time a T1/56k
link drops. I prefer the first option to the second, based on the
experience UUNET / Global Crossing have had w/ option #1.
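A rough sketch of what the "strict filter" in option #1 decides, written
in Python rather than router configuration. The address block, the /30
length limit, and the function name are assumptions made up for
illustration; nothing here is taken from any provider's actual setup.

```python
import ipaddress

# Hypothetical block reserved for customer-facing links (documentation range).
CUSTOMER_LINK_BLOCKS = [ipaddress.ip_network("192.0.2.0/24")]

def passes_strict_filter(prefix, blocks=CUSTOMER_LINK_BLOCKS, min_len=30):
    """Accept a connected prefix only if it lies inside an approved block
    and is at least as specific as min_len (e.g. a /30 point-to-point link)."""
    net = ipaddress.ip_network(prefix)
    return net.prefixlen >= min_len and any(net.subnet_of(b) for b in blocks)

# Connected prefixes as they might appear on a distribution router.
connected = ["192.0.2.4/30", "192.0.2.0/24", "10.1.1.0/30"]

# Only prefixes that pass the filter would be redistributed into iBGP.
to_ibgp = [p for p in connected if passes_strict_filter(p)]
```

Anything outside the approved block, or shorter than expected, never
reaches iBGP, so a mistyped interface address stays local to one router.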

  - Serge

Thus spake Mike Bernico (mbernico@illinois.net):

Date: Wed, 29 Jan 2003 12:51:08 -0600
From: Mike Bernico

[ snipped and reformatted throughout ]

We currently use an IGP to route between our distribution
routers and the CPE routers we manage.

I hope I'm misreading. If you're, say, running OSPF between
your edge routers and CPE routers...

This is causing some problems with stability in that edge
IGP.

...I'd imagine so.

Routes within one administrative domain that are preferred over
BGP routes. Yikes. Roguecasting of GTLDs comes to mind as but
one way to do evil deeds.

Does any other service provider use an IGP all the way to the
customer for non-BGP customers, or are we the only one? I
have a feeling we may be.

Anything that depends on proper configuration of customer gear
is inherently evil and dangerous. Of course, nobody ever creates
an ethernet loop, redistributes the wrong prefixes, binds the
wrong IP address, or anything like that, right?

Hopefully I misread. Sharing your IGP with customers is very,
very bad. Dynamic routes also need to be filtered at untrusted
boundaries.

Eddy

Worse yet, any customer which is able to feed routing information to the
backbone (be it any IGP or BGP), unless filtered properly, is able to
trivially create a man-in-the-middle (or trojan horse) attack on systems
protected with plain-text passwords. Simply inject a longer-prefix route
to someone else's network, and then examine (or modify) and bounce the
source-routed packets to the ultimate destination. (Yes, Virginia, source
routing IS evil, and has virtually no legitimate use).
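To make the longer-prefix point concrete, here is a minimal Python
sketch of longest-prefix-match forwarding. The table contents, next-hop
labels, and the best_route helper are hypothetical; real routers use far
more elaborate data structures, but the selection rule is the same.

```python
import ipaddress

def best_route(table, destination):
    """Return the (prefix, next_hop) pair with the longest matching prefix,
    or None if no entry covers the destination."""
    addr = ipaddress.ip_address(destination)
    matches = [
        (ipaddress.ip_network(prefix), next_hop)
        for prefix, next_hop in table.items()
        if addr in ipaddress.ip_network(prefix)
    ]
    if not matches:
        return None
    net, next_hop = max(matches, key=lambda m: m[0].prefixlen)
    return str(net), next_hop

# The legitimate route for someone else's network.
table = {"198.51.100.0/24": "legitimate-owner"}
before = best_route(table, "198.51.100.10")

# An unfiltered CPE injects a more specific route covering the same host.
table["198.51.100.0/25"] = "rogue-cpe"
after = best_route(table, "198.51.100.10")
```

Before the injection the /24 wins; afterwards the /25 wins for every
address it covers, even though the legitimate route is still present.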

Even supposedly secure things like SSL-protected websites and SSH logins
are vulnerable due to the simple fact that most people won't think twice
before saying "yes" when SSH complains that it detected a new host key,
or notice that they're really talking to a different website (or that the
lock icon is not showing) - if it looks the same, and its URL looks similar
(l->1, O->0, etc; and with newish Unicode URLs the fun is unlimited).

So, by accepting routes from CPE you create a huge security vulnerability
for your customers, and other parties. This practice has been understood
to be very bad network engineering for decades.

The additional problems created by taking routing information from CPE
are: increased amounts of route flap (because any bouncy tail circuit
or malfunctioning/misconfigured CPE box will cause a flood of routing
updates, potentially killing your entire network), and dramatically
increased incidence of bogus routes (interfering with connectivity of your
other customers, or some third parties).

(I've seen even stupider things - people configuring CPE boxes to
redistribute routes learned from customer's internal LANs! Any compromised
PC, and you're toast).

The solution is:

1) for single-homed sites use static routing, period. Dynamic routing
does not add anything useful in this case (if the circuit is down, it's
down; there are no alternative ways to reach the customer's network).

The "convenience" of having to configure only the CPE box is no excuse.
Invest some resources in a rather trivial configuration management system,
which keeps track of what network addresses were allocated to which
customer, and produces the corresponding bits of router configuration
automatically. Most respectable ISPs did that a long time ago. That will
also reduce your tech support costs.

2) for multi-homed sites you have to use routing protocols. Use BGP (_NOT_
IGP!) Implement strict filtering on all routing updates you get from the
customer. Manage these filters like you manage static routes.
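As a minimal sketch of the two points above, assuming a single table of
address allocations drives both the static routes and the BGP filters:
the records, field names, and helper functions below are invented for
illustration, and the output uses Cisco IOS-style "ip route" lines only
as an example, since no vendor is named here.

```python
import ipaddress

# Hypothetical allocation records; a real system would keep these in a database.
ALLOCATIONS = [
    {"customer": "acme", "prefix": "203.0.113.0/28",
     "interface": "Serial1/0", "multihomed": False},
    {"customer": "globex", "prefix": "203.0.113.64/26",
     "interface": "Serial1/1", "multihomed": True},
]

def static_route_lines(allocations):
    """Emit one IOS-style static route per single-homed allocation."""
    lines = []
    for alloc in allocations:
        if alloc["multihomed"]:
            continue  # multi-homed sites are handled by filtered BGP instead
        net = ipaddress.ip_network(alloc["prefix"])
        lines.append(
            f"ip route {net.network_address} {net.netmask} {alloc['interface']}"
        )
    return lines

def accept_announcement(allocations, customer, announced):
    """Strict filter: accept a BGP announcement only if it exactly matches
    a prefix allocated to that multi-homed customer."""
    net = ipaddress.ip_network(announced)
    return any(
        alloc["multihomed"]
        and alloc["customer"] == customer
        and ipaddress.ip_network(alloc["prefix"]) == net
        for alloc in allocations
    )

routes = static_route_lines(ALLOCATIONS)
```

Because both outputs come from the same records, adding a customer means
adding one allocation entry, not hand-editing thousands of lines.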

--vadim

PS. They should really require a test in "defensive networking" before
    letting anyone touch a provider's routers...