Router / Protocol Problem

I normally would not post to the group, but I am 100% stumped and have talked with peers with no luck.

I have two Cisco 7204 routers running BGP with three peers, plus HSRP. I am not doing anything special with BGP; it is pretty much a default config that has not changed in years.

Recently, with no changes to my network, I have been having problems connecting to certain websites and mail servers. I can always ping the sites and traceroute to them without error, but if I telnet to port 80 or port 25, it does not connect. If I log in to my router and telnet sourcing from each of my Internet providers’ ports, I am able to get to the sites. I have talked with all the providers and none can find a problem.

If I shut down one specific peer, everything works fine, so I keep thinking it is somehow that peer’s problem. I have tested with just that peer up and I still cannot connect. However, when I talk with that peer, they are able to telnet from their network to the sites I cannot reach. I don’t know what else to check besides shutting down that peer, which is not an option since it is under a 3-year contract. That isn’t the real solution anyhow.
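
For anyone wanting to repeat the "telnet to port 80 or 25 from a given source" test from a host instead of the router, here is a minimal Python sketch. It assumes the host actually owns each source address it binds to; the function names and any addresses passed in are hypothetical and not taken from the network described above.

```python
import socket


def tcp_connects(dest_host, dest_port, source_ip, timeout=5.0):
    """Return True if a TCP handshake to dest completes when sourced from source_ip."""
    try:
        # source_address=(ip, 0) binds the local end before connecting;
        # port 0 lets the OS choose an ephemeral source port.
        with socket.create_connection(
            (dest_host, dest_port),
            timeout=timeout,
            source_address=(source_ip, 0),
        ):
            return True
    except OSError:
        # Covers refused connections, timeouts, DNS failures, and
        # source addresses that are not configured on this host.
        return False


def check_all(sources, targets, timeout=5.0):
    """Try every (source, target) pair and return {(source, host, port): bool}."""
    results = {}
    for src in sources:
        for host, port in targets:
            results[(src, host, port)] = tcp_connects(host, port, src, timeout)
    return results
```

Calling `check_all` with one source address per provider-facing interface and a list of `(host, port)` pairs for the problem sites gives a quick grid of which source/destination combinations complete the handshake. Unlike ping or traceroute, this exercises the same TCP ports that are failing.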

Can anyone shed some light, on or off-list?

Thanks,

Mike Walter

Give your peer a /32 to install on their access router, verify that the return
path is via them, and have them do connectivity tests to your problem sites.

If that checks out, step through it one change at a time: ask to be moved to a
different access router, then change your hardware.

/Tony

Please provide details on both your default config and the hardware you're using. You say you have two Cisco 7204s - are these straight '04s, or 7204VXRs? What NPE(s) are you using, and how much memory is on them?
The BGP you're getting from your peers - are you getting full routes from any of them? Do you have CEF enabled on these routers? What IOS version(s) are running on these routers? What else are they doing besides slinging BGP routes? Does the problem go away for a while if you reboot one router or the other?

Without knowing any of this, it sounds like you might have NPE-225, -300, or -400 with 256 MB of RAM and you are running into memory exhaustion issues from carrying full routes. That's been a pretty popular topic on this list and others like cisco-nsp in the last 12 months :)

At a minimum, what does the output of "show mem summary" and "show ip bgp sum" from each router show you?

Have you seen other performance problems lately, such as things getting mysteriously slower, beyond the reachability issues you mentioned above?
If so, check if CEF is still running (if it was configured in the first place). When a 7200 gets dangerously low on free memory and CEF is running, it may cannibalize the IP CEF process to try to conserve memory.
Earlier 12.0 releases did this - I don't know if newer ones still do it.

jms

Do you or your peer have any ACLs on the PtP link which may be dropping the packets? If your peer is doing uRPF and doesn't have your route properly installed, their edge may drop your traffic.
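
To illustrate why a missing route matters under uRPF, here is a rough Python sketch of a strict-mode check: a packet passes only if the best route back to its source address points out the interface it arrived on. The table layout, interface names, and prefixes are made up for illustration, and this is a simplified model, not how any particular router implements it.

```python
import ipaddress


def best_route(fib, address):
    """Longest-prefix match over a list of (network, interface) entries."""
    addr = ipaddress.ip_address(address)
    matches = [(net, ifname) for net, ifname in fib if addr in net]
    if not matches:
        return None
    return max(matches, key=lambda entry: entry[0].prefixlen)


def strict_urpf_pass(fib, source_ip, arrival_interface):
    """Strict uRPF: accept only if the route back to the source uses the arrival interface."""
    route = best_route(fib, source_ip)
    return route is not None and route[1] == arrival_interface


# Hypothetical provider-edge table: the customer's /24 is installed,
# everything else follows the default route upstream.
fib = [
    (ipaddress.ip_network("0.0.0.0/0"), "upstream0"),
    (ipaddress.ip_network("198.51.100.0/24"), "customer1"),
]

print(strict_urpf_pass(fib, "198.51.100.10", "customer1"))  # True: route installed
print(strict_urpf_pass(fib, "203.0.113.5", "customer1"))    # False: only the default matches
```

In the second case the source prefix is not installed toward the customer interface, so the packet would be dropped at the provider's edge even though the rest of the path is fine.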

Are the sites you cannot reach akamaized? I've had issues with some akamaized sites when I was being redirected to Akamai servers that weren't on my network. Do a dig on the website and see if it returns an Akamai server.
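
If you have a batch of problem sites, the dig check can be scripted. A small Python sketch follows; the suffix list is an assumption (a few well-known Akamai domains, certainly not exhaustive), and the function names are hypothetical.

```python
import socket

# Assumed, non-exhaustive list of domain suffixes commonly seen in Akamai CNAME chains.
AKAMAI_SUFFIXES = (
    ".akamai.net",
    ".akamaiedge.net",
    ".akamaized.net",
    ".edgesuite.net",
    ".edgekey.net",
)


def looks_like_akamai(names):
    """Return True if any DNS name in the iterable ends with a known Akamai suffix."""
    return any(
        name.rstrip(".").lower().endswith(AKAMAI_SUFFIXES) for name in names
    )


def resolve_names(hostname):
    """Resolve hostname and return (all names seen, IPv4 addresses)."""
    # gethostbyname_ex returns (canonical name, alias list, address list).
    canonical, aliases, addresses = socket.gethostbyname_ex(hostname)
    return [canonical, *aliases], addresses
```

Passing the first element of `resolve_names(site)` to `looks_like_akamai` tells you whether the name resolves through an Akamai hostname. The returned addresses show which servers your resolver is being handed, which you can compare against what the peer sees from their network.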

Is there any packet loss or are there CRC errors on the link to your peer? A noisy line will affect large packets more than small packets; I've had issues where only the text/CSS of a website would come up but the images would not.

Any MTU issues? As above, an MTU problem can cause large packets to get dropped, so websites load without images.

Pings, traceroute, and telnet all work in those cases.
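
The arithmetic behind the MTU point, as a quick Python sketch. It assumes plain IPv4 with no IP or TCP options (20-byte headers) and an 8-byte ICMP echo header; the 1400-byte path MTU is a made-up example value.

```python
IPV4_HEADER = 20  # bytes, no IP options
ICMP_HEADER = 8   # bytes, echo request/reply
TCP_HEADER = 20   # bytes, no TCP options


def max_ping_payload(mtu):
    """Largest ICMP echo payload that fits in one unfragmented packet."""
    return mtu - IPV4_HEADER - ICMP_HEADER


def tcp_mss(mtu):
    """Largest TCP segment payload that fits in one unfragmented packet."""
    return mtu - IPV4_HEADER - TCP_HEADER


def fits(packet_size, path_mtu):
    """True if a packet of packet_size bytes crosses the path without fragmentation."""
    return packet_size <= path_mtu


print(max_ping_payload(1500))  # 1472
print(tcp_mss(1500))           # 1460

# Hypothetical path with a 1400-byte hop somewhere in the middle:
default_ping = 56 + ICMP_HEADER + IPV4_HEADER  # 84-byte packet from a 56-byte payload
print(fits(default_ping, 1400))  # True: small test traffic gets through
print(fits(1500, 1400))          # False: full-size data segments do not
```

This is why small-packet tests look clean while bulk transfers stall. Pinging with the don't-fragment bit set and a payload near `max_ping_payload` of your interface MTU is one way to probe for a smaller hop.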

-Matt