Seeking UUNET/Level3 help re packet loss between Comcast & Onvoy customers

On August 2, Craig D. Rice from stolaf.edu posted "Re: Seeking Comcast Contact: need to troubleshoot packet loss and/or asymmetric routing issue between Comcast & Onvoy."

A good Comcast contact (thanks!) and some interesting (but, as it turns out, irrelevant) discussion of ICMP and PMTU ensued.

Current situation:

We now know that all packets from all Comcast ranges make it to Carleton (137.22/16) and St Olaf (130.71/16) just fine. But we are seeing 30-80% packet loss on the paths (multiple) from both Carleton and St Olaf to certain Comcast ranges only. The size of the packets does not seem to matter. We pursued the PMTU wild goose, and a few others, based on the observation that SYNs appeared to get through, but data packets did not. In reality, multiple SYNs are also being lost.

Recently, we observed that packets sent from the interior IP address of our border router, 137.22.69.254, are able to get to Comcast fine. But traffic sent from any other source IP address in 137.22/16 fails.

Outbound traceroutes (Carleton to Comcast) show that the "good" networks take an AT&T outbound hop, while the "bad" networks go through level3. Inbound traceroutes (Comcast to Carleton) always seem to go through AT&T.

Destination    from 137.22.69.254   from 137.22.69.253
71.63.168.1    good (level3)        BAD (level3)
71.63.244.1    good (level3)        BAD (level3)
74.19.4.1      good (AT&T)          good (AT&T)

Wild card: It might be relevant that Carleton was previously a UUNET customer, and that 137.22.69.254 was an IP address known to UUNET as a demarc point to be monitored. Maybe someone at UUNET failed to clear some filters some years ago, when we shut off our old T1.

Complications: Onvoy has agreed to be sold to Zayo Bandwidth, http://www.startribune.com/154/story/1378026.html; the technician with whom we are working at Onvoy appears to be proud of his A+ certification; and we are in a semi-rural market with no short-term alternative for Internet service, and freshmen arriving in one week. Help!

Here's how St Olaf sees the world. 71.63.168.1 is a "bad" address, > 50% packet loss. 73.112.232.1 is a "good" address, 0% loss.

cisco7200#show ip bgp sum
BGP router identifier 130.71.196.25, local AS number 21951
BGP table version is 31337554, main routing table version 31337554
224927 network entries using 26316459 bytes of memory
245363 path entries using 12758876 bytes of memory
44821/40648 BGP path/bestpath attribute entries using 5557804 bytes of memory
37918 BGP AS-PATH entries using 1026100 bytes of memory
516 BGP community entries using 18954 bytes of memory
1 BGP extended community entries using 24 bytes of memory
0 BGP route-map cache entries using 0 bytes of memory
0 BGP filter-list cache entries using 0 bytes of memory
BGP using 45678217 total bytes of memory
11380 received paths for inbound soft reconfiguration
BGP activity 1786406/1561479 prefixes, 3731508/3486145 paths, scan interval 60 secs

Neighbor        V    AS MsgRcvd MsgSent   TblVer  InQ OutQ Up/Down  State/PfxRcd
137.192.32.173  4  5006 9156558 2056473 31337549    0    0 23w6d          222602
192.42.152.218  4    57 3531028  478950 31337541    0    0 1d10h           11380

cisco7200#show ip bgp neighbors 137.192.32.173 routes | include 13367
* 24.31.0.0/19 137.192.32.173 0 5006 7018 13367
13367 13367 13367 i
*> 24.118.0.0/18 137.192.32.173 0 5006 7018 13367
13367 13367 13367 i
* 24.118.0.0/16 137.192.32.173 0 5006 3549 3356
13367 13367 13367 13367 13367 13367 i
*> 24.118.64.0/18 137.192.32.173 0 5006 3549 3356
13367 13367 13367 13367 i
*> 24.118.128.0/19 137.192.32.173 0 5006 7018 13367
13367 13367 13367 i
*> 24.118.160.0/19 137.192.32.173 0 5006 3549 3356
13367 13367 13367 13367 i
*> 24.118.192.0/18 137.192.32.173 0 5006 7018 13367
13367 13367 13367 i
*> 24.131.128.0/18 137.192.32.173 0 5006 7018 13367
13367 13367 13367 i
*> 24.245.0.0/19 137.192.32.173 0 5006 3549 3356
13367 13367 13367 13367 i
* 24.245.0.0/18 137.192.32.173 0 5006 7018 13367
13367 13367 13367 13367 i
*> 24.245.32.0/19 137.192.32.173 0 5006 3549 3356
13367 13367 13367 13367 i
* 24.245.64.0/20 137.192.32.173 0 5006 3549 3356
13367 13367 13367 13367 i
*> 66.41.0.0/18 137.192.32.173 0 5006 7018 13367
13367 13367 13367 i
* 66.41.0.0/16 137.192.32.173 0 5006 3549 3356
13367 13367 13367 13367 13367 13367 i
*> 66.41.64.0/18 137.192.32.173 0 5006 7018 13367
13367 13367 13367 i
*> 66.41.128.0/18 137.192.32.173 0 5006 3549 3356
13367 13367 13367 13367 i
*> 66.41.192.0/18 137.192.32.173 0 5006 7018 13367
13367 13367 13367 i
*> 67.178.48.0/23 137.192.32.173 0 5006 7018 13367
13367 13367 13367 i
* 67.190.192.0/19 137.192.32.173 0 5006 7018 13367
13367 13367 13367 13367 i
* 67.190.224.0/19 137.192.32.173 0 5006 7018 13367
13367 13367 13367 i
*> 68.86.46.0/24 137.192.32.173 0 5006 3549 3356
13367 13367 13367 13367 i
*> 68.86.200.0/23 137.192.32.173 0 5006 7018 13367
13367 13367 13367 i
*> 68.86.232.0/22 137.192.32.173 0 5006 3549 3356
13367 13367 13367 13367 i
*> 68.87.14.0/24 137.192.32.173 0 5006 3549 3356
13367 13367 13367 13367 i
*> 68.87.174.0/23 137.192.32.173 0 5006 3549 3356
13367 13367 13367 13367 i
*> 68.87.176.0/22 137.192.32.173 0 5006 3549 3356
13367 13367 13367 13367 i
* 69.180.128.0/18 137.192.32.173 0 5006 7018 13367
13367 13367 13367 i
*> 69.240.52.0/22 137.192.32.173 0 5006 7018 13367
13367 13367 13367 i
*> 69.240.56.0/24 137.192.32.173 0 5006 7018 13367
13367 13367 13367 i
*> 69.240.212.0/23 137.192.32.173 0 5006 7018 13367
13367 13367 13367 i
*> 69.240.244.0/24 137.192.32.173 0 5006 7018 13367
13367 13367 13367 i
*> 69.240.245.0/24 137.192.32.173 0 5006 7018 13367
13367 13367 13367 i
*> 69.241.77.0/24 137.192.32.173 0 5006 7018 13367
13367 13367 13367 i
*> 69.241.114.0/24 137.192.32.173 0 5006 3549 3356
13367 13367 13367 13367 i
* 70.89.196.0/22 137.192.32.173 0 5006 7018 13367
13367 13367 13367 i
* 70.89.200.0/22 137.192.32.173 0 5006 7018 13367
13367 13367 13367 i
*> 70.90.76.0/22 137.192.32.173 0 5006 7018 13367
13367 13367 13367 i
*> 70.91.176.0/22 137.192.32.173 0 5006 7018 13367
13367 13367 13367 i
*> 71.63.128.0/17 137.192.32.173 0 5006 3549 3356
13367 13367 13367 13367 i
* 71.193.64.0/19 137.192.32.173 0 5006 7018 13367
13367 13367 13367 i
*> 71.195.0.0/18 137.192.32.173 0 5006 7018 13367
13367 13367 13367 i
*> 73.112.0.0/14 137.192.32.173 0 5006 3549 3356
13367 13367 13367 13367 i
*> 73.127.0.0/16 137.192.32.173 0 5006 3549 3356
13367 13367 13367 13367 i
*> 73.205.0.0/16 137.192.32.173 0 5006 3549 3356
13367 13367 13367 13367 i
*> 73.236.0.0/15 137.192.32.173 0 5006 3549 3356
13367 13367 13367 13367 i
*> 74.18.0.0/20 137.192.32.173 0 5006 7018 13367
13367 13367 13367 i
*> 74.19.0.0/19 137.192.32.173 0 5006 7018 13367
13367 13367 13367 i
*> 74.24.192.0/19 137.192.32.173 0 5006 7018 13367
13367 13367 13367 i
*> 74.30.0.0/18 137.192.32.173 0 5006 7018 13367
13367 13367 13367 i
*> 74.31.224.0/19 137.192.32.173 0 5006 7018 13367
13367 13367 13367 i
*> 74.93.24.0/22 137.192.32.173 0 5006 7018 13367
13367 13367 13367 i
*> 74.94.80.0/21 137.192.32.173 0 5006 7018 13367
13367 13367 13367 i
*> 74.95.64.0/22 137.192.32.173 0 5006 7018 13367
13367 13367 13367 i
*> 74.95.70.0/23 137.192.32.173 0 5006 7018 13367
13367 13367 13367 i
*> 74.95.100.0/22 137.192.32.173 0 5006 7018 13367
13367 13367 13367 i
*> 74.95.140.0/22 137.192.32.173 0 5006 7018 13367
13367 13367 13367 i
*> 74.146.192.0/18 137.192.32.173 0 5006 7018 13367
13367 13367 13367 i
*> 74.147.224.0/19 137.192.32.173 0 5006 7018 13367
13367 13367 13367 i
*> 75.72.0.0/17 137.192.32.173 0 5006 7018 13367
13367 13367 13367 i
*> 75.72.0.0/15 137.192.32.173 0 5006 3549 3356
13367 13367 13367 13367 13367 13367 i
*> 75.72.128.0/17 137.192.32.173 0 5006 7018 13367
13367 13367 13367 i
*> 75.73.0.0/17 137.192.32.173 0 5006 7018 13367
13367 13367 13367 i
*> 75.73.128.0/17 137.192.32.173 0 5006 3549 3356
13367 13367 13367 13367 i
*> 75.144.36.0/22 137.192.32.173 0 5006 7018 13367
13367 13367 13367 i
*> 75.146.32.0/20 137.192.32.173 0 5006 7018 13367
13367 13367 13367 i
*> 76.17.128.0/17 137.192.32.173 0 5006 7018 13367
13367 13367 13367 i
*> 76.113.128.0/19 137.192.32.173 0 5006 7018 13367
13367 13367 13367 i
*> 76.113.160.0/19 137.192.32.173 0 5006 7018 13367
13367 13367 13367 i
*> 76.113.192.0/19 137.192.32.173 0 5006 7018 13367
13367 13367 13367 i
*> 76.113.224.0/19 137.192.32.173 0 5006 7018 13367
13367 13367 13367 i
*> 76.133.0.0/19 137.192.32.173 0 5006 7018 13367
13367 13367 13367 i
*> 76.133.0.0/17 137.192.32.173 0 5006 3549 3356
13367 13367 13367 13367 13367 13367 i
*> 76.133.32.0/19 137.192.32.173 0 5006 7018 13367
13367 13367 13367 i
*> 76.133.64.0/19 137.192.32.173 0 5006 7018 13367
13367 13367 13367 i
*> 76.133.96.0/19 137.192.32.173 0 5006 7018 13367
13367 13367 13367 i
*> 76.154.0.0/19 137.192.32.173 0 5006 7018 13367
13367 13367 13367 i
*> 76.154.0.0/17 137.192.32.173 0 5006 7018 13367
13367 13367 13367 i
*> 76.154.32.0/19 137.192.32.173 0 5006 7018 13367
13367 13367 13367 i
*> 76.154.64.0/19 137.192.32.173 0 5006 7018 13367
13367 13367 13367 13367 i
*> 76.154.96.0/19 137.192.32.173 0 5006 7018 13367
13367 13367 13367 13367 i
*> 209.162.0.0/18 137.192.32.173 0 5006 7018 13367
13367 13367 13367 i
cisco7200#show ip bgp neighbors 192.42.152.218 routes | include 13367
*> 24.31.0.0/19 192.42.152.218 110 0 57 13367 i
*> 24.118.0.0/16 192.42.152.218 110 0 57 13367 i
*> 24.245.0.0/18 192.42.152.218 110 0 57 13367 i
*> 24.245.64.0/20 192.42.152.218 110 0 57 13367 i
*> 66.41.0.0/16 192.42.152.218 110 0 57 13367 i
*> 67.190.192.0/19 192.42.152.218 110 0 57 13367 i
*> 67.190.224.0/19 192.42.152.218 110 0 57 13367 i
*> 69.180.128.0/18 192.42.152.218 110 0 57 13367 i
*> 70.89.196.0/22 192.42.152.218 110 0 57 13367 i
*> 70.89.200.0/22 192.42.152.218 110 0 57 13367 i
*> 71.193.64.0/19 192.42.152.218 110 0 57 13367 i
cisco7200#sho ip bgp 71.63.168.1
BGP routing table entry for 71.63.128.0/17, version 31179174
Paths: (1 available, best #1, table Default-IP-Routing-Table)
   Not advertised to any peer
   5006 3549 3356 13367 13367 13367 13367
     137.192.32.173 from 137.192.32.173 (172.30.0.5)
       Origin IGP, localpref 100, valid, external, best
       Community: 328073238
cisco7200#sho ip bgp 73.112.232.1
BGP routing table entry for 73.112.0.0/14, version 31179171
Paths: (1 available, best #1, table Default-IP-Routing-Table)
   Not advertised to any peer
   5006 3549 3356 13367 13367 13367 13367
     137.192.32.173 from 137.192.32.173 (172.30.0.5)
       Origin IGP, localpref 100, valid, external, best
       Community: 328073238
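
For readers who do not have the AS numbers memorized, here is a minimal Python sketch that collapses the prepends in an AS path and splits the community value into its usual ASN:value form. The AS-number-to-name table is an assumption on my part (the router output shows only numbers), and the helper names are made up for illustration.

```python
# Assumed mapping from AS number to operator name; the router output above
# shows only the numbers, so treat these names as an assumption.
AS_NAMES = {
    5006: "Onvoy",
    7018: "AT&T",
    3549: "Global Crossing",
    3356: "Level 3",
    13367: "Comcast",
}


def collapse_prepends(as_path):
    """Drop consecutive duplicate AS numbers (AS-path prepending)."""
    hops = []
    for token in as_path.split():
        asn = int(token)
        if not hops or hops[-1] != asn:
            hops.append(asn)
    return hops


def describe_path(as_path):
    """Render an AS path as operator names, falling back to 'AS<number>'."""
    return " -> ".join(
        AS_NAMES.get(asn, f"AS{asn}") for asn in collapse_prepends(as_path)
    )


def decode_community(value):
    """Split a 32-bit standard community into its high and low 16 bits."""
    return f"{value >> 16}:{value & 0xFFFF}"


# AS path shown for 71.63.128.0/17 and 73.112.0.0/14
print(describe_path("5006 3549 3356 13367 13367 13367 13367"))
# Onvoy -> Global Crossing -> Level 3 -> Comcast

# AS path shown for e.g. 74.19.0.0/19
print(describe_path("5006 7018 13367 13367 13367 13367"))
# Onvoy -> AT&T -> Comcast

print(decode_community(328073238))
# 5006:22
```

Read this way, both the "bad" 71.63.168.1 and the "good" 73.112.232.1 leave via the same 5006 3549 3356 path, and the community on both routes decodes to 5006:22.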

Monitoring doesn't involve filters, it involves no-filters... I see the
prefixes you have issues with going out via L3/Comcast; perhaps there's
something going on in their direction?

-Chris

This is resolved, though no one knows exactly why. If someone at Global Crossing has relevant logs of route flaps or somesuch, that might be interesting, but I can live with the mystery.

Comcast advertises a specific route for the "problem" space, 71.63.128.0/17. Don't ask me why. Early yesterday morning, they flapped that route -- stopped advertising it, then started again. *Probably* shortly after that, and possibly as a consequence, our connectivity to the range of Comcast IP addresses typically assigned to residential customers in MN was restored.
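
To check which announced prefix covers each test address, something like the following works. It is only a sketch using Python's standard ipaddress module; the prefix list is a hand-picked subset of the "show ip bgp" output earlier in this thread, and the function name is hypothetical.

```python
import ipaddress

# A few prefixes copied from the "show ip bgp" output earlier in this thread.
PREFIXES = ["71.63.128.0/17", "73.112.0.0/14", "74.19.0.0/19"]


def longest_match(address, prefixes=PREFIXES):
    """Return the most specific prefix containing address, or None."""
    addr = ipaddress.ip_address(address)
    networks = [ipaddress.ip_network(p) for p in prefixes]
    matches = [net for net in networks if addr in net]
    if not matches:
        return None
    return str(max(matches, key=lambda net: net.prefixlen))


for host in ("71.63.168.1", "71.63.244.1", "73.112.232.1", "74.19.4.1"):
    print(host, "->", longest_match(host))
# 71.63.168.1 -> 71.63.128.0/17
# 71.63.244.1 -> 71.63.128.0/17
# 73.112.232.1 -> 73.112.0.0/14
# 74.19.4.1 -> 74.19.0.0/19
```

Both "bad" test addresses land in the single 71.63.128.0/17 announcement.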

Our ISP, Onvoy, has three upstreams: Verizon, Global Crossing, and AT&T.

The route from 137.22/16 and 130.71/16 to Comcast typically took the path

carleton->onvoy->global crossing->level3->at&t->comcast

But the path from Comcast to us was simply

comcast->at&t->onvoy->carleton
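
The asymmetry is easy to see if the two paths are compared hop by hop. A small Python sketch, with the hop names taken from the two paths above and the function names invented for illustration:

```python
# Hop lists transcribed from the two paths above.
forward = ["carleton", "onvoy", "global crossing", "level3", "at&t", "comcast"]
reverse = ["comcast", "at&t", "onvoy", "carleton"]


def is_symmetric(forward, reverse):
    """True when the return path is the outbound path walked backwards."""
    return forward == list(reversed(reverse))


def one_way_hops(forward, reverse):
    """Return (hops only on the outbound path, hops only on the return path)."""
    only_forward = [hop for hop in forward if hop not in reverse]
    only_reverse = [hop for hop in reverse if hop not in forward]
    return only_forward, only_reverse


print(is_symmetric(forward, reverse))
# False
print(one_way_hops(forward, reverse))
# (['global crossing', 'level3'], [])
```

Global Crossing and Level3 appear only in the outbound direction, which is the direction where we saw the loss.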

For some time, Global Crossing was a mystery hop, because Onvoy was not terribly communicative, and Global Crossing was not showing up in traceroutes. Onvoy has let us know that this could have been because the glbx/onvoy link was filtering ICMP due to DDoS attacks, but this seems wrong to me on a number of levels. Anyway, it's "fixed."