Prefix 120.29.240.0/21

We see a number of sessions towards downstreams flapping, obviously caused by prefix 120.29.240.0/21, originated by AS45158 and transited by AS4739 (see below).

Best regards,

Fredy Kuenzler
Init7 / AS13030

#sh ip bgp 120.29.240.0
Number of BGP Routes matching display condition : 4
Status codes: s suppressed, d damped, h history, * valid, > best, i internal
Origin codes: i - IGP, e - EGP, ? - incomplete
     Network Next Hop MED LocPrf Weight Path
*>i 120.29.240.0/21 206.223.143.99 21 150 0 4739 45158 {64512 64514 64516 64519 64521 64522 64525 64526 64528 64529 64530 64535 64537 64538 64541 64542 64543 64544 64545 64546 64547 64548 64549 64552 64553 64556 64557 64560 64561 64562 64564 64565 64566 64568 64569 64570 64574 64575 64576 64577 64578 64580 64582 64583 64584 64588 64593 64598 64599 64601 64602 64605 64610 64611 64620 64621 65397 65398 65470 65471 65472 65473 65474 65479 65480 65484 65485 65490 65502 65505 65511 65514 65523 65524 65528 65534 65609} ?
*i 120.29.240.0/21 206.223.143.99 21 150 0 4739 45158 {64512 64514 64516 64519 64521 64522 64525 64526 64528 64529 64530 64535 64537 64538 64541 64542 64543 64544 64545 64546 64547 64548 64549 64552 64553 64556 64557 64560 64561 64562 64564 64565 64566 64568 64569 64570 64574 64575 64576 64577 64578 64580 64582 64583 64584 64588 64593 64598 64599 64601 64602 64605 64610 64611 64620 64621 65397 65398 65470 65471 65472 65473 65474 65479 65480 65484 65485 65490 65502 65505 65511 65514 65523 65524 65528 65534 65609} ?
*i 120.29.240.0/21 206.223.143.99 21 150 0 4739 45158 {64512 64514 64516 64519 64521 64522 64525 64526 64528 64529 64530 64535 64537 64538 64541 64542 64543 64544 64545 64546 64547 64548 64549 64552 64553 64556 64557 64560 64561 64562 64564 64565 64566 64568 64569 64570 64574 64575 64576 64577 64578 64580 64582 64583 64584 64588 64593 64598 64599 64601 64602 64605 64610 64611 64620 64621 65397 65398 65470 65471 65472 65473 65474 65479 65480 65484 65485 65490 65502 65505 65511 65514 65523 65524 65528 65534 65609} ?
* 120.29.240.0/21 77.67.76.237 1546 150 0 3257 7473 7474 9300 45158 i
        Last update to IP routing table: 0h25m47s, 1 path(s) installed:
        Route is advertised to 1 peers:
         213.144.128.180(13030)

The same here, but in my case we are the downstream ourselves :)

JunOS logs:

Nov 17 01:28:54.955 2010 <gw> rpd[1391]: bgp_read_v4_update:8283: NOTIFICATION sent to <peer> (External AS 22822): code 3 (Update Message Error) subcode 1 (invalid attribute list)
Nov 17 01:28:54.982 2010 <gw> rpd[1391]: Received BAD update from <peer> (External AS 22822), aspath_attr():3055 PA4_TYPE_AS4PATH(17) => 50 times FLAPPED family inet-unicast(1), prefix 120.29.240.0/21
Nov 17 01:28:55.009 2010 <gw> rpd[1391]: RPD_BGP_NEIGHBOR_STATE_CHANGED: BGP peer <peer> (External AS 22822) changed state from Established to Idle (event RecvUpdate)

hey,

What version are you running?

This seems to hit some versions in an even worse way.

After some investigation I can post a summary of the incident.

At appx. 9:33 CET we saw the first flaps, affecting most of our downstreams, with Cisco and Juniper routers. Our backbone, based on Brocade XMR, was not affected, apart from the number of BGP updates which caused some CPU load.

Ironically the incident prefix got picked up by a Cisco edge router, and the session to the peer (AS4739) where the prefix got injected didn't crash either.

We filtered the evil prefix, and then the systems became stable again.

Meanwhile AS4739 shut down the BGP session with the originator AS45158 (thanks MMC).

The announcement from the originator is rather unusual, I'd say: as we can see, it's a BGP confederation of no fewer than 77 private AS numbers. I don't know what it would be useful for...

We asked some customers what gear they are running, and here is a short compilation - all these systems were affected by the BGP flaps:

- Cisco 2821 - c2800nm-advipservicesk9-mz.124-20.T4
- Cisco 2821 - c2800nm-advipservicesk9-mz.124-24.T1.bin
- Cisco ASR1002F - asr1000rp1-adventerprisek9.03.01.01.S.150-1.S1.bin
- Juniper MX480 - junos 10.0R3.10

We did not observe any flaps on Quagga. Also, not a single iBGP session was affected within our Brocade / Cisco network.

Best regards,

Fredy Kuenzler
Init7 / AS13030

The announcement from the originator is rather unusual, I'd say: as we can see, it's a BGP confederation of no fewer than 77 private AS numbers. I don't know what it would be useful for...

It's not a confederation, but an AS_SET. Confederation paths are represented like this:

(65251 65200 65202)

whereas an AS_SET is an unordered list, used for loop detection when aggregating routes. Hmm... It appears that remove-private-as does NOT strip private AS numbers from an AS_SET. Does anyone know why you wouldn't want it to? The route would be filtered by anyone using one of the above ASes, which is exactly why they should be stripped. Does anyone have a customer router, or run a confederation, using one of those ASes and can verify?
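
To make the notation concrete, here is a minimal Python sketch that splits a textual AS path into its segments, using the display conventions seen in this thread: bare numbers for an AS_SEQUENCE, {...} for an AS_SET and (...) for a confederation sequence. The function names and the abbreviated example path are illustrative assumptions, not any router's parser, and only the 16-bit private range 64512-65534 is checked.

```python
import re

# 16-bit private-use ASNs: 64512..65534 inclusive
PRIVATE_16BIT = range(64512, 65535)


def _numbers(body):
    """Extract ASNs from a segment body (space- or comma-separated)."""
    return [int(n) for n in re.findall(r"\d+", body)]


def parse_as_path(text):
    """Split a textual AS path into (segment_type, [asns]) tuples."""
    segments = []
    for m in re.finditer(r"\{([^}]*)\}|\(([^)]*)\)|(\d+)", text):
        set_body, confed_body, single = m.groups()
        if set_body is not None:
            segments.append(("AS_SET", _numbers(set_body)))
        elif confed_body is not None:
            segments.append(("AS_CONFED_SEQUENCE", _numbers(confed_body)))
        elif segments and segments[-1][0] == "AS_SEQUENCE":
            # consecutive bare ASNs belong to the same ordered sequence
            segments[-1][1].append(int(single))
        else:
            segments.append(("AS_SEQUENCE", [int(single)]))
    return segments


def is_private(asn):
    return asn in PRIVATE_16BIT


# Abbreviated version of the path from this thread (only 3 of the set members)
segs = parse_as_path("4739 45158 {64512 64514 65609}")
for seg_type, asns in segs:
    print(seg_type, asns, "private:", [a for a in asns if is_private(a)])

# The confederation example from above
print(parse_as_path("(65251 65200 65202)"))
```

Note that 65609 is not reported as private here, since it lies above the 16-bit range.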

Bringing up an earlier thread: we rejected the route because we've set a maxas-limit of 50 and log violations, so I woke up to find these messages in my email this morning rather than getting a phone call in the middle of the night:

Nov 17 02:32:46.133 CST: %BGP-6-ASPATH: Long AS path 4323 4739 45158 {64512,64514,64516,64519,64521,64522,64525,64526,64528,64529,64530,64535,64537,64538,64541,64542,64543,64544,64545,64546,64547,64548,64549,64552,64553,64556,64557,64560,64561,64562,64564,64565,64566,64568,64569,64570 received from 207.250.148.153: Prefixes: 120.29.240.0/21
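
The idea behind such a limit can be sketched in a few lines of Python. This is only an illustration of the counting, not the router's implementation: the function names are made up, and whether a given implementation counts each AS_SET member or treats the whole set as one hop is an assumption exposed here as a parameter. The set members are copied from the router output earlier in the thread, and the leading sequence from the log line above.

```python
# AS_SET members as shown in the router output earlier in this thread
AS_SET_MEMBERS = [int(a) for a in """
64512 64514 64516 64519 64521 64522 64525 64526 64528 64529
64530 64535 64537 64538 64541 64542 64543 64544 64545 64546 64547 64548
64549 64552 64553 64556 64557 64560 64561 64562 64564 64565 64566 64568
64569 64570 64574 64575 64576 64577 64578 64580 64582 64583 64584 64588
64593 64598 64599 64601 64602 64605 64610 64611 64620 64621 65397 65398
65470 65471 65472 65473 65474 65479 65480 65484 65485 65490 65502 65505
65511 65514 65523 65524 65528 65534 65609
""".split()]

# Path as seen in the log line: 4323 4739 45158 {...}
PATH = [("AS_SEQUENCE", [4323, 4739, 45158]), ("AS_SET", AS_SET_MEMBERS)]


def as_path_length(segments, count_set_members=True):
    """Count ASNs in a path; optionally treat each AS_SET as a single hop."""
    total = 0
    for seg_type, asns in segments:
        if seg_type == "AS_SET" and not count_set_members:
            total += 1
        else:
            total += len(asns)
    return total


def exceeds_limit(segments, limit, count_set_members=True):
    """True if the path is longer than the configured limit."""
    return as_path_length(segments, count_set_members) > limit


print(as_path_length(PATH))                           # 80
print(as_path_length(PATH, count_set_members=False))  # 4
print(exceeds_limit(PATH, 50))                        # True
```

With every set member counted, the path is well over a limit of 50; if the set counted as a single hop, it would pass.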

-James

I think we really need a community tool to test BGP implementations against
known/past bugs and unknown (fuzzed) bugs.

A simple script with two eBGP sessions to the network being tested, one
injecting routes and the other receiving them, should be enough.

There are quite a few BGP APIs out there with pretty ready infra; some even
have test infra ready:
https://github.com/jesnault/bgp4r

I'm happy to contribute test cases if such a project surfaces, but at the
moment I'm not ready to start the project. Maybe next time my network blows
up due to a BGP bug, I'll find time.
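
As a rough idea of what one such test case could start from, here is a Python sketch that encodes an AS_PATH and an AS4_PATH path attribute (type codes 2 and 17, the latter being the one named in the JunOS log above) for a path containing an AS_SET. It only builds the attribute bytes; it does not open a session or send anything, the function names are hypothetical, and the path is an abbreviated version of the one in this thread. ASNs that do not fit in 16 bits are replaced by AS_TRANS (23456) in the 2-byte encoding.

```python
import struct

AS_SET, AS_SEQUENCE = 1, 2              # path segment types
AS_TRANS = 23456                        # stand-in for ASNs above 65535
ATTR_AS_PATH, ATTR_AS4_PATH = 2, 17     # path attribute type codes
FLAG_OPTIONAL, FLAG_TRANSITIVE, FLAG_EXT_LEN = 0x80, 0x40, 0x10


def encode_segments(segments, asn_bytes):
    """Encode (segment_type, [asns]) pairs with 2- or 4-byte ASNs."""
    fmt = "!H" if asn_bytes == 2 else "!I"
    out = b""
    for seg_type, asns in segments:
        if len(asns) > 255:
            raise ValueError("a segment holds at most 255 ASNs")
        out += struct.pack("!BB", seg_type, len(asns))
        for asn in asns:
            if asn_bytes == 2 and asn > 0xFFFF:
                asn = AS_TRANS          # 32-bit ASN cannot be sent in 2 bytes
            out += struct.pack(fmt, asn)
    return out


def encode_attribute(flags, type_code, value):
    """Wrap a value in a path attribute header (extended length if needed)."""
    if len(value) > 255:
        header = struct.pack("!BBH", flags | FLAG_EXT_LEN, type_code, len(value))
    else:
        header = struct.pack("!BBB", flags, type_code, len(value))
    return header + value


# Abbreviated path: sequence 4739 45158 followed by a 3-member AS_SET
segments = [(AS_SEQUENCE, [4739, 45158]), (AS_SET, [64512, 64514, 65609])]

as_path = encode_attribute(
    FLAG_TRANSITIVE, ATTR_AS_PATH, encode_segments(segments, 2))
as4_path = encode_attribute(
    FLAG_OPTIONAL | FLAG_TRANSITIVE, ATTR_AS4_PATH, encode_segments(segments, 4))

print(as_path.hex())
print(as4_path.hex())
```

A fuzzing variant could then mutate segment lengths or types in these bytes before injecting them on the test session.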

Hi,

(forgot list)

4739 45158 {64512 64514 64516 64519 64521 64522 64525 64526 64528 64529
64530 64535 64537 64538 64541 64542 64543 64544 64545 64546 64547 64548
64549 64552 64553 64556 64557 64560 64561 64562 64564 64565 64566 64568
64569 64570 64574 64575 64576 64577 64578 64580 64582 64583 64584 64588
64593 64598 64599 64601 64602 64605 64610 64611 64620 64621 65397 65398
65470 65471 65472 65473 65474 65479 65480 65484 65485 65490 65502 65505
65511 65514 65523 65524 65528 65534 65609} ?

The announcement from the originator is rather unusual, I'd say: as we
can see, it's a BGP confederation of no fewer than 77 private AS
numbers. I don't know what it would be useful for...

One minor correction here: 65609 is not a private ASN; it's a reserved one
in the 32-bit ASN space (65609 > 65535, which is 2^16-1).

Looking at my Juniper's sh rou ... detail, it showed me the AS_SET with
AS_TRANS in ASN16_PATH, and AS65609 in ASN32_PATH and ASN-MERGED_PATH.
What surprised me a bit was that AS_TRANS was at the beginning of the
AS_SET, while 65609 was listed at the end of the AS_SET; this may or
may not be an issue of presentation only, and may or may not be a problem.

In the end it wouldn't surprise me if one or another implementation
screwed up exactly because of ASN32 here.

my 2c,
-mc

hey,

Looks like this broken update was around from 08:32:15 UTC until 09:47:44 UTC (this matches what we saw):
http://www.ris.ripe.net/dashboard/120.29.240.0/21

Other providers, like Easynet, have also hit the news with unexpected trouble this morning. Do a RIS search for 87.80.0.0/13: lots of prefix instability with exactly these start and end times. I'm doing some more digging, but any folks with already existing tools (hint, hint, Renesys :) ) are welcome to help. It seems it was a bit more widespread than first thought.

* Saku Ytti:

I think we really need a community tool to test BGP implementations against
known/past bugs and unknown (fuzzed) bugs.

Testing is the easy part. Meeting all the requirements for getting
the fix rolled out on the (relevant parts of the) Internet is
impossible because many ISPs have little experience upgrading their
routers.