6453 routing leaks (January and Today)

From: Jared Mauch <jared@puck.nether.net>
Date: Thu, 24 Feb 2011 16:59:52 -0500

Based on my detection scripts, there appear to have been a large number of routing leaks from 6453 today.

(shameless plug for http://puck.nether.net/bgp/leakinfo.cgi)

A quick report of the data shows (for today so far) a few thousand more leaks than is normal for a day like today. I've included a snapshot of yesterday below as well.

A more detailed report of the prefixes involved is here:

http://puck.nether.net/~jared/tata-leak-20110224.txt

This seems to be a somewhat common event for 6453. Looking through the history of data available, another event happened on 2011-01-28 as well.

I'm interested in what best operational practices people have employed to help avoid the leaks seen here so I can document them for others to learn to prevent this from happening again.

- Jared

bgp=# select count(blame_asn),blame_asn,asn_responsible from leakinfo where aprox_time::date = '2011-02-24' group by blame_asn,asn_responsible order by 1 desc;
count | blame_asn | asn_responsible
-------+-----------+-----------------
  2208 | 6453 | 6453
   360 | 7473 | 3257
   230 | |
   170 | 17379 | 5511
   130 | 8068 | 3356
    39 | 3225 | 6453
    34 | 45419 | 3356
    26 | 3356 | 3356
    25 | 12180 | 2828
    18 | 22351 | 701
    16 | 7991 | 2914
    16 | 14051 | 1239
    10 | 29571 | 5511
     4 | 32327 | 2828
     4 | 8966 | 2914
     4 | 19080 | 1239
     4 | 30209 | 7018
     4 | 18734 | 701
     4 | 4657 | 3320
     3 | 33748 | 1239
     2 | 5056 | 1239
     2 | 10026 | 2828
     2 | 12252 | 2914
     1 | 11696 | 2828
(24 rows)

bgp=# select count(blame_asn),blame_asn,asn_responsible from leakinfo where aprox_time::date = '2011-02-23' group by blame_asn,asn_responsible order by 1 desc;
count | blame_asn | asn_responsible
-------+-----------+-----------------
   384 | 7473 | 3257
   120 | 17379 | 5511
    48 | |
    27 | 45419 | 3356
    24 | 12180 | 2828
    11 | 23456 | 2914
(6 rows)

bgp=# select count(blame_asn),blame_asn,asn_responsible from leakinfo where aprox_time::date = '2011-01-28' group by blame_asn,asn_responsible order by 1 desc;
count | blame_asn | asn_responsible
-------+-----------+-----------------
  9119 | 6453 | 6453
  2265 | |
   355 | 2914 | 2914
   313 | 7473 | 3257
   250 | 17379 | 5511
   213 | 32592 | 701
   106 | 3790 | 1239
    72 | 19108 | 6461
    62 | 14051 | 1239
    51 | 34977 | 6453
    48 | 31133 | 3356
    47 | 8657 | 174
    32 | 7713 | 2914
    31 | 1257 | 1239
    31 | 8966 | 2914
    30 | 30209 | 7018
    30 | 31133 | 1299
    29 | 8342 | 1239
    24 | 38925 | 3320
    24 | 12180 | 2828
    22 | 8657 | 3549
    21 | 15641 | 3549
    18 | 31133 | 2914
    16 | 15412 | 2914
    15 | 7473 | 3549
    10 | 6762 | 1299
    10 | 6762 | 7018
    10 | 20299 | 1239
    10 | 6762 | 3561
    10 | 6762 | 174
     9 | 4323 | 2914
     7 | 26163 | 6461
     7 | 9505 | 174
     7 | 15149 | 6461
     7 | 9070 | 3549
     7 | 7819 | 6461
     6 | 7473 | 174
     6 | 3216 | 3549
     6 | 1273 | 174
     5 | 8657 | 3356
     5 | 26769 | 3549
     5 | 6762 | 2914
     5 | 6762 | 3356
     4 | 8047 | 701
     4 | 8877 | 174
     4 | 174 | 174
     2 | 20299 | 174
     2 | 7843 | 174
     2 | 7473 | 6453
     2 | 8928 | 3320
     2 | 7991 | 2914
     1 | 1273 | 3549
     1 | 20485 | 2914
     1 | 3216 | 1239
(54 rows)
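
For anyone who wants to compare two days of this output without eyeballing it, here is a minimal Python sketch. It assumes the rows look exactly like the psql output above (count | blame_asn | asn_responsible). The function names and the factor-of-5 threshold are made up for illustration and are not part of the leakinfo scripts. The sample rows are copied from the 2011-02-23 table and the top of the 2011-02-24 table.

```python
def parse_counts(psql_output):
    """Sum leak counts per blame_asn from rows like '2208 | 6453 | 6453'."""
    totals = {}
    for line in psql_output.splitlines():
        fields = [f.strip() for f in line.split("|")]
        if len(fields) != 3 or not fields[0].isdigit():
            continue  # header, separator, or "(N rows)" footer
        blame = fields[1] or "unknown"  # some rows have an empty blame_asn
        totals[blame] = totals.get(blame, 0) + int(fields[0])
    return totals


def new_or_grown(today, baseline, factor=5):
    """ASNs whose count today is at least `factor` times the baseline count.

    An ASN absent from the baseline is treated as having a count of 1,
    so it is flagged once it reaches `factor` leaks (illustrative choice).
    """
    return {
        asn: count
        for asn, count in today.items()
        if count >= factor * max(baseline.get(asn, 0), 1)
    }


FEB_23 = """\
   384 | 7473 | 3257
   120 | 17379 | 5511
    48 | |
    27 | 45419 | 3356
    24 | 12180 | 2828
    11 | 23456 | 2914
"""

FEB_24_TOP = """\
  2208 | 6453 | 6453
   360 | 7473 | 3257
   230 | |
   170 | 17379 | 5511
   130 | 8068 | 3356
"""

if __name__ == "__main__":
    print(new_or_grown(parse_counts(FEB_24_TOP), parse_counts(FEB_23)))
```

With these rows, 7473 and 17379 show up on both days at similar levels, so only the ASNs that are new or sharply higher on the 24th get flagged.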

Can't say if it was a leak or deaggregation, but TATA announcements to
us jumped from about 70,000 to almost 190,000 for a while today, then
dropped back down.

It very much appears to be a leak based on the route-views MRT format updates. There's not a good reason for this observed prefix/path combination:

41.194.32.0/24 | 3549 6453 3356 22351 36939

I don't believe either 3549 or 3356 is buying transit from 6453 to reach the other.
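
The reasoning above can be sketched as a simple path check: if three large networks that are assumed to peer with one another appear back to back in an AS path, the middle one has passed a route learned from one peer on to another. This is only a rough Python illustration and not necessarily how the leakinfo scripts work. The ASSUMED_PEERS set and the function name are hypothetical; the ASNs are picked from those named in this thread, and treating them all as mutual peers is an assumption. It also assumes a plain space-separated path with no AS_SETs.

```python
# Hypothetical set of large networks assumed to peer with one another;
# taken from ASNs named in this thread, not a verified list.
ASSUMED_PEERS = {174, 701, 1239, 1299, 2914, 3356, 3549, 6453, 7018}


def find_peer_leak(as_path, peers=ASSUMED_PEERS):
    """Return (receiver, leaker, source) for the first run of three
    assumed peers in a row, or None if the path has no such run."""
    hops = []
    for asn in map(int, as_path.split()):
        if not hops or hops[-1] != asn:  # collapse AS-path prepending
            hops.append(asn)
    # Slide a three-hop window along the de-duplicated path.
    for left, middle, right in zip(hops, hops[1:], hops[2:]):
        if left in peers and middle in peers and right in peers:
            return left, middle, right
    return None


if __name__ == "__main__":
    print(find_peer_leak("3549 6453 3356 22351 36939"))
```

For the path quoted above, this reports 6453 as the network in the middle, sitting between 3549 and 3356.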

One of the interesting measurements I track (people accuse me of pcaping all BGP updates, which is sorta true with this MRT archive) is the average file size of the route-views archive:

http://archive.routeviews.org/bgpdata/2011.02/UPDATES/

This is a good measure of how stable/unstable the network is. You can typically see when a network has performed some grooming or an event like this just by getting a feel for the file sizes. When they go from ~300KiB on average to something in the multiple megs, you know something happened.
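
A minimal Python sketch of that "feel for the file sizes" idea: compare each file against the median size and flag the ones that are several times larger. The function name, the factor of 4, the placeholder file names, and the sizes in the example are all invented for illustration; fetching the real directory listing is left out.

```python
from statistics import median


def unusually_large(sizes_by_file, factor=4):
    """Return names of files larger than `factor` times the median size."""
    baseline = median(sizes_by_file.values())
    return [name for name, size in sizes_by_file.items()
            if size > factor * baseline]


if __name__ == "__main__":
    KIB = 1024
    # Made-up sizes in bytes: mostly around 300 KiB, one of about 3 MiB.
    sample = {
        "file-a": 300 * KIB,
        "file-b": 310 * KIB,
        "file-c": 290 * KIB,
        "file-d": 3 * 1024 * KIB,
        "file-e": 305 * KIB,
    }
    print(unusually_large(sample))
```

The median is used rather than the mean so that a single multi-megabyte file does not drag the baseline up with it.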

- Jared