CIDR cleanup

John_Von_Essen · October 1, 2020, 1:32pm

Sorry if this is slightly off-topic, but I am writing some code for a custom GeoDNS routemap. My starting data set is a raw list of /24 subnets with no prefix aggregation done. In other words, it's the entire BGP routing table in /24 prefixes, tagged by geo region. Each region is its own txt file with a dump of /24s. As a result, these lists are HUGE. I want to aggregate the prefixes as much as possible to create a smaller routemap.

So right now it looks like:

...
105.170.72.0/24 brs
105.170.73.0/24 brs
105.170.74.0/24 brs
105.170.75.0/24 brs
105.170.76.0/24 brs
105.170.77.0/24 brs
105.170.78.0/24 brs
105.170.79.0/24 brs
105.170.80.0/24 brs
105.170.81.0/24 brs
105.170.82.0/24 brs
105.170.83.0/24 brs
105.170.84.0/24 brs
…

and so on. Obviously, 105.170.72.0/24 through 105.170.79.0/24 can be aggregated to 105.170.72.0/21, and so on. I normally use Perl; does anyone know if there is a Perl module that will automatically do this prefix aggregation? I tried to write my own code to do this, and it's not trivial, so I'm just looking for a shortcut. I took a brief glance at some CIDR-related Perl CPAN modules, and nothing has jumped out.

Thanks
John
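
For comparison, the aggregation described above can be sketched in Python with the standard library's `ipaddress.collapse_addresses`. This is only an illustration, not the Perl module being asked about. It assumes every input line has the form `prefix region`, as in the sample, and the function name `aggregate_by_region` is hypothetical.

```python
import ipaddress
from collections import defaultdict

def aggregate_by_region(lines):
    """Group 'prefix region' lines by region and collapse each group."""
    by_region = defaultdict(list)
    for line in lines:
        parts = line.split()
        if len(parts) != 2:
            continue  # skip blank or malformed lines
        prefix, region = parts
        by_region[region].append(ipaddress.ip_network(prefix))
    # collapse_addresses merges adjacent and overlapping networks
    return {
        region: [str(net) for net in ipaddress.collapse_addresses(nets)]
        for region, nets in by_region.items()
    }

# The thirteen /24s from the sample above, 105.170.72.0 through 105.170.84.0
sample = [f"105.170.{i}.0/24 brs" for i in range(72, 85)]
result = aggregate_by_region(sample)
for region, prefixes in result.items():
    for prefix in prefixes:
        print(prefix, region)
# prints:
# 105.170.72.0/21 brs
# 105.170.80.0/22 brs
# 105.170.84.0/24 brs
```

The /21 only forms because 72 is a multiple of 8; the remaining five /24s collapse into a /22 plus one leftover /24.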

Tim_Jackson · October 1, 2020, 1:44pm

#!/usr/bin/perl
use strict;
use warnings;
use NetAddr::IP qw(Compact);

my @ips = ( '105.170.72.0/24', '105.170.73.0/24', '105.170.74.0/24' );

my @agged = aggregate(@ips);

# NetAddr::IP objects stringify as CIDR, one aggregated prefix per line
print "$_\n" for @agged;

# Turn CIDR strings into NetAddr::IP objects and merge adjacent blocks
sub aggregate {
    my @naddr = map { NetAddr::IP->new($_) } @_;
    my @output = Compact(@naddr);
    return @output;
}

Jon_Meek · October 1, 2020, 1:55pm

The Perl Net::Netmask module is also worth checking out. It may not be better at aggregation, but it does have other functions that could be helpful. I use the shortest match address lookup functions of Net::Netmask very heavily and have reproduced them in an R / C++ package.

Jon
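
Address lookup against a table of prefixes, the kind of function mentioned above, might look roughly like this in Python. This is not Net::Netmask's API. It is a sketch that assumes the lookup wanted for a routemap is "the most specific prefix containing the address"; the function name and the `fallback` tag are hypothetical.

```python
import ipaddress

def most_specific_match(address, table):
    """Return (network, tag) for the longest prefix containing address, or None."""
    addr = ipaddress.ip_address(address)
    best = None
    for network, tag in table:
        # Only compare like with like (IPv4 against IPv4, IPv6 against IPv6)
        if addr.version == network.version and addr in network:
            if best is None or network.prefixlen > best[0].prefixlen:
                best = (network, tag)
    return best

table = [
    (ipaddress.ip_network("105.170.72.0/21"), "brs"),
    (ipaddress.ip_network("105.170.0.0/16"), "fallback"),  # hypothetical tag
]
hit = most_specific_match("105.170.75.9", table)
miss = most_specific_match("192.0.2.1", table)
print(hit, miss)
```

The linear scan keeps the sketch short; for a table the size of the full routing table, a trie or sorted-interval search would be the more usual choice.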

Marcos_Manoni · October 1, 2020, 5:15pm

Hi,

Check https://github.com/job/aggregate6 (thank you, Job)

_Job_Snijders · October 2, 2020, 10:03am

Marco Marzetti (PCCW) wrote an even faster compression tool!

https://github.com/lamehost/aggregate-prefixes

Both these python implementations are meant as replacements for ISC's
vintage 'aggregate' Unix utility, with the notable difference that they
also support IPv6.

Example:

    job@bench ~$ pip3 install aggregate-prefixes

    job@bench ~$ wc -l dfz_ipv4
    810607
    job@bench ~$ cat dfz_ipv4 | time aggregate-prefixes - | wc -l
    141645
    1m40.17s real 1m37.39s user 0m01.60s system

Compressing the whole IPv4 DFZ prefix list takes only 100 seconds.

Kind regards,

Job
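
On the IPv6 point: Python's standard `ipaddress.collapse_addresses` also handles IPv6, but it raises `TypeError` when given a mix of IPv4 and IPv6 networks. The sketch below splits the input by address family first. It does not show how aggregate6 or aggregate-prefixes work internally, and `aggregate_mixed` is a hypothetical name.

```python
import ipaddress

def aggregate_mixed(prefixes):
    """Collapse IPv4 and IPv6 prefixes separately; return strings, IPv4 first."""
    nets = [ipaddress.ip_network(p) for p in prefixes]
    out = []
    for version in (4, 6):
        same_family = [n for n in nets if n.version == version]
        out.extend(str(n) for n in ipaddress.collapse_addresses(same_family))
    return out

merged = aggregate_mixed([
    "192.0.2.0/25", "192.0.2.128/25",        # two halves of 192.0.2.0/24
    "2001:db8::/33", "2001:db8:8000::/33",   # two halves of 2001:db8::/32
])
print(merged)  # ['192.0.2.0/24', '2001:db8::/32']
```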

Bandy_Rush1 · October 2, 2020, 10:39am

ok, i gotta ask. has someone tested to see if they all produce the same
result given the same input? i do not mean to imply they do not. i
just have to wonder.

randy

_Job_Snijders · October 2, 2020, 10:56am

Yes, of course. Marco and I collaborated on the tool's regression
testing.

  job@bench $ aggregate6 < dfz_ipv4 | md5
  066bfea49c4c20fed7d86d355044764a
  job@bench $ aggregate-prefixes < dfz_ipv4 | md5
  066bfea49c4c20fed7d86d355044764a

  job@bench $ aggregate6 < dfz_ipv6 | md5
  1193796d41cc47f32230da281e3ad419
  job@bench $ aggregate-prefixes < dfz_ipv6 | md5
  1193796d41cc47f32230da281e3ad419

Kind regards,

Job
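
The same kind of cross-check can be sketched in Python: run two independent aggregators over one input and compare digests of their output. The `simple_aggregate` function below is a hypothetical, IPv4-only illustration written for this comparison (it is not the algorithm of either tool above), and the reference is the standard library's `ipaddress.collapse_addresses`. It needs Python 3.7 or later for `subnet_of`.

```python
import hashlib
import ipaddress

def simple_aggregate(prefixes):
    """Independent aggregator: drop covered prefixes, then merge sibling pairs."""
    nets = sorted(set(ipaddress.ip_network(p) for p in prefixes))
    stack = []
    for net in nets:
        # Sorted order puts a covering prefix before anything it contains
        if stack and net.subnet_of(stack[-1]):
            continue
        stack.append(net)
        # Merge while the top two entries are the two halves of one supernet
        while len(stack) >= 2:
            a, b = stack[-2], stack[-1]
            if a.prefixlen == b.prefixlen and a.prefixlen > 0 \
                    and a.supernet() == b.supernet():
                stack[-2:] = [a.supernet()]
            else:
                break
    return [str(n) for n in stack]

def digest(lines):
    """MD5 of newline-terminated output, like piping a tool's output to md5."""
    return hashlib.md5(("\n".join(lines) + "\n").encode()).hexdigest()

# Thirteen adjacent /24s plus one overlapping /22 to exercise the covered case
prefixes = [f"105.170.{i}.0/24" for i in range(72, 85)] + ["105.170.76.0/22"]
ours = simple_aggregate(prefixes)
stdlib = [str(n) for n in ipaddress.collapse_addresses(
    ipaddress.ip_network(p) for p in prefixes)]
print(digest(ours) == digest(stdlib))
```

Matching digests only show that the two outputs are byte-identical for this input; they say nothing about inputs that were not tried.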

Bandy_Rush1 · October 2, 2020, 4:33pm

great. thanks. glad to see folk thinking this way.

randy

Markus_Weber_FvD · October 2, 2020, 5:27pm

This is the first time I have published something on GitHub, so please be kind:

https://github.com/FvDxxx/pfxaggr (yet another aggregate tool)

In case you need it even faster (and can accept a few known issues, and that it's old, ugly, and never reviewed):

> wc -l dfz-pfx-20201002-A-v4.txt
813542 dfz-pfx-20201002-A-v4.txt

> time cat dfz-pfx-20201002-A-v4.txt | ./pfxagg -a1 > dfz-4-agg-pfx.log

real 0m1.034s
user 0m0.909s
sys 0m0.232s

> time cat dfz-pfx-20201002-A-v4.txt | aggregate-prefixes > dfz-4-agg-pyth.log

real 1m11.691s
user 1m10.879s
sys 0m0.786s

> diff dfz-4-agg-pyth.log dfz-4-agg-pfx.log
> wc -l dfz-4-agg-pyth.log dfz-4-agg-pfx.log
  141754 dfz-4-agg-pyth.log
  141754 dfz-4-agg-pfx.log
  283508 total

Cheers, Markus