New BGP hijack & visibility tool “BGPalerter”

Dear NANOG,

Recently NTT investigated how best to monitor the visibility of our own and our subsidiaries’ IP resources in the BGP Default-Free Zone. We were specifically looking at how to get near real-time alerts funneled into an actionable pipeline for our NOC & Operations department when BGP hijacks happen.

Previously we relied on a commercial “BGP Monitoring as a Service” offering, but with the advent of RIPE NCC’s “RIS Live” streaming API [1] we saw greater potential for a self-hosted approach designed specifically for custom integrations with various business processes. We decided to write our own tool “BGPalerter” and share the source code with the Internet community.

BGPalerter allows operators to specify in great detail how to distribute meaningful information from the firehose of various BGP data sources (we call them “connectors”), through data processors (called “monitors”), and finally output it through “reports” into whatever mechanism is appropriate (Slack, IRC, email, or a call to your ticketing system’s API).
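
To make that flow a bit more concrete, here is a conceptual sketch in
TypeScript. The names below are invented purely for illustration; they
are not BGPalerter's actual internals or configuration format, for which
please see the repository documentation.

    // Conceptual sketch only: illustrates the connector -> monitor -> report
    // flow described above. These interfaces are invented for illustration
    // and are not BGPalerter's actual internal API.

    interface BgpUpdate {                 // one message from a BGP data source
      prefix: string;
      asPath: number[];
      peer: string;
    }

    interface Alert {
      severity: "info" | "warning" | "critical";
      message: string;
    }

    interface Connector {                 // e.g. a RIS Live WebSocket client
      onUpdate(handler: (update: BgpUpdate) => void): void;
    }

    interface Monitor {                   // e.g. a hijack or visibility check
      process(update: BgpUpdate): Alert | null;
    }

    interface Report {                    // e.g. Slack, IRC, email, ticket API
      send(alert: Alert): void;
    }

    // Every update from every connector flows through every monitor, and any
    // resulting alert is fanned out to every configured report.
    function pipeline(connectors: Connector[], monitors: Monitor[], reports: Report[]): void {
      for (const connector of connectors) {
        connector.onUpdate((update) => {
          for (const monitor of monitors) {
            const alert = monitor.process(update);
            if (alert !== null) {
              reports.forEach((report) => report.send(alert));
            }
          }
        });
      }
    }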

The source code is available on GitHub, under a liberal open source license to foster community collaboration:

https://github.com/nttgin/BGPalerter

If you wish to contribute to the project, please use GitHub’s “issues” or “pull request” features. Any help is welcome! We’d love suggestions for new features, updates to the documentation, help with setting up a CI regression-testing pipeline, or packaging for common platforms.

Kind regards,

Job & Massimo
NTT Ltd

Excellent, now I don't have to write it myself. Looking forward to testing. Thanks for sharing the fruits of your labor with the community.

Kind regards,
Eric

This is great. Will be testing this later in the day. We, like a lot of others, were using BGPMon.

Job,

I appreciate the effort and the intent behind this project, but why should the community contribute to an open source project on GitHub that is mainly powered by a closed source binary?

Ryan

Hi,

You can build it yourself, see
https://github.com/nttgin/BGPalerter#more-information-for-developers

I think that the binaries are here for those that don’t want to install
the whole build chain.

Hi Ryan, Alarig,

> I appreciate the effort and the intent behind this project, but why
> should the community contribute to an open source project on GitHub
> that is mainly powered by a closed source binary?

> You can build it yourself, see
> https://github.com/nttgin/BGPalerter#more-information-for-developers
>
> I think that the binaries are here for those that don’t want to install
> the whole build chain.

Indeed, the binary files in the 'bin/' directory of the GitHub
repository are merely provided as a convenience so interested people
don't need to compile the software themselves in order to run tests.
This project is 100% open source.

At some point in the future the ready-made binaries should move to a
different place; for example, perhaps we can distribute packages through
the PPA mechanism for Debian/Ubuntu. It would be cool if we get to the
point where one can install the software by simply issuing a command
like "apt install bgpalerter". Help with packaging is most welcome! :-)

Kind regards,

Job

This looks like fun!
(a few questions for the RIPE folk below, I think)

What is the expected load from streaming clients on the RIPE service? (I
wonder because I was/am messing about with something similar, though
with less Node and JS... not that that's relevant here).

I hadn't seen the RIPE folk pipe up anywhere with what their SLO/etc.
is for the ris-live service? (except their quip about: "used to run in
a tmux session I had to occasionally ssh into <foo> and restart when
<foo> rebooted"; I believe the end of that quip in Iceland was: "and
now it's running as a real service")

Also, one of the strengths of the 'monitoring as a service' folks is
their number of collection points and the breadth of ASNs to which they
interconnect those points. RIS Live, I think, reports out from ~37 or
so RIPE probes; how do we (the Internet) get more deployed (or better
interconnection to the current sets)? And maybe even more
importantly... what's the right spread/location/interconnectivity map
for these probes?

Thanks for showing what's possible with tooling being developed by
like-minded individuals :-)

-chris

Hi,

> This looks like fun!
> (a few questions for the RIPE folk below, I think)
>
> What is the expected load from streaming clients on the RIPE service? (I
> wonder because I was/am messing about with something similar, though
> with less Node and JS... not that that's relevant here).

One of the (IMO) most useful features is that you can filter what you
want to receive. In fact this makes the service useful :-) So unless you
want to tune in to a significant portion of BGP chatter, the load should
not be substantial.
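
As a rough example (a sketch from memory using Node.js and the "ws"
package; please double-check the exact message fields against the RIS
Live documentation), a client that only cares about a single prefix
could subscribe like this rather than taking the full stream:

    // Rough sketch of a filtered RIS Live subscription (Node.js + the "ws"
    // package). The message format is from memory of the RIS Live docs;
    // verify the field names before relying on this.
    import WebSocket from "ws";

    const ws = new WebSocket("wss://ris-live.ripe.net/v1/ws/?client=example-client");

    ws.on("open", () => {
      // Ask only for updates covering one prefix (and its more-specifics)
      // instead of the full firehose. 192.0.2.0/24 is a documentation prefix.
      ws.send(JSON.stringify({
        type: "ris_subscribe",
        data: { prefix: "192.0.2.0/24", moreSpecific: true },
      }));
    });

    ws.on("message", (raw) => {
      const message = JSON.parse(raw.toString());
      if (message.type === "ris_message") {
        console.log(message.data);   // hand off to your own processing here
      }
    });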

> I hadn't seen the RIPE folk pipe up anywhere with what their SLO/etc.
> is for the ris-live service? (except their quip about: "used to run in
> a tmux session I had to occasionally ssh into <foo> and restart when
> <foo> rebooted"; I believe the end of that quip in Iceland was: "and
> now it's running as a real service")

It's in between those. We now have a conscious setup which should also
be able to scale up, but bits and pieces (like full monitoring of the
service) are still being developed.

> Also, one of the strengths of the 'monitoring as a service' folks is
> their number of collection points and the breadth of ASNs to which they
> interconnect those points. RIS Live, I think, reports out from ~37 or
> so RIPE probes; how do we (the Internet) get more deployed (or better
> interconnection to the current sets)? And maybe even more
> importantly... what's the right spread/location/interconnectivity map
> for these probes?

RIS Live provides data from RIS, which has a bunch of collectors around
the world (see
https://www.ripe.net/analyse/internet-measurements/routing-information-service-ris/ris-peering-policy)
with many hundreds of peering sessions. But it is by no means complete
in terms of coverage.

If and how the community (NANOG or RIPE or else) should work on optimal
data collection is indeed a useful discussion to have.

Cheers,
Robert

I think Chris's question is more "Is RIPE going to be OK if a lot of people ask
for the extra-chatty feed?"

Yes, good point. Of course it could be a problem if too many clients ask
for too much data like a full feed... we're not prepared to provide that
on a large scale. For the moment we're looking at the effects of what
people need and if we can handle it with what we have built.

Robert

Hi,

>> This looks like fun!
>> (a few questions for the RIPE folk below, I think)
>>
>> What is the expected load from streaming clients on the RIPE service? (I
>> wonder because I was/am messing about with something similar, though
>> with less Node and JS... not that that's relevant here).

> One of the (IMO) most useful features is that you can filter what you
> want to receive. In fact this makes the service useful :-) So unless you
> want to tune in to a significant portion of BGP chatter, the load should
> not be substantial.

yup, I can see a use case clearly for: "This is my prefix set, and my
transit-as-set, tell me when there are deviations" (which is probably
2 different connections with 2 different filters to the non-firehose
feed - oh, the docs say you can provide more than one filter, ok...
cool)

The firehose is perhaps more friendly for folks like an ISP that could
offer some form of monitoring for their customers' prefixes?
It's also useful (to me anyway) to tell me: "I see prefix-A picked up
a new Origin? odd?" or "Wow, someone 7007'd themselves!"

which isn't (to me anyway) clearly simple to do in the non-firehose
version of the stream/service...

The firehose also looks like a great feed to add to my other internal
route monitoring things:
  1) get bgp data from my firewall's upstream devices
  2) get bgp from my internal network
  3) eat bmp from my PE/CE device set
  4) add rislive-firehose
  5) add routeviews/ris update data when available (poll every 15 min,
process mrt && ingest data)

determine what patterns/filters/things I want to monitor: "did prefix X
just change upstream ASN and should I bias traffic differently toward
that prefix?" etc...
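
(to make that concrete: something as naive as the sketch below would
already catch the "picked up a new Origin" case from above, assuming
you maintain a table of expected origins per prefix; the prefix, ASN
and names here are made up)

    // Illustrative only: a naive "unexpected origin" check of the kind
    // described above. The expected-origins table is something you'd
    // maintain yourself; the entries below are made up.
    const expectedOrigins: Record<string, Set<number>> = {
      "192.0.2.0/24": new Set([64500]),   // documentation prefix, private ASN
    };

    function checkOrigin(prefix: string, asPath: number[]): string | null {
      const origin = asPath[asPath.length - 1];   // origin is the last ASN in the path
      const expected = expectedOrigins[prefix];
      if (expected !== undefined && !expected.has(origin)) {
        return `prefix ${prefix} seen with unexpected origin AS${origin}`;
      }
      return null;
    }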

>> I hadn't seen the RIPE folk pipe up anywhere with what their SLO/etc.
>> is for the ris-live service? (except their quip about: "used to run in
>> a tmux session I had to occasionally ssh into <foo> and restart when
>> <foo> rebooted"; I believe the end of that quip in Iceland was: "and
>> now it's running as a real service")

> It's in between those. We now have a conscious setup which should also
> be able to scale up, but bits and pieces (like full monitoring of the
> service) are still being developed.

ok, cool! As with my question to John Curran about ARIN service SLOs,
I'm really asking:
  "Hey, if I'm inputting this data into my business process I want to
know what to expect from a performance/scalability/outage/reliability
perspective"

if that's not written down and published then some folks MAY choose to
believe: "Well, it's available now, and now, and now... so 'always,
100%!!' seems sane!"
or others may choose to believe: "Well, nice toy you have there... let
me know when it's ready for me to ingest into my production
monitoring/etc systems" <toddles off to the corner to play ball with
Cartman...>

>> Also, one of the strengths of the 'monitoring as a service' folks is
>> their number of collection points and the breadth of ASNs to which they
>> interconnect those points. RIS Live, I think, reports out from ~37 or
>> so RIPE probes; how do we (the Internet) get more deployed (or better
>> interconnection to the current sets)? And maybe even more
>> importantly... what's the right spread/location/interconnectivity map
>> for these probes?

> RIS Live provides data from RIS, which has a bunch of collectors around
> the world (see
> https://www.ripe.net/analyse/internet-measurements/routing-information-service-ris/ris-peering-policy)
> with many hundreds of peering sessions. But it is by no means complete
> in terms of coverage.
>
> If and how the community (NANOG or RIPE or else) should work on optimal
> data collection is indeed a useful discussion to have.

ok, cool! :-)