Why do ROV-ASes announce some invalid routes?

We learned from Cloudflare's https://isbgpsafeyet.com/ that some ASes have deployed RPKI Origin Validation (ROV). However, we downloaded BGP collection data from the RouteViews and RIPE RIS platforms and found that some ROV-ASes still announce invalid routes. For example, in the RIB data from 2022-10-31 00:00:00, 13 of the 17 ASes listed as having deployed ROV announced invalid routes. The number of invalid prefixes for each AS is listed below.

ASN      Invalid prefixes
3356     7
1299     23
174      31
2914     4
6939     361
3257     15
6453     273
3491     16
9002     2
5511     56
7922     17
13335    10
16509    5

For comparison, we counted the invalid routes announced by the non-ROV ASes (as listed on https://isbgpsafeyet.com/):

ASN      Invalid prefixes
6762     597
6461     603
1273     587
12956    11
12389    161
20485    162
701      559
7473     492
9009     380
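
The per-AS counts above depend on classifying each route against the published ROAs. As a minimal sketch of that classification (following the valid / invalid / not-found logic of RFC 6811), the code below uses a hypothetical ROA list with documentation prefixes and private-use ASNs. The attribution rule in `invalid_prefixes_per_as` (every AS on the path is counted) is an assumption for illustration, not necessarily the method used for the tables.

```python
from collections import defaultdict
from ipaddress import ip_network

# Hypothetical ROA set: (prefix, max length, authorized origin ASN).
ROAS = [
    ("192.0.2.0/24", 24, 64500),
    ("2001:db8::/32", 48, 64501),
]

def validate(prefix, origin_asn, roas=ROAS):
    """Classify a route as 'valid', 'invalid', or 'not-found'."""
    route = ip_network(prefix)
    covered = False
    for roa_prefix, max_len, roa_asn in roas:
        roa_net = ip_network(roa_prefix)
        # Only ROAs of the same address family that cover the route matter.
        if route.version != roa_net.version or not route.subnet_of(roa_net):
            continue
        covered = True
        if route.prefixlen <= max_len and origin_asn == roa_asn:
            return "valid"
    return "invalid" if covered else "not-found"

def invalid_prefixes_per_as(rib_entries, roas=ROAS):
    """rib_entries: iterable of (prefix, as_path), as_path a list of ASNs
    with the origin last. Counts distinct invalid prefixes for every AS
    that appears on the path (assumed attribution rule)."""
    seen = defaultdict(set)
    for prefix, as_path in rib_entries:
        if validate(prefix, as_path[-1], roas) == "invalid":
            for asn in set(as_path):
                seen[asn].add(prefix)
    return {asn: len(prefixes) for asn, prefixes in seen.items()}
```

Counting distinct prefixes rather than RIB entries avoids inflating the numbers when the same invalid route is seen from several collector peers.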

We can see that the ROV ASes announced noticeably fewer invalid routes than the non-ROV ASes, though they did not filter all invalids.
AS6939 announced noticeably more invalid routes than the other ROV-ASes. We learned from a discussion two years ago (Reactive RPKI ROV (Was: Hurricane Electric has reached 0 RPKI INVALIDs)) that AS6939 uses reactive ROV, i.e., route collectors identify invalid routes and write them into scripts that are sent to the routers, which then send withdrawals for the invalids based on those scripts.
However, we downloaded the two hours of updates following the 2022-10-31 00:00:00 RIB snapshot and found very few withdrawals from AS6939 for those invalid routes in the first hour. In the second hour, AS6939 withdrew hundreds of invalid prefixes, but most of these withdrawals were followed by another announcement of the same prefix with the same invalid origin AS.
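
The withdraw-then-reannounce pattern described above can be looked for mechanically. The following is only a sketch: it assumes the updates from one collector peer have already been parsed into simple tuples (the tuple layout and the function name `flapped_back` are hypothetical), and it seeds its state from the RIB snapshot.

```python
def flapped_back(updates, rib=None):
    """Find withdrawals followed by a re-announcement of the same prefix
    with the same origin AS.

    updates: chronological (timestamp, kind, prefix, origin) tuples from one
             peer; kind is 'A' (announce) or 'W' (withdraw, origin is None).
    rib:     optional {prefix: origin} state taken from the RIB snapshot.
    Returns a list of (prefix, origin, withdraw_ts, reannounce_ts).
    """
    current = dict(rib or {})  # prefix -> origin currently announced
    withdrawn = {}             # prefix -> (origin before withdrawal, timestamp)
    events = []
    for ts, kind, prefix, origin in updates:
        if kind == "W":
            # Remember what was withdrawn so a later announce can be compared.
            if prefix in current:
                withdrawn[prefix] = (current.pop(prefix), ts)
        elif kind == "A":
            if prefix in withdrawn:
                old_origin, w_ts = withdrawn.pop(prefix)
                if old_origin == origin:
                    events.append((prefix, origin, w_ts, ts))
            current[prefix] = origin
    return events
```

Restricting the input to prefixes already classified as invalid, and looking at the gap between the two timestamps, would show how long each invalid actually stayed withdrawn.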

Can anyone help us to correctly interpret this case? Thank you very much.

Dear 孙乐童,

You ask great questions! I hope an answer to your questions can be found
in a message I sent a year ago:

  Cogent RPKI invalid filtering

The summary: in any sufficiently large network, chances are not 100% of
all equipment supports RPKI-based BGP Route Origin Validation; in such
cases a handful of invalid routes may still percolate through the
system. Another contributing factor might be certain types of software
upgrades; where ROV temporarily is disabled on one or more devices. Or
perhaps an ISP made a handful of exceptions for test/beacon invalid
routes to propagate.

Kind regards,

Job

aside from technical reasons for an ROV-supporting AS (RAS) to announce
an ROV invalid prefix, there is an administrative one. the RAS's
customers *pay* RAS to announce the customers' prefixes. so RAS is
configured to propagate their customers' announcements without dropping
invalids.

randy

Hello Job,
  Thank you very much for your reply! I understand that no AS can actually filter all invalids. However, I was trying to figure out why we could not see a reasonable number of withdrawals from AS6939 for the invalid prefixes, given how they explained their ROV implementation (Reactive RPKI ROV (Was: Hurricane Electric has reached 0 RPKI INVALIDs)). Perhaps we need to learn more about their detailed implementation.
  Thank you very much!

Best wishes,
Sun Letong

On 2022-11-08 00:11:24, Job Snijders <job@fastly.com> wrote:

<note I didn't look at the RV data for this>

There are 2 sides to the BGP conversation for any ASN, and then really 4 sides:
  customer -> RAS -> peer (settlement-free)
  peer(sfp) -> RAS -> customer
  customer -> RAS -> transit
  transit -> RAS -> customer

Depending on the RAS's capabilities or status in their journey to
'fully RAS', it's possible that they may have:
  o "We OV all customer sessions" (notably not SFP peers)
  o "We OV all sessions(*)" (noting not all, and maybe depending on
    platform specifics)

There are a bunch of ways this goes wrong :( This also doesn't really
tell what sort of peering the RAS has set up with RouteViews
(customer? peer? partial peer?)

Also, also, possibly the output path on the session(s) here is not
filtering in an OV fashion.
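
To make the point above concrete, here is a toy model of which routes a collector would see, depending on which session types have OV on import and on what kind of session the collector has with the RAS. The export rules are the usual customer/peer/transit conventions and are an assumption here, as are all names in the code. No filtering is applied on output.

```python
# Assumed export conventions: which learned-from types are sent to a
# neighbor of the given type.
EXPORT_RULES = {
    "customer": {"customer", "peer", "transit"},  # customers get everything
    "peer": {"customer"},                         # peers get customer routes
    "transit": {"customer"},                      # transits get customer routes
}

def visible_at_collector(routes, ov_on, collector_session):
    """routes: (prefix, ov_state, learned_from) tuples, with learned_from in
    {'customer', 'peer', 'transit'} and ov_state in
    {'valid', 'invalid', 'not-found'}.
    ov_on: set of session types whose import policy drops invalids.
    collector_session: how the collector is treated by the RAS.
    Returns the routes that survive import and are exported."""
    allowed = EXPORT_RULES[collector_session]
    return [
        (prefix, state, src)
        for prefix, state, src in routes
        if src in allowed and not (state == "invalid" and src in ov_on)
    ]
```

With `ov_on={"customer"}` and a collector treated as a customer, an invalid learned from a settlement-free peer still shows up at the collector, while with `ov_on={"peer", "transit"}` an invalid learned from a paying customer is exported to every neighbor type.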

ROV belongs on the input path, let's not ROV on the output towards
customers / route collectors.

Announcing bigger, ROV valid/unknown aggregates, while really routing
based on possibly ROV-invalid more specifics in the FIB is akin to
actively obscuring routing security, "cheating" your way to a RAS.

Yes, there are some very specific situations where output ROV is
beneficial (a peering box not supporting ROV and you ask your peer to
ROV their output), but let's not normalize ROV on the output path.

Thanks,
Lukas

ROV belongs on the input path, let's not ROV on the output towards
customers / route collectors.

RFC 8893

randy

FYI, Huawei routers support Egress ROV.

sure. This assumes a 100% coverage for all inputs to the rib-out on
the customer port we're talking about, though.
If you don't have 100% coverage you'll end up with the leaks
seen/reported by the OP.

I don't mean to say/imply:
  "Hey, everyone(anyone) should do OV on output"

I mean to say that:
  "Hey, if you see OV failures leaking, this is probably a side effect
   of the behavior/design choices a network made (not doing OV
   filtering on one of peer/customer/transit type peerings)."

-chris