afrinic rpki issue

Possible TA malfunction or incomplete VRP file: 73.95% of the ROAs disappeared from afrinic

Hi Randy,
Thank you for sharing this information. Our team is investigating the alert.
Best regards,

Hi all,

It appears PacketVis correctly identified an issue.

AFRINIC's self-signed root AfriNIC.cer [1] points via its SIA to
'afrinic-ca.cer' [2], which in turn references an RPKI Manifest named
'K1eJenypZMPIt_e92qek2jSpj4A.mft'.

The K1eJenypZMPIt_e92qek2jSpj4A Manifest lists 499 Certificate
Authorities. This Manifest represents the demarcation point between
"Afrinic as root CA operator" and "Afrinic hosting rpki on behalf of its
members". In other words, this is an important top-level Manifest in the
critical path towards the ROAs of the Afrinic members.

There was a ~7-hour gap in the validity window of this Manifest and its
companion CRL (from 20221120T000311Z until 20221120T071514Z). The CRL
serials 1E19 and 1E1A, and likewise the Manifest numbers 12B2 and 12B3,
are successive, so no intermediate objects were issued in between.

rpki.afrinic.net/repository/afrinic/K1eJenypZMPIt_e92qek2jSpj4A.crl
    CRL Serial Number: 1E19
    CRL valid since: Nov 18 00:03:11 2022 GMT
    CRL valid until: Nov 20 00:03:11 2022 GMT

    CRL Serial Number: 1E1A
    CRL valid since: Nov 20 07:15:12 2022 GMT
    CRL valid until: Nov 22 07:15:12 2022 GMT

rpki.afrinic.net/repository/afrinic/K1eJenypZMPIt_e92qek2jSpj4A.mft
    Manifest Number: 12B2
    Manifest valid since: Nov 18 00:03:13 2022 GMT
    Manifest valid until: Nov 20 00:03:13 2022 GMT

    Manifest Number: 12B3
    Manifest valid since: Nov 20 07:15:14 2022 GMT
    Manifest valid until: Nov 22 07:15:14 2022 GMT

(The above can be reconstructed using archives from http://www.rpkiviews.org)
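
For anyone who wants to double-check the arithmetic, here is a minimal Python sketch that computes the gap from the timestamps listed above. The helper names (`parse`, `validity_gap`) are made up for this illustration and are not part of any RPKI tooling.

```python
from datetime import datetime, timedelta, timezone

# Timestamp layout as printed in the listing above, e.g. "Nov 20 00:03:11 2022 GMT".
# Note: %b only matches English month abbreviations under an English/C locale.
FMT = "%b %d %H:%M:%S %Y GMT"


def parse(ts: str) -> datetime:
    """Parse one of the timestamps above into a timezone-aware UTC datetime."""
    return datetime.strptime(ts, FMT).replace(tzinfo=timezone.utc)


def validity_gap(prev_until: str, next_since: str) -> timedelta:
    """Time between the old object's expiry and the new object's start (zero if they overlap)."""
    return max(parse(next_since) - parse(prev_until), timedelta(0))


crl_gap = validity_gap("Nov 20 00:03:11 2022 GMT", "Nov 20 07:15:12 2022 GMT")
mft_gap = validity_gap("Nov 20 00:03:13 2022 GMT", "Nov 20 07:15:14 2022 GMT")

# The overall outage runs from the first expiry (CRL) until both objects are valid again (Manifest).
outage = parse("Nov 20 07:15:14 2022 GMT") - parse("Nov 20 00:03:11 2022 GMT")

print("CRL gap:     ", crl_gap)   # 7:12:01
print("Manifest gap:", mft_gap)   # 7:12:01
print("Outage:      ", outage)    # 7:12:03
```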

The rcynic validator hosted at Afrinic also noticed a gap in objects:
https://validator.afrinic.net/rpki/rcynic/rpki.afrinic.net_week_svg.html

A possible recommendation might be to increase the validity window of
these two objects from a sliding 48-hour window to a 1 or 2 week window.
This way any stalling in the issuance process wouldn't cause operational
issues on the weekend.
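
To illustrate the effect of a longer window, here is a rough Python sketch. It assumes (this is an assumption, not something confirmed by Afrinic) that the "valid since" time of the Manifest is also its issuance time, and it replays the same stalled reissuance with different validity windows. The function name `gap_after_stall` is invented for the example.

```python
from datetime import datetime, timedelta, timezone


def gap_after_stall(issued: datetime, validity: timedelta, next_issued: datetime) -> timedelta:
    """Return how long no valid object exists if the next one only appears at next_issued."""
    expires = issued + validity
    return max(next_issued - expires, timedelta(0))


# Manifest 12B2 "valid since" and Manifest 12B3 "valid since", taken from the listing above.
# Assumption: "valid since" coincides with the moment of issuance.
issued = datetime(2022, 11, 18, 0, 3, 13, tzinfo=timezone.utc)
next_issued = datetime(2022, 11, 20, 7, 15, 14, tzinfo=timezone.utc)

results = {}
for label, validity in [
    ("48 hours", timedelta(hours=48)),
    ("1 week", timedelta(weeks=1)),
    ("2 weeks", timedelta(weeks=2)),
]:
    results[label] = gap_after_stall(issued, validity, next_issued)
    print(f"{label:>8}: gap of {results[label]}")
```

With the 48-hour window the stall produces the gap seen on November 20; with a 1 or 2 week window the old object would still have been valid when the new one arrived.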

Kind regards,

Job

[1]: SKI EB:68:0F:38:F5:D6:C7:1B:B4:B1:06:B8:BD:06:58:50:12:DA:31:B6
[2]: SKI 2B:57:89:7A:7C:A9:64:C3:C8:B7:F7:BD:DA:A7:A4:DA:34:A9:8F:80

Hi Job,

Thank you for this good analysis and for sharing your findings.
The issue has since been fixed, and the team will publish a post-mortem once we have made sure it will not recur.
Your recommendation is well noted, and I have cc'd my colleague so that they can take it into consideration in our improvement roadmap.
Best regards,

Hi All,

Did this issue resurface some days ago...?
I had nearly 6000 ROAs on June 1st.
That went to ZERO on June 2nd.

I'm using Routinator. Should I have changed something in my config to accommodate some change?

Best Regards,
Carlos

Hi Carlos,
We are currently experiencing a degradation of our RPKI services. We had to disable the RRDP service so that validators fall back to rsync while the team works on ways to improve the availability of the service. However, this happened before 1 June. We will still investigate to be on the safe side, though so far everything looks good on our side.

For details on this degradation, see the notice below:
https://status.afrinic.net/notices/dkpzrtgqzftlclyg-rrdp-service-degradation
Best regards,

Hi Carlos,

Because of the issues that AfriNIC is facing, they are forcing all traffic from HTTPS to rsync, so you should check if rsync can properly set up outbound connections from your machine. What’s the output you get when you rsync rsync://rpki.afrinic.net/repository/ ?

I do an interactive Routinator validation run with debug logging enabled, like so:

$ routinator -vv vrps -f summary

Then I see the following in the logs:

[WARN] RRDP https://rrdp.afrinic.net/notification.xml: Getting notification file failed with status 204 No Content
[INFO] RRDP https://rrdp.afrinic.net/notification.xml: Update failed and current copy is expired since 2023-05-30 10:43:44 UTC.
[INFO] RRDP https://rrdp.afrinic.net/notification.xml: Falling back to rsync.
[INFO] rsyncing from rsync://rpki.afrinic.net/repository/.

Then, rsyncing the contents works just fine; objects are fetched and validated. Some objects fail validation with "certificate is not yet valid.", "certificate has been revoked." and "Object not found." but that appears unrelated to the connectivity issues they’re facing.
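
If you want to check a saved log for the same fallback, a small Python sketch like the one below could do it. It assumes the log lines look exactly like the excerpt above (`[LEVEL] RRDP <url>: <message>`); other Routinator versions may format their log differently, and `rrdp_fallbacks` is a name made up for this example.

```python
import re

# Matches lines such as:
#   [INFO] RRDP https://rrdp.afrinic.net/notification.xml: Falling back to rsync.
# The layout is taken from the excerpt above and is an assumption for other versions.
LINE = re.compile(r"^\[(?P<level>[A-Z]+)\] RRDP (?P<url>\S+): (?P<msg>.*)$")


def rrdp_fallbacks(lines):
    """Return the RRDP notification URLs for which a fallback to rsync was logged."""
    urls = []
    for line in lines:
        m = LINE.match(line.strip())
        if m and m.group("msg").startswith("Falling back to rsync"):
            if m.group("url") not in urls:
                urls.append(m.group("url"))
    return urls


sample = [
    "[WARN] RRDP https://rrdp.afrinic.net/notification.xml: Getting notification file failed with status 204 No Content",
    "[INFO] RRDP https://rrdp.afrinic.net/notification.xml: Update failed and current copy is expired since 2023-05-30 10:43:44 UTC.",
    "[INFO] RRDP https://rrdp.afrinic.net/notification.xml: Falling back to rsync.",
    "[INFO] rsyncing from rsync://rpki.afrinic.net/repository/.",
]

print(rrdp_fallbacks(sample))  # ['https://rrdp.afrinic.net/notification.xml']
```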

I end up with the following totals:

Summary at 2023-06-14 13:43:24.366013 UTC
afrinic: ROAs: 5756 verified;
           VRPs: 7121 verified, 6820 final;
   router certs: 0 verified;
    router keys: 0 verified, 0 final.
          ASPAs: 0 verified, 0 final.

If you want some logs to compare, you can have a look here:
https://routinator.do.nlnetlabs.nl/log

It all still works without any extra configuration in Routinator.

Cheers,

Alex

Hi Alex, All,

> Because of the issues that AfriNIC is facing, they are forcing all traffic from HTTPS to rsync, so you should check if rsync can properly set up outbound connections from your machine. What's the output you get when you rsync rsync://rpki.afrinic.net/repository/ ?

drwxr-xr-x 4,096 2023/06/14 12:04:28 .
-rw-r--r-- 496 2020/04/08 19:58:03 AfriNIC-simple.tal
-rw-r--r-- 1,216 2020/03/30 13:00:32 AfriNIC.cer
drwxr-xr-x 4,096 2023/06/09 13:50:13 04E8B0D80F4D11E0B657D8931367AE7D
drwxr-xr-x 32,768 2023/06/14 12:04:28 afrinic
drwxr-xr-x 4,096 2023/06/14 01:05:30 apnic
drwxr-xr-x 8,192 2023/06/14 11:42:38 arin
drwxr-xr-x 120 2023/06/14 01:15:32 lacnic
drwxr-xr-x 16,384 2023/06/14 12:04:01 member_repository
drwxr-xr-x 4,096 2023/06/14 01:20:30 ripe

Seems to be working...

> I do an interactive Routinator validation run with debug logging enabled, like so:
>
> $ routinator -vv vrps -f summary
>
> Then I see the following in the logs:
>
> [WARN] RRDP https://rrdp.afrinic.net/notification.xml: Getting notification file failed with status 204 No Content
> [INFO] RRDP https://rrdp.afrinic.net/notification.xml: Update failed and current copy is expired since 2023-05-30 10:43:44 UTC.
> [INFO] RRDP https://rrdp.afrinic.net/notification.xml: Falling back to rsync.
> [INFO] rsyncing from rsync://rpki.afrinic.net/repository/.

Found valid trust anchor https://rpki.afrinic.net/repository/AfriNIC.cer. Processing.
RRDP https://rrdp.afrinic.net/notification.xml: Updating server
RRDP https://rrdp.afrinic.net/notification.xml: malformed XML
rsync://rpki.afrinic.net/repository/04E8B0D80F4D11E0B657D8931367AE7D/62gPOPXWxxu0sQa4vQZYUBLaMbY.mft: failed to validate
CA for rsync://rpki.afrinic.net/repository/04E8B0D80F4D11E0B657D8931367AE7D/ rejected, resources marked as unsafe:
    0.0.0.0/0
    ::/0
    AS0-AS4294967295

> Then, rsyncing the contents works just fine; objects are fetched and validated. Some objects fail validation with "certificate is not yet valid.", "certificate has been revoked." and "Object not found." but that appears unrelated to the connectivity issues they're facing.
>
> I end up with the following totals:
>
> Summary at 2023-06-14 13:43:24.366013 UTC
> afrinic: ROAs: 5756 verified;
>            VRPs: 7121 verified, 6820 final;
>    router certs: 0 verified;
>     router keys: 0 verified, 0 final.
>           ASPAs: 0 verified, 0 final.

Where do you see this?
Command output?

Summary at 2023-06-14 14:11:34.413948850 UTC
ripe: 39230 verified ROAs, 212122 verified VRPs, 6 unsafe VRPs, 212117 final VRPs.
apnic: 24878 verified ROAs, 111967 verified VRPs, 0 unsafe VRPs, 111699 final VRPs.
arin: 64077 verified ROAs, 79176 verified VRPs, 0 unsafe VRPs, 78064 final VRPs.
lacnic: 17966 verified ROAs, 32624 verified VRPs, 5 unsafe VRPs, 31033 final VRPs.
afrinic: 0 verified ROAs, 0 verified VRPs, 0 unsafe VRPs, 0 final VRPs.
total: 146151 verified ROAs, 435889 verified VRPs, 11 unsafe VRPs, 432913 final VRPs.
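
A summary in this layout can also be checked automatically for a trust anchor that has dropped to zero. The Python sketch below parses the per-TA lines and flags any entry with no verified ROAs; it assumes the exact line format shown above (which newer Routinator versions no longer use), and `parse_summary` is a name invented for this illustration.

```python
import re

# One line per trust anchor, in the layout of the summary above (an assumption
# for any other Routinator version).
ROW = re.compile(
    r"^(?P<ta>\w+): (?P<roas>\d+) verified ROAs, (?P<vrps>\d+) verified VRPs, "
    r"(?P<unsafe>\d+) unsafe VRPs, (?P<final>\d+) final VRPs\.$"
)


def parse_summary(text):
    """Map each trust anchor name (and 'total') to its four counters."""
    rows = {}
    for line in text.splitlines():
        m = ROW.match(line.strip())
        if m:
            rows[m.group("ta")] = {k: int(m.group(k)) for k in ("roas", "vrps", "unsafe", "final")}
    return rows


summary = """\
ripe: 39230 verified ROAs, 212122 verified VRPs, 6 unsafe VRPs, 212117 final VRPs.
apnic: 24878 verified ROAs, 111967 verified VRPs, 0 unsafe VRPs, 111699 final VRPs.
arin: 64077 verified ROAs, 79176 verified VRPs, 0 unsafe VRPs, 78064 final VRPs.
lacnic: 17966 verified ROAs, 32624 verified VRPs, 5 unsafe VRPs, 31033 final VRPs.
afrinic: 0 verified ROAs, 0 verified VRPs, 0 unsafe VRPs, 0 final VRPs.
total: 146151 verified ROAs, 435889 verified VRPs, 11 unsafe VRPs, 432913 final VRPs.
"""

rows = parse_summary(summary)
total = rows.pop("total")

# Trust anchors that produced no verified ROAs at all deserve a closer look.
empty = [ta for ta, counts in rows.items() if counts["roas"] == 0]
print("Trust anchors with zero verified ROAs:", empty)  # ['afrinic']

# Sanity check: the per-TA ROA counts add up to the reported total.
roa_sum = sum(counts["roas"] for counts in rows.values())
print("ROA sum matches total:", roa_sum == total["roas"])  # True
```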

> If you want some logs to compare, you can have a look here:
> https://routinator.do.nlnetlabs.nl/log

Thanks.

> It all still works without any extra configuration in Routinator.

Well, for me it's still not really working yet... :-)

Thanks anyway.

Cheers,
Carlos

Greetings,

My issue seems to be solved.

It seems the Afrinic glitch is incompatible with the version of Routinator I was using. So I updated to the latest version (0.12.1), and now I can get Afrinic's ROAs again :-)

Thanks Alex and Cedrick!

Best Regards,
Carlos

Hi Carlos,

Happy to hear everything is working fine with the latest version of Routinator.

A lot of work has been put into making fetching and validating RPKI data more robust since the (over two-year-old) version of Routinator that you were running.

I want to make an important point for the entire NANOG community:

As developers and operators, we’re still learning a lot about RPKI as it grows and evolves in the real world. Maintainers of relying party software [1] are actively adapting and improving their software every day to accommodate this.

This is security software. Please keep it updated.

Cheers,

Alex

[1] Software Projects — RPKI documentation