Reverse Traceroute

Rolf_Winter · February 22, 2023, 12:41pm

Dear NANOG folks,

As you know, traceroute is unable to enumerate routers on the reverse path. Given that paths through the public internet are usually asymmetric, knowing the reverse path would be beneficial e.g. for troubleshooting purposes (Troubleshooting with Traceroute - YouTube).

We have implemented a reverse traceroute tool (GitHub - HSAnet/reverse-traceroute: An implementation of reverse traceroute), both client and server for both IPv4 and IPv6. We are also in the process of specifying the protocol at the IETF (draft-heiwin-intarea-reverse-traceroute-01).

We also gave a talk on reverse traceroute at DENOG14 (DENOG14 There AND back - designing reverse traceroute - YouTube).

If you would like to play with reverse traceroute, the easiest option is to work with the client and use one of the public server instances (reverse-traceroute/ENDPOINTS at main · HSAnet/reverse-traceroute · GitHub). If you would be willing to host a public server instance yourself, please reach out to us. Also, if you find this work useful, please start discussing the work at the IntArea WG at the IETF.

If you have any questions or comments, just drop us a line, file an issue on GitHub and/or use the IntArea mailing list.

Thanks a bunch,

Rolf

Christopher_Morrow · February 22, 2023, 5:19pm

Didn't Ethan's project:
https://www.measurementlab.net/publications/reverse-traceroute.pdf

end with usable code/etc?

Rolf_Winter · February 22, 2023, 6:19pm

Hi Christopher,

I cannot/shouldn't really answer, since this is somebody else's work. The latest publication on that body of work can be found here:

Published at IMC 22 last October.

I believe a demo is actually online here: https://revtr.ccs.neu.edu/

That piece of work and ours differ in a number of ways. Whereas the work you cite is really an external system that lets you perform a reverse traceroute through said system, we have implemented something that works just like traceroute does today, but for the reverse direction. I.e., it works from your terminal, performing a traceroute back to you. The system you mention has an accuracy of about 92% at the AS-level. Since we perform the actual measurement between two endpoints, we identify the actual forwarding path at the router level, including load-balanced paths.

But we use ICMP and would need code points to move forward. So if you find this useful, discussion on the IntArea mailing list would be appreciated.

Best,

Rolf

Tore_Anderson1 · February 25, 2023, 10:09am

* Rolf Winter

> If you would like to play with reverse traceroute, the easiest option
> is to work with the client and use one of the public server instances
> (reverse-traceroute/ENDPOINTS at main · HSAnet/reverse-traceroute · GitHub).
> If you would be willing to host a public server instance yourself,
> please reach out to us.

I suggest you get in touch with the fine folks at NLNOG RING and ask if
they would be interested in setting this up on the 600+ RING nodes all
over the world. See https://ring.nlnog.net/.

Tore

Rolf_Winter · February 25, 2023, 1:37pm

Hi Tore,

thanks for the suggestion. We are already in touch with the NLNOG Ring folks. They are really helpful! But, the more the better.

Also, for people playing with the client, it would be helpful to us if you use the --transmit command line switch. This will send information about the traceroute operation to us for further analysis.

Additionally, the endpoint "playground.net...." is currently used for some variations of reverse traceroute, so some measurements might not work currently. You can just use any of the other endpoints.

Best,

Rolf

Hugo_Slabbert1 · February 25, 2023, 7:19pm

Is there a possible reflection & amplification vector here?

The client sends a reverse traceroute request to the server. This has a 12-byte ICMP header as indicated in 3.1
The server responds to the client with a traceroute response. This has a 12-byte ICMP header as indicated in 3.2, but also a traceroute payload of 24 bytes as indicated in 3.3

So the total response from server to client has at least +24 bytes beyond the original client request? And a spoofed source address on a reverse traceroute request would then direct the reverse traceroute response to the spoofed victim?

+24 bytes is not a huge amount in terms of amplification, but if this is accurate, is that perhaps worth calling out in the security considerations?

Actually: Would there not also be a slight additional bit of traffic to the spoofed address, in that the actual traceroute probe itself, that is sent from the reverse traceroute server, is also directed towards the spoofed source IP address? The last probe in the series, that has a TTL equal to the distance between the reverse traceroute server and the probe target, would reach the target, but additional probes (with TTL shorter than the distance from server to target) would still be flung from the server across intermediate hops.

E.g. if I spoof a client address that is 15 hops away from the reverse traceroute server, then my single reverse traceroute request would result in:

15 probes initiated from the reverse traceroute server toward the spoofed target (with each probe progressing one hop closer to the target)
one reverse traceroute response that is +24 bytes from my original request, also directed toward the spoofed target

Am I understanding the structure correctly there?

Hugo_Slabbert1 · February 25, 2023, 8:00pm

Ah, apologies, I misunderstood:

One reverse traceroute request => one probe + one reverse traceroute response.

So it is slightly additive, but does not multiply out to the distance between the reverse traceroute server and the target.
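The corrected accounting above can be put into numbers. The following is a minimal sketch, not a measurement: the 12-byte header and 24-byte payload are the figures quoted from the draft earlier in this thread, while the 20-byte IPv4 header (no options) and the 28-byte probe (an empty UDP datagram) are assumptions chosen purely for illustration.

```python
IPV4_HEADER = 20          # assumption: IPv4 without options
REQUEST_ICMP = 12         # from the thread: request header size
RESPONSE_ICMP = 12 + 24   # from the thread: response header plus payload
PROBE_ON_WIRE = 28        # assumption: 20-byte IPv4 header + 8-byte UDP header, no payload


def bytes_toward_spoofed_address(probe_on_wire=PROBE_ON_WIRE):
    """Upper bound on bytes the server emits toward the claimed source
    for one request: one response plus one probe."""
    response_on_wire = IPV4_HEADER + RESPONSE_ICMP
    return response_on_wire + probe_on_wire


def amplification_factor(probe_on_wire=PROBE_ON_WIRE):
    """Ratio of bytes sent toward the claimed source to bytes in the request."""
    request_on_wire = IPV4_HEADER + REQUEST_ICMP
    return bytes_toward_spoofed_address(probe_on_wire) / request_on_wire


print(bytes_toward_spoofed_address())  # 84
print(amplification_factor())          # 2.625
```

Under these assumptions a 32-byte request causes at most 84 bytes to be sent toward the claimed source address. A larger probe raises the ratio accordingly, and a probe whose TTL expires on the way never reaches that address at all.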

Rolf_Winter · February 26, 2023, 9:38am

Hi Hugo,

correct. It is not so bad. But you are still raising a valid point and we have been pondering over this and indeed this is one of the reasons why we have posted our work here.

I think, if you want to mount an amplification attack, you would be way better off using DNS. A response and probe, as you said, is a little more than a request in terms of bytes on the wire. We could easily specify payload to be added to the request so that the request and the respective response and probe are equivalent in size. Would you, or anybody on this list, be worried about amplification given that it is only a little bit more? This would be really interesting input to us. Also, I think it would be worthwhile discussing over at the IETF.

Just as some additional information, we expect people to rate limit reverse traceroute, which our implementation already allows.
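For readers who want a picture of what server-side rate limiting could look like, here is a minimal token-bucket sketch keyed by the claimed source address. This is a commonly used technique swapped in for illustration; the thread does not describe how the actual implementation limits requests, and the class names and parameters below are hypothetical.

```python
import time


class TokenBucket:
    """Allow `rate` requests per second on average, with bursts up to `burst`."""

    def __init__(self, rate, burst, now=None):
        self.rate = rate
        self.burst = burst
        self.tokens = float(burst)
        self.last = time.monotonic() if now is None else now

    def allow(self, now=None):
        now = time.monotonic() if now is None else now
        elapsed = max(0.0, now - self.last)
        # Refill according to elapsed time, never beyond the burst size.
        self.tokens = min(self.burst, self.tokens + elapsed * self.rate)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


class PerSourceLimiter:
    """One bucket per claimed source address (hypothetical design)."""

    def __init__(self, rate, burst):
        self.rate = rate
        self.burst = burst
        self.buckets = {}

    def allow(self, source_ip, now=None):
        bucket = self.buckets.get(source_ip)
        if bucket is None:
            bucket = self.buckets[source_ip] = TokenBucket(self.rate, self.burst, now)
        return bucket.allow(now)
```

Keying the limiter by the claimed source caps how much traffic any single address, spoofed or not, can be sent. A production version would also need to bound the size of the bucket table.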

Best,

Rolf

Mailman · February 27, 2023, 12:35am

Similarly you might reach out to RIPE and inquire if they are interested in adding this functionality to their Atlas Probes et al.

Ethan_Katz-Bassett · February 27, 2023, 4:51am

Chris, thanks for mentioning me/our project! Rolf, thanks for pointing to our recent 2nd reverse traceroute paper!

Our recent paper addressed what we saw as the major limitations of my original 1st reverse traceroute paper that Chris linked (accuracy and scalability). We intend our system to be an open tool for the community, and we are currently testing it with outside users. It can potentially measure paths to you from any responsive host on the Internet, without requiring access/changes/new support at the host or routers along the path (more details below). If you want to try out our tool during this testing phase, please email us at revtr@ccs.neu.edu

I’ll describe our project a bit, including some of the similarities and differences between our project and Rolf’s. I’ll call ours revtr-2.0, since that’s what we call it in the paper, and I’ll call Rolf’s revtr-lg, since it is somewhat akin to a Looking Glass server.

The goal in both is the same: the user u wants to measure the path (IP addresses of routers, RTTs per hop) from a remote host h to u, without direct control of h.

revtr-2.0’s approach relies on the rarely used (but actually widely supported) IP Record Route option, coupled with some measurement tricks. In my understanding, revtr-lg proposes to add a new ICMP type.
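As background on the Record Route option mentioned above, the sketch below builds and parses the option bytes as laid out in RFC 791 (option type 7, a length octet, a pointer octet starting at 4, then 4-byte address slots). It only manipulates bytes and sends nothing; the function names are hypothetical and this is not revtr-2.0 code.

```python
import ipaddress

RR_TYPE = 7        # Record Route option type (RFC 791)
RR_MAX_SLOTS = 9   # 40 bytes of IPv4 options leave room for 3 + 9 * 4 = 39 bytes


def build_record_route_option(slots=RR_MAX_SLOTS):
    """Return an empty Record Route option, padded to a 4-byte boundary."""
    if not 1 <= slots <= RR_MAX_SLOTS:
        raise ValueError("slots must be between 1 and 9")
    length = 3 + 4 * slots
    option = bytes([RR_TYPE, length, 4]) + bytes(4 * slots)
    padding = (-len(option)) % 4
    return option + bytes(padding)  # zero bytes act as End of Option List padding


def parse_recorded_addresses(option):
    """Return the addresses recorded so far, in the order routers added them."""
    length, pointer = option[1], option[2]
    # The pointer is 1-based and marks the first byte of the next free slot.
    end = min(pointer - 1, length)
    return [str(ipaddress.IPv4Address(option[i:i + 4])) for i in range(3, end, 4)]


empty = build_record_route_option()
print(len(empty), parse_recorded_addresses(empty))  # 40 []
```

The nine-slot ceiling follows from the 60-byte maximum IPv4 header: once the slots are full, routers further along the path can no longer record their addresses.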

Both approaches rely on a set of distributed vantage points running an implementation of their particular reverse traceroute code, but the way they use the vantage points is very different, and that leads to the main differences between the approaches. Our current tool has vantage points at 150 sites around the world, so I’ll use that number as an example for discussion.

COVERAGE

revtr-lg allows a user u to contact a vantage point to request a traceroute from the vantage point to u, so it measures from sites that have opted to run the revtr-lg software. So, with 150 sites, revtr-lg would be able to measure 150 paths to u.
revtr-2.0 uses the vantage points to issue various measurements that combine to measure the route to u from any host h the user requests – h need not be part of the system and need not run any special software. We were able to use revtr-2.0 and its 150 vantage points to measure paths from hosts in 39,544 ASes. (According to APNIC estimates, these ASes host 92.6% of Internet users).
If you want to use revtr-2.0 to measure paths to you (from whichever hosts you request), you need to run our client code on a public IP address. It will contact our system and use our vantage points to measure routes to you. If you want to try out our tool, please email us at revtr@ccs.neu.edu
Our vantage points currently support running a few tens of millions of reverse traceroutes per day. We hope to improve the scalability/throughput going forward.

Just as some hosts are configured not to respond to ping, some hosts do not respond to our measurements. We found that 75% of hosts that respond to ping also respond to our measurements. Of responsive hosts, 63% are within range of our current vantage points. Going forward, we would be able to measure from more than the 39,544 ASes if we add more vantage points in strategic locations where we currently lack coverage and/or (perhaps if our system gains traction) if more operators configure their routers to respond.

ACCURACY

I think both approaches are similarly accurate. Rolf mentions that revtr-lg “perform[s] the actual measurement between two endpoints, we identify the actual forwarding path, at the router-level, including load-balanced paths.” revtr-2.0 also measures the actual forwarding path between the two endpoints, at the router/IP-level, including the ability to uncover the multiple branches of load balanced paths.

In only 1.5% of cases, revtr-2.0 returned a path that did not agree with a normal traceroute issued from the remote host (in a controlled experiment where we had access to the remote host but did not give revtr-2.0 access). It could be a path change between the two measurements, or an anomaly that impacted either traditional traceroute or our tool. So in the other 98.5% of cases, our tool was accurate. Rolf mentioned that revtr-2.0 has an accuracy of 92% at the AS-level. What we meant by that is that 92.3% of our measurements had all of the ASes on them that a traceroute issued from the remote host had. In an additional 6.1% of cases, revtr-2.0 missed a single AS that was unresponsive – like a “*” in traditional traceroute. In only the remaining 1.5% of cases did the two measurements actually have discrepancies.

We expect that our tool is similarly accurate at the IP and router-level; it’s just harder to give exact numbers because tools can return different IP addresses that correspond to the same router, and so we did the comparison at AS level.

Best,

Ethan (and Kevin Vermeulen, Dave Choffnes, and Italo Cunha)

Rolf_Winter · February 27, 2023, 7:47am

Before "revtr-lg" sticks, in the grand tradition of Paris Traceroute and Tokyo Ping we call our version of traceroute "Augsburg Traceroute". And it does not resemble a looking glass server. There are two parties involved in a reverse traceroute operation (not counting the routers that reply to an expired packet with an ICMP Time Exceeded) and those are the two hosts at the end of the path in question. Just like ping or traceroute today. There is no external system or particular server instance required. What it means though is that, if we want to have this kind of functionality for the public internet, we need to go through the IETF standardization process (or have the server side run at the application layer, which we would like to avoid). That might take some time and getting this into operating system code might take some time, too, but we believe it is worth doing.

The reason why we would like to have this on NLNOG Ring is not that we need the servers to make the traceroute result better, but to give people a lot more choice for public instances against which they can run a reverse traceroute. If you have infrastructure and would like to make use of reverse traceroute today, you can. You can just use our code. For the server, the only hard requirement is a Linux kernel version 5.15 or above.
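A host can be checked against the kernel requirement before trying the server. This is a minimal sketch using only the Python standard library; the function names are hypothetical and not part of the reverse traceroute code base.

```python
import platform
import re

MIN_KERNEL = (5, 15)  # from the thread: Linux 5.15 or above


def kernel_version(release):
    """Extract (major, minor) from a release string such as '5.15.0-91-generic'."""
    match = re.match(r"(\d+)\.(\d+)", release)
    if match is None:
        raise ValueError(f"unrecognised kernel release: {release!r}")
    return int(match.group(1)), int(match.group(2))


def meets_server_requirement(release=None, system=None):
    """True if the host runs Linux with a kernel of at least MIN_KERNEL."""
    system = platform.system() if system is None else system
    release = platform.release() if release is None else release
    return system == "Linux" and kernel_version(release) >= MIN_KERNEL


print(meets_server_requirement())
```

Passing `release` and `system` explicitly makes it possible to check a value copied from another machine's `uname -r` output.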

So from my perspective, the main differences are:

1. Who: An external system that attempts to measure a path for you (revtr-2.0) or two hosts performing a traceroute between each other (Augsburg Traceroute).

2. How: Used techniques such as source address spoofing and record route options (amongst others, revtr-2.0) vs. a new ICMP message to trigger a traceroute probe and convey the results (Augsburg Traceroute).

3. What: Can create a map of the paths through the internet and do on-demand traceroutes (revtr-2.0) vs. only useful between a host you control (issuing the traceroute) and a host on the internet willing to perform the operation (Augsburg Traceroute).

A lot of other differences are a result of the above such as overhead, accuracy, policability, deployability etc.

Just as a final remark, the reason why I said that we can accurately measure the router-level path (including load-balanced paths) is that we basically perform a form of Paris Traceroute between two hosts from those actual hosts at the end of the path in question.
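Why the Paris Traceroute approach matters for load-balanced paths can be illustrated with a toy model. The sketch below assumes a per-flow load balancer that hashes the flow identifier to choose a next hop; the hash function, router names, and addresses are all hypothetical, and real devices use their own hashing schemes.

```python
import hashlib

# Hypothetical next hops behind one per-flow load balancer.
BRANCHES = ["router-A", "router-B", "router-C"]


def pick_branch(flow, branches=BRANCHES):
    """Model a per-flow load balancer: hash the flow identifier, pick a next hop."""
    digest = hashlib.sha256(repr(flow).encode()).digest()
    return branches[int.from_bytes(digest[:4], "big") % len(branches)]


def classic_flows(probes, base_port=33434):
    """Classic UDP traceroute: the destination port changes with every probe."""
    return [("192.0.2.10", "198.51.100.20", "udp", 50000, base_port + i)
            for i in range(probes)]


def paris_flows(probes, port=33434):
    """Paris Traceroute: the flow identifier stays fixed across all probes."""
    return [("192.0.2.10", "198.51.100.20", "udp", 50000, port)] * probes


classic_branches = {pick_branch(flow) for flow in classic_flows(16)}
paris_branches = {pick_branch(flow) for flow in paris_flows(16)}
print("classic probes hit:", sorted(classic_branches))
print("paris probes hit:  ", sorted(paris_branches))
```

In this model the fixed flow identifier always lands on a single branch, so consecutive hops belong to one real path. Enumerating the other branches then becomes a matter of repeating the trace with a different, but again fixed, flow identifier.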

Best,

Rolf

Rolf_Winter · February 27, 2023, 8:13am

RIPE Atlas is a bit "different" in that you need credits to trigger something on Atlas. And Atlas already implements traceroute, incl. Paris Traceroute. That means, in fact (if you have credits) you can already reverse traceroute from an Atlas Probe to yourself (and other places on the internet).

But, you are raising an interesting point, which we have thought about but dismissed. But feedback from the operational community on this would be valuable. Our reverse traceroute currently restricts the server to trace back to the issuing client. We did this for security reasons. The question was "why should anybody on the internet be able to do a traceroute from my server to a destination of choice?". Lifting this restriction would allow a functionality similar to "Down for Everyone or Just Me" (https://downforeveryoneorjustme.com/). But, somebody might use your server for this. How do people feel about this? Restrict the reverse traceroute operation to be done back to the source or allow it more freely to go anywhere?

Best,

Rolf

Saku_Ytti1 · February 27, 2023, 1:36pm

What are the pros and cons of this? Let's call it destination TLV.

If I am someone who wants to do a volumetric attack, I won't set any
destination TLV, because without destination TLV and by spoofing my
source, I get more leverage. If my source and destination TLV differ,
then I have less leverage. So in this sense, it adds no security
implications, but adds a massive amount of diagnostic power, as one
very common request is to ask traceroute between nodes you have no
access to.

What it would allow is port knocking the ports used through proxy, if
this matters or not might be debatable.

Perhaps the standard should consider some abilities to be default on,
and others default off, and let the operator decide if they want to
turn some default off abilities on, such as honoring destination TLV.
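The default-on/default-off idea can be sketched as a small server policy. Everything here is hypothetical: "destination TLV" is the working name used in this post rather than a field in the draft, and the class, field, and function names are invented for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ServerPolicy:
    trace_to_source: bool = True         # default on: trace back to the requester
    honor_destination_tlv: bool = False  # default off: operator must opt in


def select_target(source: str,
                  destination_tlv: Optional[str],
                  policy: ServerPolicy = ServerPolicy()) -> Optional[str]:
    """Return the address to trace toward, or None if the policy refuses."""
    if destination_tlv is None or destination_tlv == source:
        return source if policy.trace_to_source else None
    return destination_tlv if policy.honor_destination_tlv else None


print(select_target("192.0.2.1", None))            # 192.0.2.1
print(select_target("192.0.2.1", "198.51.100.7"))  # None
```

An operator who wants the extra diagnostic power would construct `ServerPolicy(honor_destination_tlv=True)`; everyone else keeps the conservative behaviour without touching any configuration.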

Mailman · February 27, 2023, 8:29pm

> But feedback from the operational community on this would be valuable. Our reverse traceroute currently restricts the server to trace back to the issuing client. We did this for security reasons.

I understand the motivation for your team's caution / security posture.

> The question was "why should anybody on the internet be able to do a traceroute from my server to a destination of choice?".

How many times have we been out and about in our daily lives and received a text / phone call that prompted us to initiate a diagnostic between two locations other than where we were at or where our traffic appeared to originate from?

> Lifting this restriction would allow a functionality similar to "https://downforeveryoneorjustme.com/". But, somebody might use your server for this. How do people feel about this? Restrict the reverse traceroute operation to be done back to the source or allow it more freely to go anywhere?

I'm already trusting the RIPE team and their security measures for the Atlas probe that's in my network. I'm okay continuing to rely on them to monitor and react to this if it becomes a problem.

Perhaps the RIPE team could make a test to an arbitrary destination (considerably ~> 10 x) more expensive (in credits) than to the destination that you're initiating from.

Just my 2¢.

Hugo_Slabbert1 · February 28, 2023, 7:05pm

> It is not so bad. But you are still raising a valid point and we have been pondering over this and indeed this is one of the reasons why we have posted our work here.
>
> I think, if you want to mount an amplification attack, you would be way better off using DNS

Agreed; it’s a very low level of amplification, and there are lower hanging fruit. I do think it’s worthwhile to still indicate, though, in a similar vein overall in the security considerations re: rate limiting server side.

> We could easily specify payload to be added to the request so that the request and the respective response and probe are equivalent in size.

I’m still a fan of the notion that, for a connectionless protocol, the response size should not exceed the request size, to eliminate that risk entirely. I think that does become a bit much in this case, though, if we have to factor in the spoofing scenario, as you have:

1. client: traceroute request → server
2. server: probe → client
3. client: probe response → server (may be missing)
4. server: traceroute response → client

The client gets both the traceroute response (4) as well as the traceroute probe (2). If padding were to be required in the traceroute request, it would need to account for both the +24 bytes delta between the traceroute request and response probes, as well as for the actual probe size, including its L3 headers. imho that starts to balloon the traceroute request a good bit, and also feels clunky that we’re now coupling things such that the traceroute request also needs to calculate or be aware of the size of the actual probe in order to craft its traceroute request.
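The padding arithmetic in the preceding paragraph can be sketched as follows. The 24-byte delta and the 12-byte header are the figures quoted in this thread; the 20-byte IP header and the example probe size of 28 bytes are assumptions, and the function names are hypothetical.

```python
RESPONSE_DELTA = 24  # from the thread: response is 24 bytes larger than the request


def required_padding(probe_on_wire: int) -> int:
    """Padding the request needs so that it is no smaller than
    the traceroute response plus the probe sent toward the client."""
    return RESPONSE_DELTA + probe_on_wire


def is_non_amplifying(padding: int, probe_on_wire: int,
                      ip_header: int = 20, icmp_header: int = 12) -> bool:
    """Check the rule 'bytes toward the client never exceed bytes in the request'."""
    request = ip_header + icmp_header + padding
    toward_client = (ip_header + icmp_header + RESPONSE_DELTA) + probe_on_wire
    return request >= toward_client


print(required_padding(28))       # 52
print(is_non_amplifying(52, 28))  # True
print(is_non_amplifying(51, 28))  # False
```

The shared headers cancel out, so the padding is simply the 24-byte delta plus the full on-wire size of the probe. That dependence on the probe size is exactly the coupling described above.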

> Also, I think it would be worthwhile discussing over at the IETF.

Sure thing. Should I try to piggyback on an existing thread in Int-area or start a new thread there?

Rolf_Winter · March 1, 2023, 8:19am

>> It is not so bad. But you are still raising a valid point and we have been pondering over this and indeed this is one of the reasons why we have posted our work here.
>>
>> I think, if you want to mount an amplification attack, you would be way better off using DNS

> Agreed; it's a very low level of amplification, and there are lower hanging fruit. I do think it's worthwhile to still indicate, though, in a similar vein overall in the security considerations re: rate limiting server side.

We have that, plus a few other things in the security considerations section. But we probably should quantify the level of amplification, too.

>> We could easily specify payload to be added to the request so that the request and the respective response and probe are equivalent in size.

> I'm still a fan of the notion that, for a connectionless protocol, the response size should not exceed the request size, to eliminate that risk entirely. I think that does become a bit much in this case, though, if we have to factor in the spoofing scenario, as you have:
>
> 1. client: traceroute request -> server
> 2. server: probe -> client
> 3. client: probe response -> server (may be missing)
> 4. server: traceroute response -> client
>
> The client gets both the traceroute response (4) as well as the traceroute probe (2). If padding were to be required in the traceroute request, it would need to account for *both* the +24 bytes delta between the traceroute request and response probes, as well as for the actual probe size, including its L3 headers. imho that starts to balloon the traceroute request a good bit, and also feels clunky that we're now coupling things such that the traceroute request also needs to calculate or be aware of the size of the actual probe in order to craft its traceroute request.

Yes, that is the dilemma. To make things a bit more complicated, RFC 792 (the original ICMP RFC) specifies that the Time Exceeded Message contains the original IP header plus the next 64 bits following it. On today's internet, there actually might be a few more bits. One could make a conservative estimate and use that.
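The size of a Time Exceeded message can be estimated with a few lines. A minimal sketch: the 8-byte ICMP header and the "original IP header plus 64 bits" rule come from RFC 792, and the 576-byte ceiling is the limit RFC 1812 recommends for routers that quote more of the original datagram. The 20-byte header sizes assume IPv4 without options, and the function name is hypothetical.

```python
ICMP_HEADER = 8     # type, code, checksum, 4 unused bytes (RFC 792)
RFC1812_CAP = 576   # recommended maximum size of the ICMP error datagram


def time_exceeded_size(original_header=20, quoted_payload=8, outer_header=20):
    """On-wire size of an IPv4 Time Exceeded message quoting `quoted_payload`
    bytes of the expired datagram beyond its IP header."""
    size = outer_header + ICMP_HEADER + original_header + quoted_payload
    return min(size, RFC1812_CAP)


print(time_exceeded_size())                     # 56
print(time_exceeded_size(quoted_payload=1480))  # 576
```

The range between the 56-byte RFC 792 minimum and the 576-byte ceiling is what a conservative estimate would have to cover.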

I also got off-list feedback that basically said, "if you don't add this now, and people find a way to exploit it later, operators will block it and once it is blocked, it won't be unblocked again".

>> Also, I think it would be worthwhile discussing over at the IETF.

> Sure thing. Should I try to piggyback on an existing thread in Int-area or start a new thread there?

Please start a new thread. Thanks!

Rolf