Measuring Internet Routing from the Most Valuable Points
Thomas Alfroy, Thomas Holterbach, Thomas Krenc, KC Claffy, Cristel Pelsser
TL;DR
The volume of data collected from thousands of BGP vantage points (VPs) in RIPE RIS and RouteViews grows quadratically, which makes analysis costly and leaves the archives highly redundant. The paper introduces MVP, a general redundancy-aware framework that quantifies how similar VP observations are and selects a subset of VPs that minimizes redundancy while preserving analytical utility across multiple routing analyses. For four canonical tasks (AS relationship inference, AS rank computation, hijack detection, and routing detour detection), MVP improves coverage, accuracy, or both while processing the same volume of data, and applying it to prior studies yields substantial improvements in their results. Deployed as bgproutes.io, MVP offers a scalable, data-driven approach to VP selection that can reduce processing costs and enable broader, more reliable Internet routing measurements, with potential applicability to other data-collection ecosystems.
Abstract
While the increasing number of Vantage Points (VPs) in RIPE RIS and RouteViews improves our understanding of the Internet, the quadratically increasing volume of collected data poses a challenge to the scientific and operational use of the data. The design and implementation of BGP and BGP data collection systems lead to data archives with enormous redundancy, as there is substantial overlap in announced routes across many different VPs. Researchers thus often resort to arbitrary sampling of the data, which we demonstrate comes at a cost to the accuracy and coverage of previous works. The continued growth of the Internet, and of these collection systems, exacerbates this cost. The community needs a better approach to managing and using these data archives. We propose MVP, a system that scores VPs according to their level of redundancy with other VPs, allowing more informed sampling of these data archives. Our challenge is that the degree of redundancy between two updates depends on how we define redundancy, which in turn depends on the analysis objective. Our key contribution is a general framework and associated algorithms to assess redundancy between VP observations. We quantify the benefit of our approach for four canonical BGP routing analyses: AS relationship inference, AS rank computation, hijack detection, and routing detour detection. MVP improves the coverage or accuracy (or both) of all these analyses while processing the same volume of data.
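The idea of scoring VPs by their redundancy with other VPs and then sampling accordingly can be illustrated with a small sketch. This is not the paper's algorithm: it assumes, purely for illustration, that redundancy between two VPs is the Jaccard similarity of the (prefix, AS path) pairs they observe, and it uses a simple greedy selection. The names `jaccard`, `select_vps`, and the VP labels are hypothetical.

```python
from typing import Dict, List, Set, Tuple

# A route as seen by a VP: (prefix, AS path). Illustrative representation only.
Route = Tuple[str, Tuple[int, ...]]


def jaccard(a: Set[Route], b: Set[Route]) -> float:
    """Similarity of two route sets; 1.0 means the sets are identical."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def select_vps(observations: Dict[str, Set[Route]], k: int) -> List[str]:
    """Greedily pick up to k VPs, each time adding the least redundant one.

    Assumption: a candidate's redundancy is its highest similarity to any
    VP that has already been selected.
    """
    remaining = sorted(observations)
    if not remaining or k <= 0:
        return []

    # Seed with the VP that observes the most distinct routes.
    first = max(remaining, key=lambda vp: len(observations[vp]))
    selected = [first]
    remaining.remove(first)

    while remaining and len(selected) < k:
        best = min(
            remaining,
            key=lambda vp: max(
                jaccard(observations[vp], observations[s]) for s in selected
            ),
        )
        selected.append(best)
        remaining.remove(best)
    return selected


if __name__ == "__main__":
    obs = {
        "vp_a": {("10.0.0.0/8", (1, 2, 3)), ("192.0.2.0/24", (1, 4))},
        "vp_b": {("10.0.0.0/8", (1, 2, 3)), ("192.0.2.0/24", (1, 4))},
        "vp_c": {("198.51.100.0/24", (7, 8))},
    }
    print(select_vps(obs, 2))
```

In this toy input, vp_b sees exactly what vp_a sees, so a budget of two VPs is better spent on vp_a and vp_c. As the abstract notes, the right definition of redundancy depends on the analysis objective, so the similarity function here is only a placeholder.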
