Understanding Routing-Induced Censorship Changes Globally

Abhishek Bhaskar, Paul Pearce

TL;DR

This work addresses the variability in global outside-in censorship measurements by analyzing how Equal-cost Multi-path (ECMP) routing and Flow-ID-related source parameters affect observed censorship across DNS, HTTP, and HTTPS. It introduces Monocle, a route-stable measurement platform, and shows that ECMP-induced changes occur in 17 of 21 countries, affecting a substantial fraction of IPs and ASes and varying by protocol. The study reveals multiple routing-infrastructure causes for observed differences and contextualizes prior censorship research within this routing-aware framework, providing guidance to improve repeatability and interpretation. Practically, the findings urge cross-protocol measurements with diverse source parameters and repeated trials to obtain robust assessments of Internet censorship and to better understand prior study results.

Abstract

Internet censorship is pervasive, with significant effort dedicated to understanding what is censored, and where. Prior censorship work, however, has identified significant inconsistencies in its results; experiments show unexplained non-determinism thought to be caused by censor load, end-host geographic diversity, or incomplete censorship -- inconsistencies which impede reliable, repeatable, and correct understanding of global censorship. In this work we investigate the extent to which Equal-cost Multi-path (ECMP) routing is the cause of these inconsistencies, developing methods to measure and compensate for them. We find ECMP routing significantly changes observed censorship across protocols, censor mechanisms, and in 17 countries. We identify that previously observed non-determinism or regional variations are attributable to measurements between fixed end-hosts taking different routes based on Flow-ID; i.e., the choice of intra-subnet source IP or ephemeral source port leads to differences in observed censorship. To achieve this we develop new route-stable censorship measurement methods that allow consistent measurement of DNS, HTTP, and HTTPS censorship. We find ECMP routing yields censorship changes across 42% of IPs and 51% of ASes, but that the impact is not uniform. We identify numerous causes of the behavior, ranging from likely failed infrastructure to routes to the same end-host taking geographically diverse paths which experience differences in censorship en route. Finally, we explore our results in the context of prior global measurement studies, exploring first the applicability of our findings to prior observed variations, and then demonstrating how specific experiments from two studies could be impacted by, and specific results are explainable by, ECMP routing. Our work points to methods for improving future studies, reducing inconsistencies and increasing repeatability.
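
To make the Flow-ID dependence concrete, the following is a minimal illustrative sketch of how an ECMP router might map a flow's 5-tuple onto one of several equal-cost paths. It is not the authors' Monocle code. The function names (`ecmp_path_index`, `paths_seen`), the use of SHA-256 as the hash, and the example addresses (taken from documentation-reserved ranges) are all assumptions made for illustration; real routers use vendor-specific hash functions and may hash different header fields.

```python
import hashlib
from collections import Counter

def ecmp_path_index(src_ip, src_port, dst_ip, dst_port, proto, n_paths):
    """Toy ECMP selection: hash the 5-tuple (the Flow-ID) onto one of n_paths.

    SHA-256 is a stand-in here; actual router hash functions are
    vendor-specific and generally not public.
    """
    key = f"{src_ip}|{src_port}|{dst_ip}|{dst_port}|{proto}".encode()
    digest = hashlib.sha256(key).digest()
    return int.from_bytes(digest[:4], "big") % n_paths

def paths_seen(src_ips, src_ports, dst_ip, dst_port, proto, n_paths):
    """Count how many flows land on each path for the given source parameters."""
    return Counter(
        ecmp_path_index(ip, port, dst_ip, dst_port, proto, n_paths)
        for ip in src_ips
        for port in src_ports
    )

# Hypothetical measurement: one destination, four equal-cost paths.
DST_IP, DST_PORT, PROTO, N_PATHS = "198.51.100.7", 443, "tcp", 4

# Fixed source parameters: every probe follows the same path.
fixed = paths_seen(["192.0.2.10"], [40000], DST_IP, DST_PORT, PROTO, N_PATHS)

# Varied ephemeral source ports: probes spread over several paths.
varied = paths_seen(["192.0.2.10"], range(40000, 40064), DST_IP, DST_PORT, PROTO, N_PATHS)

print("paths used with fixed source parameters:", sorted(fixed))
print("paths used with varied source ports:", sorted(varied))
```

If only some of those paths traverse a censoring device, two probes between the same pair of end-hosts can observe different outcomes solely because their source ports differ. Pinning the source IP and source port for the duration of an experiment is one way to hold the route steady in this toy model.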

Paper Structure (30 sections, 10 figures, 2 tables)

Figures (10)

  • Figure 1: Monocle, our system to understand impact of routing on censorship measurement across countries and protocols.
  • Figure 2: Normalized distribution of the number of paths for all destinations. Marker size represents the number of experiments across all destinations that had the particular number of paths. We observe that the variation in paths has two observable modes, constant parameters vs. varied parameters, and that varying source IP has a greater impact on network path variation than varying source port. We also observe different modes within a particular country (e.g., Iran), potentially indicating destination-port-specific load balancing.
  • Figure 3: CDF of percent of (source IP, source port) combinations per destination for which we observed no censorship. Measurements are limited to the subset of destinations where variation is observed in Table \ref{tab:rq2}. We observe that 1) the percent of source parameters that produce no censorship for the median affected destination varies significantly by country and protocol, 2) HTTP and HTTPS follow very similar trends in how source parameters affect their results in some countries (BD, BY, IN, & ID), while in others they vary significantly (RU, KW), and 3) DNS is affected least by source parameters.
  • Figure 4: Influence of Source IPs and/or Ports on Changes in Censorship. Sources and destinations are sampled uniformly across the lowest 3 bits, and colored based on those bits for either source or destination IPs. X-Axis sorted.
  • Figure 5: Network graphs of censored routes (left) and non-censored routes (right). All nodes are shown in both; only the routes change. Blue nodes were found only in paths that caused censorship, black nodes only in paths that had no censorship, and yellow nodes in both; censoring edges (if found) are red. We observe notable structural differences between cases.
  • ...and 5 more figures