A Survey of Internet Censorship and its Measurement: Methodology, Trends, and Challenges

Steffen Wendzel, Simon Volpert, Sebastian Zillien, Julia Lenz, Philip Rünz, Luca Caviglione

TL;DR

This survey comprehensively maps network-level Internet censorship and its measurement, extending prior work by incorporating modern protocols (e.g., IPv6, QUIC), censorship of circumvention tools, and links to information hiding. It presents a unified taxonomy of censor capabilities, catalogs measurement methodologies across IP, transport, and application layers, and evaluates available datasets and platforms. The paper also highlights evolving trends, including AI-driven analysis, domain fronting, and the growing importance of human and societal factors in censorship dynamics. By outlining technical challenges, datasets, and measurement limitations, it guides researchers toward more accurate, ethical, and interoperable censorship measurement practices with practical implications for researchers, policymakers, and measurement platforms.

Abstract

Internet censorship limits the access of nodes residing within a specific network environment to the public Internet, and vice versa. During the last decade, techniques for conducting Internet censorship have been developed further. Consequently, the methodology for measuring Internet censorship has been improved as well. In this paper, we first provide a survey of network-level Internet censorship techniques. Second, we survey censorship measurement methodology. We further cover the censorship of circumvention tools and its measurement, as well as available datasets. Where it is beneficial, we bridge the terminology and taxonomy of Internet censorship with related domains, namely traffic obfuscation and information hiding. We further extend the technical perspective with recent trends and challenges, including human aspects of Internet censorship.

Paper Structure

This paper contains 55 sections, 14 figures, and 5 tables.

Figures (14)

  • Figure 1: Paper selection methodology
  • Figure 2: A censor's fundamental influence on communication between nodes
  • Figure 3: High-level censorship system taxonomy
  • Figure 4: Man-on-the-Side (MotS) attack. The client (i) initiates an HTTP request to the legitimate server. The censor observes the traffic and (ii) injects a TCP segment containing a redirect to a malicious server. The client's original request may still reach the server, which will respond. However, the censor's injected response arrives earlier, so the server's response is considered a duplicate by the client and discarded. The client then connects (iii) to the malicious server, which (iv) provides censored content and/or delivers malware.
  • Figure 5: Multi-stage censorship. The first-stage censoring device (i) makes a high-level censorship decision (D) to allow the flow, block it, or forward it to a second-stage censoring device (ii) for in-depth analysis.
  • ...and 9 more figures