CensorLab: A Testbed for Censorship Experimentation

Jade Sheffey, Amir Houmansadr

TL;DR

Censorship evolves rapidly, which makes purely reactive circumvention research brittle and risky. CensorLab provides a generic censorship-emulation framework that can model past, present, and futuristic censorship strategies in realistic, high-performance environments, enabling proactive development of anti-censorship techniques. It combines a multi-layer architecture, user-written censor programs (via PyCL and CensorLang), and ML model integration through ONNX to express blocking decisions ranging from simple identifier-based filters to complex data-driven classifiers. By offering ease of use, low overhead, and safe lab-based testing, CensorLab aims to complement real-world censorship measurement and to advance resilient circumvention strategies that anticipate future censorship capabilities.

Abstract

Censorship and censorship circumvention are closely connected, and each is constantly making decisions in reaction to the other. When censors deploy a new Internet censorship technique, the anti-censorship community scrambles to find and develop circumvention strategies against the censor's new strategy, i.e., by targeting and exploiting specific vulnerabilities in the new censorship mechanism. We believe that over-reliance on such a reactive approach to circumvention has given the censors the upper hand in the censorship arms race, becoming a key reason for the inefficacy of in-the-wild circumvention systems. Therefore, we argue for a proactive approach to censorship research: the anti-censorship community should be able to proactively develop circumvention mechanisms against hypothetical or futuristic censorship strategies. To facilitate proactive censorship research, we design and implement CensorLab, a generic platform for emulating Internet censorship scenarios. CensorLab aims to complement currently reactive circumvention research by efficiently emulating past, present, and hypothetical censorship strategies in realistic network environments. Specifically, CensorLab aims to (1) support all censorship mechanisms previously or currently deployed by real-world censors; (2) support the emulation of hypothetical (not-yet-deployed) censorship strategies including advanced data-driven censorship mechanisms (e.g., ML-based traffic classifiers); (3) provide an easy-to-use platform for researchers and practitioners enabling them to perform extensive experimentation; and (4) operate efficiently with minimal overhead. We have implemented CensorLab as a fully functional, flexible, and high-performance platform, and showcase how it can be used to emulate a wide range of censorship scenarios, from traditional IP blocking and keyword filtering to hypothetical ML-based censorship mechanisms.

Paper Structure

This paper contains 12 sections, 3 figures, 4 tables, and 2 algorithms.

Figures (3)

  • Figure 1: The overall architecture of CensorLab. CensorLab is split into four main components: the IPC interface, which controls CensorLab at runtime; the per-layer packet processing, which handles parsing and identifier-based blocking; the per-connection environment, which performs the bulk of packet analysis; and the model store, which manages ML models. The details of the per-connection environment are explained further in Section \ref{sec:censor_program}.
  • Figure 2: In Tap mode, CensorLab intercepts packets using a netfilter queue firewall rule and issues a verdict for each packet. CensorLab can also write to the output interface directly for scenarios that netfilter queues do not support, such as injecting multiple packets.
  • Figure 3: In Wire mode, CensorLab acts as an intermediary between two physical network interfaces. Packets received from the ingress interface, if accepted, are written to the egress interface and vice versa.
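
The forwarding behavior described for Wire mode (accepted packets from one interface are written to the other, and vice versa) can be illustrated with a small in-memory sketch. This is not CensorLab's code or API: `Packet`, `Verdict`, `ToyCensor`, and `WireBridge` are hypothetical names introduced here, the two "interfaces" are plain queues rather than network devices, and the censor logic is a deliberately simple combination of IP blocking and keyword filtering.

```python
from collections import deque
from dataclasses import dataclass
from enum import Enum


class Verdict(Enum):
    ACCEPT = "accept"
    DROP = "drop"


@dataclass(frozen=True)
class Packet:
    """Hypothetical, heavily simplified packet: addresses plus raw payload."""
    src_ip: str
    dst_ip: str
    payload: bytes


class ToyCensor:
    """Illustrative censor: identifier-based check first, then a keyword scan."""

    def __init__(self, blocked_ips, keywords):
        self.blocked_ips = set(blocked_ips)
        self.keywords = [k.lower() for k in keywords]  # keywords are bytes

    def verdict(self, pkt):
        # Identifier-based blocking: cheap lookup on either endpoint.
        if pkt.src_ip in self.blocked_ips or pkt.dst_ip in self.blocked_ips:
            return Verdict.DROP
        # Keyword filtering: case-insensitive substring match on the payload.
        payload = pkt.payload.lower()
        if any(k in payload for k in self.keywords):
            return Verdict.DROP
        return Verdict.ACCEPT


class WireBridge:
    """Sits between two simulated interfaces, A and B."""

    def __init__(self, censor):
        self.censor = censor
        self.out_a = deque()  # packets written to interface A
        self.out_b = deque()  # packets written to interface B

    def from_a(self, pkt):
        return self._forward(pkt, self.out_b)

    def from_b(self, pkt):
        return self._forward(pkt, self.out_a)

    def _forward(self, pkt, egress):
        v = self.censor.verdict(pkt)
        if v is Verdict.ACCEPT:
            egress.append(pkt)  # only accepted packets reach the far side
        return v


censor = ToyCensor(blocked_ips={"203.0.113.7"}, keywords=[b"forbidden"])
bridge = WireBridge(censor)

v1 = bridge.from_a(Packet("10.0.0.2", "198.51.100.1", b"GET /news HTTP/1.1"))
v2 = bridge.from_a(Packet("10.0.0.2", "203.0.113.7", b"GET / HTTP/1.1"))
v3 = bridge.from_b(Packet("198.51.100.1", "10.0.0.2", b"a FORBIDDEN phrase"))
v4 = bridge.from_b(Packet("198.51.100.1", "10.0.0.2", b"HTTP/1.1 200 OK"))
```

Because the sketch inspects each packet in isolation, a keyword split across two packets would pass unnoticed; per-connection analysis of the kind attributed to the per-connection environment in Figure 1 is what addresses that case.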