CellularLint: A Systematic Approach to Identify Inconsistent Behavior in Cellular Network Specifications
Mirza Masfiqur Rahman, Imtiaz Karim, Elisa Bertino
TL;DR
CellularLint addresses a critical gap in 4G/5G standardization by systematically detecting inconsistencies in NAS and security specifications using a semi-automatic, domain-adapted NLP framework. It segments long protocol documents into sub-events, learns from limited labeled data through a multi-phase, active-learning pipeline, and uses an ensemble of transformers (EnCell) guided by domain-specific labeling mapped to Natural Language Inference. The approach identifies 157 inconsistencies with 82.67% accuracy and validates their real-world impact via open-source implementations and commercial UEs, revealing security, privacy, Denial-of-Service, and interoperability risks. The work demonstrates a scalable path to improve clarity and reliability of cellular standards and provides a foundation for applying similar methods to other complex protocol documents.
Abstract
In recent years, there has been a growing focus on scrutinizing the security of cellular networks, with security vulnerabilities often attributed to issues in the underlying protocol design descriptions. These protocol design specifications, typically extensive documents that are thousands of pages long, can harbor inaccuracies, underspecifications, implicit assumptions, and internal inconsistencies. To address this, we introduce CellularLint, a semi-automatic framework that uses a suite of natural language processing techniques to detect inconsistencies within the 4G and 5G standards. Our proposed method uses a revamped few-shot learning mechanism on domain-adapted large language models. Because these models are pre-trained on a vast corpus of cellular network protocols, the method enables CellularLint to simultaneously detect inconsistencies at various levels of semantics and practical use cases. In doing so, CellularLint significantly advances the automated analysis of protocol specifications in a scalable fashion. In our investigation, we focused on the Non-Access Stratum (NAS) and the security specifications of 4G and 5G networks, ultimately uncovering 157 inconsistencies with 82.67% accuracy. After verifying these inconsistencies on open-source implementations and 17 commercial devices, we confirm that they have a substantial impact on design decisions, potentially leading to concerns related to privacy, integrity, availability, and interoperability.
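As a rough illustration of the idea summarized above (an ensemble of classifiers producing Natural Language Inference labels over pairs of specification sentences), the following minimal Python sketch combines per-model labels and flags pairs judged contradictory. Everything in it is an assumption made for illustration: majority voting is only one plausible way to combine models and is not necessarily how EnCell does it, treating the "contradiction" label as an inconsistency is a simplification of the domain-specific labeling, and the stub classifiers and example sentences are toy stand-ins for transformers and real specification text.

```python
from collections import Counter
from typing import Callable, List, Tuple

# A classifier maps (premise, hypothesis) to one NLI label:
# "entailment", "neutral", or "contradiction".
Classifier = Callable[[str, str], str]
Pair = Tuple[str, str]


def majority_vote(labels: List[str]) -> str:
    """Return the most common label; ties fall back to 'neutral' (assumed policy)."""
    if not labels:
        raise ValueError("at least one label is required")
    counts = Counter(labels).most_common()
    if len(counts) > 1 and counts[0][1] == counts[1][1]:
        return "neutral"
    return counts[0][0]


def ensemble_label(models: List[Classifier], premise: str, hypothesis: str) -> str:
    """Ask every model for a label and combine the answers by majority vote."""
    return majority_vote([model(premise, hypothesis) for model in models])


def find_inconsistencies(models: List[Classifier], pairs: List[Pair]) -> List[Pair]:
    """Keep the pairs the ensemble labels as 'contradiction' (assumed mapping)."""
    return [p for p in pairs if ensemble_label(models, p[0], p[1]) == "contradiction"]


# Hypothetical stand-ins for trained models, used only to make the sketch runnable.
def stub_negation(premise: str, hypothesis: str) -> str:
    """Toy rule: contradiction if exactly one sentence contains 'shall not'."""
    differs = ("shall not" in premise) != ("shall not" in hypothesis)
    return "contradiction" if differs else "neutral"


def stub_always_neutral(premise: str, hypothesis: str) -> str:
    """Toy model that never commits to a relation."""
    return "neutral"


# Invented sentences, not quotations from any specification.
toy_pairs = [
    ("The device shall delete the identifier.",
     "The device shall not delete the identifier."),
    ("The device shall start the timer.",
     "The device shall start the timer."),
]
toy_models = [stub_negation, stub_negation, stub_always_neutral]
flagged = find_inconsistencies(toy_models, toy_pairs)
```

In a realistic setting the stubs would be replaced by domain-adapted models, and flagged pairs would still go to a human reviewer, in keeping with the semi-automatic nature of the framework.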
