Towards Auto-Modeling of Formal Verification for NextG Protocols: A Multimodal cross- and self-attention Large Language Model Approach
Jingda Yang, Ying Wang
TL;DR
This work tackles the scalability and trustworthiness challenges of formal verification for NextG protocols by introducing AVRE, a system that combines a cross-attention–based LLM (CAL) with real-world prompting from HyFuzz to translate protocol descriptions into dependency graphs and formal models. CAL leverages multimodal cross- and self-attention to identify and quantify dependencies among protocol identifiers and properties, under a weight-balanced loss to cope with imbalanced data. The approach is validated on 3GPP Rel-17 RRC protocols, achieving 95.94% accuracy and 0.98 AUC, and is demonstrated to produce design-intent and trustworthy enhancements via real-world testbeds, enabling subsequent formal verification steps (e.g., Proverif) with reduced manual labeling. The results indicate that integrating learning-based interpretation with empirical prompting can significantly advance scalable, trustworthy verification and vulnerability discovery for complex communication systems.
Abstract
This paper introduces Auto-modeling of Formal Verification with Real-world Prompting for 5G and NextG protocols (AVRE), a novel system designed for the formal verification of Next Generation (NextG) communication protocols, addressing the increasing complexity and scalability challenges in network protocol design and verification. Utilizing Large Language Models (LLMs), AVRE transforms protocol descriptions into dependency graphs and formal models, efficiently resolving ambiguities and capturing design intent. The system integrates a transformer model with LLMs to autonomously establish quantifiable dependency relationships through cross- and self-attention mechanisms. Enhanced by iterative feedback from the HyFuzz experimental platform, AVRE significantly advances the accuracy and relevance of formal verification in complex communication protocols, offering a groundbreaking approach to validating sophisticated communication systems. We compare CAL's performance with state-of-the-art LLM-based models and traditional time sequence models, demonstrating its superiority in accuracy and robustness, achieving an accuracy of 95.94\% and an AUC of 0.98. This NLP-based approach enables, for the first time, the creation of exploits directly from design documents, making remarkable progress in scalable system verification and validation.
