From Topology to Behavioral Semantics: Enhancing BGP Security by Understanding BGP's Language with LLMs
Heng Zhao, Ruoyu Wang, Tianhang Zheng, Qi Li, Bo Lv, Yuyi Wang, Wenliang Du
TL;DR
This paper addresses BGP security by moving beyond topology-centric ML to semantic AS embeddings derived from LLMs. It introduces BGPShield, which combines a segment-wise LLM-based Semantic Encoder, a Lightweight Contrastive Dimensionality Reduction network, and AR-DTW-driven anomaly detection to capture each AS's Behavior Portrait and Routing Policy Rationale. The approach embeds unseen ASes rapidly, detects anomalies with adaptive thresholds, and aggregates events with AS attribution. It outperforms state-of-the-art methods across 16 real-world datasets and shows strong real-time scalability and generalization. Overall, BGPShield offers a deployable framework for scalable, explainable BGP security that uses semantic representations and semantics-aware path analysis to keep inter-domain routing resilient against misconfigurations and hijacking attempts.
Abstract
The trust-based nature of the Border Gateway Protocol (BGP) makes it vulnerable to disruptions such as prefix hijacking and misconfigurations, which threaten routing stability. Traditional detection relies on manual inspection and scales poorly. Machine/Deep Learning (M/DL) approaches automate detection but suffer from suboptimal precision, limited generalizability, and high retraining costs. These weaknesses arise because existing methods focus on topological structure rather than the comprehensive semantic characteristics of Autonomous Systems (ASes), so they often misinterpret ASes that are functionally similar but topologically distant. To address this, we propose BGPShield, an anomaly detection framework built on LLM embeddings that captures the Behavior Portrait and Routing Policy Rationale of each AS, covering characteristics beyond topology such as operational scale and global role. We propose a segment-wise aggregation scheme that transforms AS descriptions into LLM representations without information loss, and a lightweight contrastive reduction network that compresses them into semantically consistent, lower-dimensional representations. Using these representations, our AR-DTW algorithm aligns and accumulates semantic distances to reveal behavioral inconsistencies. Evaluated on 16 real-world datasets, BGPShield detects 100% of verified anomalies with a false discovery rate below 5%. Notably, the employed LLMs were released prior to the evaluation events, verifying generalizability. Furthermore, BGPShield constructs representations for unseen ASes within one second, significantly outperforming BEAM, which demands costly retraining (averaging 65 hours).
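To make "aligns and accumulates semantic distances" concrete, the following is a minimal sketch of classic dynamic time warping (DTW) over two AS paths whose hops have been replaced by embedding vectors. It is not the AR-DTW algorithm itself, whose specific modifications are not described in this abstract. The function names, the use of cosine distance, and the 2-D toy embeddings are all assumptions made for illustration.

```python
import math


def cosine_distance(u, v):
    """Cosine distance between two vectors (assumed metric, not from the paper)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 1.0  # treat a zero vector as maximally dissimilar
    return 1.0 - dot / (norm_u * norm_v)


def dtw_distance(path_a, path_b, dist=cosine_distance):
    """Plain DTW: accumulated distance along the cheapest alignment of two
    sequences of AS embeddings. Standard textbook recurrence, not AR-DTW."""
    n, m = len(path_a), len(path_b)
    if n == 0 or m == 0:
        raise ValueError("both paths must contain at least one hop")
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = dist(path_a[i - 1], path_b[j - 1])
            # Extend the cheapest of: skip in a, skip in b, or match both.
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]


# Hypothetical 2-D embeddings for three ASes (toy values, illustration only).
transit = (1.0, 0.0)
regional = (0.8, 0.6)
stub = (0.0, 1.0)

old_path = [transit, regional, stub]
prepended_path = [transit, transit, regional, stub]  # same ASes, one repeated hop
changed_path = [transit, stub, stub]                 # middle hop replaced

same = dtw_distance(old_path, prepended_path)
changed = dtw_distance(old_path, changed_path)
```

In this toy setup, a path that only repeats a hop aligns with the original at near-zero cost, while a path whose middle hop is replaced by a semantically different AS accumulates a positive distance. A detector would then compare such accumulated distances against a threshold; how BGPShield sets that threshold is not specified here.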
