HubScan: Detecting Hubness Poisoning in Retrieval-Augmented Generation Systems

Idan Habler; Vineeth Sai Narajala; Stav Koren; Amy Chang; Tiffany Saade

Abstract

Retrieval-Augmented Generation (RAG) systems are essential to contemporary AI applications, allowing large language models to obtain external knowledge via vector similarity search. These systems nevertheless face a significant security weakness: hubness, in which a small number of items (hubs) appear in the top-k retrieval results for a disproportionately large number of varied queries. Hubs can be exploited to introduce harmful content, alter search rankings, bypass content filtering, and degrade system performance. We introduce HubScan, an open-source security scanner that evaluates vector indices and embeddings to identify hubs in RAG systems. HubScan uses a multi-detector architecture that integrates: (1) robust statistical hubness detection using median/MAD-based z-scores, (2) cluster spread analysis to assess cross-cluster retrieval patterns, (3) stability testing under query perturbations, and (4) domain-aware and modality-aware detection for category-specific and cross-modal attacks. HubScan supports several vector databases (FAISS, Pinecone, Qdrant, Weaviate) and multiple retrieval techniques, including vector similarity, hybrid search, and lexical matching with reranking. We evaluate HubScan on Food-101, MS-COCO, and FiQA adversarial hubness benchmarks constructed using state-of-the-art gradient-optimized and centroid-based hub generation methods. HubScan achieves 90% recall at a 0.2% alert budget and 100% recall at 0.4%, with adversarial hubs ranking above the 99.8th percentile. Domain-scoped scanning recovers 100% of targeted attacks that evade global detection. Production validation on 1M real web documents from MS MARCO demonstrates significant score separation between clean documents and adversarial content. Our work provides a practical, extensible framework for detecting hubness threats in production RAG systems.
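To make the first detector concrete, the following is a minimal sketch of median/MAD-based hub scoring. It is an illustration and not HubScan's actual implementation: the function names `hub_counts` and `robust_z`, the brute-force dot-product retrieval, and the fallback scale for a zero MAD are all assumptions introduced here. The constant 1.4826 is the standard factor that makes the MAD comparable to a standard deviation under normality.

```python
import numpy as np


def hub_counts(doc_emb, query_emb, k):
    """Count how often each document lands in the top-k results.

    Assumes rows are embeddings and that a plain dot product is the
    similarity (cosine similarity if rows are unit-normalized).
    """
    sims = query_emb @ doc_emb.T
    # Indices of the k most similar documents per query (order within k is irrelevant).
    topk = np.argpartition(-sims, k - 1, axis=1)[:, :k]
    return np.bincount(topk.ravel(), minlength=doc_emb.shape[0])


def robust_z(counts):
    """Robust z-score of hub counts using the median and MAD."""
    counts = np.asarray(counts, dtype=float)
    med = np.median(counts)
    mad = np.median(np.abs(counts - med))
    # Fallback when more than half the counts equal the median (assumption).
    scale = 1.4826 * mad if mad > 0 else 1.0
    return (counts - med) / scale


# Hypothetical hit counts for seven documents; the last one behaves like a hub.
example_counts = np.array([1, 2, 3, 2, 1, 2, 50])
example_scores = robust_z(example_counts)
suspect = int(np.argmax(example_scores))
```

Because the median and MAD are barely affected by a handful of extreme values, a planted hub does not inflate the scale it is measured against, which is the usual motivation for preferring them over the mean and standard deviation in outlier detection.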

Paper Structure (49 sections, 3 equations, 5 figures, 4 tables, 3 algorithms)

Introduction
The Hubness Threat
Real-World Attack Incidents
Challenges in Detection
Contributions
Related Work
Hubness in High-Dimensional Spaces
Hubness Attacks
Common Retrieval Methods
Methodology
Threat Model and Problem Definition
Hubness Background
System Architecture
Query Sampling Strategy
Robust Statistical Framework
...and 34 more sections

Figures (5)

Figure 1: HubScan detection pipeline overview showing the multi-stage process from input to verdict assignment.
Figure 2: Key detection metrics and their interpretation: Hub z-score measures statistical anomaly, cluster entropy captures cross-cluster spread, stability indicates robustness to perturbations, and combined scores provide holistic risk assessment.
Figure 3: Three detection modes: Global detection analyzes all queries together, Domain-Aware detection groups queries by semantic domains, and Modality-Aware detection handles cross-modal attacks.
Figure 4: Cross-modal hub detection: A hub (red star) positioned at the intersection of text and image modalities appears in top-$k$ results for queries from both modalities, exploiting the modality boundary.
Figure 5: Hubness score distribution for normal documents vs. planted adversarial hubs on Food-101 and MS-COCO. Normal items cluster near zero while adversarial hubs are extreme outliers (z-score $>20$). At a 0.1% alert budget, all planted hubs rank above the 99.8th percentile, achieving 100% recall with minimal false positives.
