Enhancing GraphQL Security by Detecting Malicious Queries Using Large Language Models, Sentence Transformers, and Convolutional Neural Networks

Irash Perera; Hiranya Abeyrathne; Sanjeewa Malalgoda; Arshardh Ifthikar

TL;DR

This work addresses threats specific to GraphQL queries, including DoS, injection, XSS, and SSRF, with a real-time, AI-driven detection framework. It combines LLM-driven dynamic analysis of the schema definition (SDL) with SBERT/Doc2Vec embeddings and CNN, Random Forest, and MLP classifiers, enabling schema-aware, context-rich detection while mitigating DoS and SSRF risks. The approach is implemented in a production-oriented architecture featuring asynchronous processing, a centralized model loader, ONNX-accelerated inference, and parallel vulnerability detection, with data collected from SQL injection, OS command injection, and XSS datasets. Results show high detection accuracy across threat vectors. A load analysis identifies CPU-bound ML inference as the main bottleneck and underscores the need for hardware accelerators to achieve real-time scalability in production.

Abstract

GraphQL's flexibility, while beneficial for efficient data fetching, introduces unique security vulnerabilities that traditional API security mechanisms often fail to address. Malicious GraphQL queries can exploit the language's dynamic nature, leading to denial-of-service attacks, data exfiltration through injection, and other exploits. Existing solutions, such as static analysis, rate limiting, and general-purpose Web Application Firewalls, offer limited protection against sophisticated, context-aware attacks. This paper presents a novel, AI-driven approach for real-time detection of malicious GraphQL queries. Our method combines static analysis with machine learning techniques, including Large Language Models (LLMs) for dynamic schema-based configuration, Sentence Transformers (SBERT and Doc2Vec) for contextual embedding of query payloads, and Convolutional Neural Networks (CNNs), Random Forests, and Multilayer Perceptrons for classification. We detail the system architecture, implementation strategies optimized for production environments (including ONNX Runtime optimization and parallel processing), and evaluate the performance of our detection models and the overall system under load. Results demonstrate high accuracy in detecting various threats, including SQL injection, OS command injection, and XSS exploits, alongside effective mitigation of DoS and SSRF attempts. This research contributes a robust and adaptable solution for enhancing GraphQL API security.
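
To make the static-analysis side of DoS mitigation concrete, the following is a minimal sketch of one common check: rejecting queries whose selection sets nest too deeply. It is an illustration under stated assumptions, not the paper's implementation. The function names `query_depth` and `exceeds_depth_limit` and the default limit of 5 are hypothetical, and the scanner only counts braces outside string literals; a production system would walk a parsed query AST instead.

```python
def query_depth(query: str) -> int:
    """Return the maximum brace nesting depth of a GraphQL query string.

    Simplified scanner (an assumption for illustration): braces inside
    double-quoted string literals are ignored, but comments are not
    handled and input-object literals in arguments also count as nesting.
    """
    depth = 0
    max_depth = 0
    in_string = False
    escaped = False
    for ch in query:
        if in_string:
            if escaped:
                escaped = False      # character after a backslash is literal
            elif ch == "\\":
                escaped = True
            elif ch == '"':
                in_string = False    # closing quote ends the literal
            continue
        if ch == '"':
            in_string = True
        elif ch == "{":
            depth += 1
            max_depth = max(max_depth, depth)
        elif ch == "}":
            depth = max(depth - 1, 0)  # tolerate unbalanced input
    return max_depth


def exceeds_depth_limit(query: str, limit: int = 5) -> bool:
    """Flag a query whose nesting depth is greater than the limit.

    The default limit is a hypothetical value, not one from the paper.
    """
    return query_depth(query) > limit


if __name__ == "__main__":
    nested = "{ user { posts { comments { author { name } } } } }"
    print(query_depth(nested), exceeds_depth_limit(nested, limit=3))
```

A check like this is cheap enough to run before any model inference, so over-deep queries can be rejected without consuming CPU on embedding or classification.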
