Security Attacks on LLM-based Code Completion Tools

Wen Cheng; Ke Sun; Xinyu Zhang; Wei Wang

TL;DR

This paper investigates security vulnerabilities in LLM-based Code Completion Tools (LCCTs), focusing on jailbreaking and training data extraction attacks that exploit LCCT-specific input workflows and proprietary training data. It introduces a structured attack framework (Contextual Information Aggregation, Hierarchical Code Exploitation, and Code-Driven Privacy Extraction) and reports a 99.4% jailbreaking success rate on GitHub Copilot and 46.3% on Amazon Q, along with privacy leakage from GitHub Copilot: 54 real email addresses and 314 physical addresses associated with GitHub usernames. The experiments extend to general-purpose LLMs (GPT-3.5, GPT-4, GPT-4o) to reveal a broader misalignment in code-handling security, and include ablations showing the importance of guiding prompts, embedding strategies, and programming language choices. The findings underscore the urgent need for defense-in-depth at both input processing and output post-processing to mitigate privacy risks and unsafe outputs as LCCT adoption grows.

Abstract

The rapid development of large language models (LLMs) has significantly advanced code completion capabilities, giving rise to a new generation of LLM-based Code Completion Tools (LCCTs). Unlike general-purpose LLMs, these tools possess unique workflows, integrating multiple information sources as input and prioritizing code suggestions over natural language interaction, which introduces distinct security challenges. Additionally, LCCTs often rely on proprietary code datasets for training, raising concerns about the potential exposure of sensitive data. This paper exploits these distinct characteristics of LCCTs to develop targeted attack methodologies for two critical security risks: jailbreaking and training data extraction attacks. Our experimental results expose significant vulnerabilities within LCCTs, including a 99.4% success rate in jailbreaking attacks on GitHub Copilot and a 46.3% success rate on Amazon Q. Furthermore, we successfully extracted sensitive user data from GitHub Copilot, including 54 real email addresses and 314 physical addresses associated with GitHub usernames. Our study also demonstrates that these code-based attack methods are effective against general-purpose LLMs, such as the GPT series, highlighting a broader security misalignment in the handling of code by modern LLMs. These findings underscore critical security challenges associated with LCCTs and suggest essential directions for strengthening their security frameworks. The example code and attack samples from our research are provided at https://github.com/Sensente/Security-Attacks-on-LCCTs.
