Security of Language Models for Code: A Systematic Literature Review
Yuchen Chen, Weisong Sun, Chunrong Fang, Zhenpeng Chen, Yifei Ge, Tingxu Han, Quanjun Zhang, Yang Liu, Zhenyu Chen, Baowen Xu
TL;DR
This systematic literature review analyzes the security of language models for code (CodeLMs), organizing 67 papers around backdoor and adversarial threats, defenses, empirical findings, and experimental practices. It finds that data-poisoning backdoor attacks remain prevalent and that black-box adversarial attacks dominate practical evaluations, while defenses are still at an early stage and often come at a cost to model performance. The study provides a structured taxonomy, a consolidated view of datasets and metrics, and a curated set of open-source artifacts to promote reproducibility and future work. The findings underscore the need for multi-stage, explainable defenses and standardized evaluation to strengthen the security of CodeLMs in real-world software engineering settings.
Abstract
Language models for code (CodeLMs) have emerged as powerful tools for code-related tasks, outperforming traditional methods and standard machine learning approaches. However, these models are susceptible to security vulnerabilities, drawing increasing research attention from domains such as software engineering, artificial intelligence, and cybersecurity. Despite the growing body of research focused on the security of CodeLMs, a comprehensive survey in this area remains absent. To address this gap, we systematically review 67 relevant papers, organizing them based on attack and defense strategies. Furthermore, we provide an overview of commonly used language models, datasets, and evaluation metrics, and highlight open-source tools and promising directions for future research in securing CodeLMs.
