SoK: Understanding (New) Security Issues Across AI4Code Use Cases

Qilong Wu; Taoran Li; Tianyang Zhou; Varun Chandrasekaran

TL;DR

This SoK analyzes security across three AI4Code use cases (code generation, vulnerability detection, and code translation) and highlights systemic gaps such as Python-dominated benchmarks, insecure outputs, and weak robustness. It synthesizes a broad experimental program examining misalignment, vulnerability reproduction, and translation effects, and finds that higher functional performance often coexists with weaker security and that robustness must be evaluated beyond standard accuracy. The work proposes secure-by-default practices, robust benchmarks, and translation-based security refactoring as pathways to safer AI4Code deployment, and outlines 11 future directions for embedding security throughout the development life cycle. Collectively, it reframes AI4Code development as security-first engineering, emphasizing adversarial resilience, privacy safeguards, and trustworthy governance across tools and pipelines.

Abstract

AI-for-Code (AI4Code) systems are reshaping software engineering, with tools like GitHub Copilot accelerating code generation, translation, and vulnerability detection. Alongside these advances, however, security risks remain pervasive: insecure outputs, biased benchmarks, and susceptibility to adversarial manipulation undermine their reliability. This SoK surveys the landscape of AI4Code security across three core applications, identifying recurring gaps: benchmark dominance by Python and toy problems, lack of standardized security datasets, data leakage in evaluation, and fragile adversarial robustness. A comparative study of six state-of-the-art models illustrates these challenges: insecure patterns persist in code generation, vulnerability detection is brittle to semantic-preserving attacks, fine-tuning often misaligns security objectives, and code translation yields uneven security benefits. From this analysis, we distill three forward paths: embedding secure-by-default practices in code generation, building robust and comprehensive detection benchmarks, and leveraging translation as a route to security-enhanced languages. We call for a shift toward security-first AI4Code, where vulnerability mitigation and robustness are embedded throughout the development life cycle.
