Software Vulnerability Analysis Across Programming Language and Program Representation Landscapes: A Survey

Zhuoyun Qian, Fangtian Zhong, Qin Hu, Yili Jiang, Jiaqi Huang, Mengfei Ren, Jiguo Yu

TL;DR

This survey addresses the fragmentation in software vulnerability detection by systematically comparing detection techniques across multiple programming languages, program representations, and bug types. It covers binary, IR, and source-code analyses for C/C++/Rust, Java/Android, and JavaScript/PHP/Python, highlighting static, dynamic, hybrid, and ML-driven methods. The authors propose a language-aware framework and provide a reproducible workflow, including a public repository with datasets and code. The work identifies key gaps, especially in cross-language source-level analysis and benchmarking, and outlines directions for robust, scalable vulnerability detection applicable to heterogeneous, polyglot software ecosystems.

Abstract

Modern software systems are developed in diverse programming languages and often harbor critical vulnerabilities that attackers can exploit to compromise security. These vulnerabilities have been actively targeted in real-world attacks, causing substantial harm to users and cyberinfrastructure. Since many of these flaws originate from the code itself, a variety of techniques have been proposed to detect and mitigate them prior to software deployment. However, a comprehensive comparative study that spans different programming languages, program representations, bug types, and analysis techniques is still lacking. As a result, the relationships among programming languages, abstraction levels, vulnerability types, and detection approaches remain fragmented, and the limitations and research gaps across the landscape are not clearly understood. This article aims to bridge that gap by systematically examining widely used programming languages, levels of program representation, categories of vulnerabilities, and mainstream detection techniques. The survey provides a detailed understanding of current practices in vulnerability discovery, highlighting their strengths, limitations, and distinguishing characteristics. Furthermore, it identifies persistent challenges and outlines promising directions for future research in the field of software security.

Paper Structure

This paper contains 52 sections, 3 figures, 5 tables.

Figures (3)

  • Figure 1: Classification of Bugs
  • Figure 2: Overview of The Survey
  • Figure 3: Overview of Analysis Techniques