ATLANTIS: AI-driven Threat Localization, Analysis, and Triage Intelligence System

Taesoo Kim, HyungSeok Han, Soyeon Park, Dae R. Jeong, Dohyeok Kim, Dongkwan Kim, Eunsoo Kim, Jiho Kim, Joshua Wang, Kangsu Kim, Sangwoo Ji, Woosun Song, Hanqing Zhao, Andrew Chin, Gyejin Lee, Kevin Stevens, Mansour Alharthi, Yizhuo Zhai, Cen Zhang, Joonun Jang, Yeongjin Jang, Ammar Askar, Dongju Kim, Fabian Fleischer, Jeongin Cho, Junsik Kim, Kyungjoon Ko, Insu Yun, Sangdon Park, Dowoo Baik, Haein Lee, Hyeon Heo, Minjae Gwon, Minjae Lee, Minwoo Baek, Seunggi Min, Wonyoung Kim, Yonghwi Jin, Younggi Park, Yunjae Choi, Jinho Jung, Gwanhyun Lee, Junyoung Jang, Kyuheon Kim, Yeonghyeon Cha, Youngjoon Kim

TL;DR

ATLANTIS integrates AI with traditional program analysis to create autonomous cyber reasoning systems capable of discovering and patching vulnerabilities at scale. The project demonstrates a multi-module, multilingual approach (Atlantis, Atlantis-C, Atlantis-Java, Atlantis-Multilang) that combines LLMs, symbolic execution, fuzzing ensembles, sinkpoint-aware strategies, and patch-generation frameworks (Crete) to achieve high patch success rates across real open-source challenge projects (CPs). A key contribution is the orchestration of resource-aware, cross-language workflows that optimize LLM usage, budget constraints, and latency, while maintaining rigorous validation via SARIF reports and proofs of vulnerability (PoVs). The work showcases substantive improvements in vulnerability discovery, patch generation, and SARIF validation, underscoring the practical viability of AI-driven cyber reasoning at scale and providing reproducible artifacts for future research.

Abstract

We present ATLANTIS, the cyber reasoning system developed by Team Atlanta that won 1st place in the Final Competition of DARPA's AI Cyber Challenge (AIxCC) at DEF CON 33 (August 2025). AIxCC (2023-2025) challenged teams to build autonomous cyber reasoning systems capable of discovering and patching vulnerabilities at the speed and scale of modern software. ATLANTIS integrates large language models (LLMs) with program analysis -- combining symbolic execution, directed fuzzing, and static analysis -- to address limitations in automated vulnerability discovery and program repair. Developed by researchers at Georgia Institute of Technology, Samsung Research, KAIST, and POSTECH, the system addresses core challenges: scaling across diverse codebases from C to Java, achieving high precision while maintaining broad coverage, and producing semantically correct patches that preserve intended behavior. We detail the design philosophy, architectural decisions, and implementation strategies behind ATLANTIS, share lessons learned from pushing the boundaries of automated security when program analysis meets modern AI, and release artifacts to support reproducibility and future research.

Paper Structure

This paper contains 164 sections, 8 equations, 64 figures, 26 tables, and 5 algorithms.

Figures (64)

  • Figure 1: The overview of Atlantis.
  • Figure 2: Overall Design of Atlantis-C
  • Figure 3: Bullseye architecture. During static analysis, landmark selection and distance calculation are performed, and the binary is instrumented with this information. During fuzzing, landmarks are used to calculate discovery, while the power scheduler leverages distance and discovery metrics to allocate seed energy. Additionally, a queue of favored seeds is maintained, prioritizing those with higher landmark hits and better distance scores.
  • Figure 4: Example CPV from AIxCC Semifinal Jenkins CP
  • Figure 5: Overview of Atlantis-Java
  • ...and 59 more figures
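
The Figure 3 caption describes a power scheduler that uses distance and discovery metrics to allocate seed energy, plus a queue of favored seeds ranked by landmark hits and distance. The following is a minimal sketch of that idea only. The `Seed` fields, the equal weighting of closeness and discovery, the `base` and `max_factor` parameters, and the function names are all assumptions made for illustration; they are not taken from the paper or from Bullseye's implementation.

```python
import heapq
from dataclasses import dataclass


@dataclass
class Seed:
    name: str
    distance: float      # assumed metric: distance to the target, lower is closer
    landmark_hits: int   # assumed metric: number of landmarks this seed reaches


def seed_energy(seed, max_distance, total_landmarks, base=16, max_factor=8.0):
    """Return an illustrative energy (number of mutations) for one seed.

    Assumed scoring: closeness and discovery are each normalized to [0, 1]
    and averaged, then mapped linearly onto [base, base * max_factor].
    """
    if max_distance <= 0:
        raise ValueError("max_distance must be positive")
    closeness = 1.0 - min(seed.distance, max_distance) / max_distance
    discovery = seed.landmark_hits / total_landmarks if total_landmarks else 0.0
    score = 0.5 * closeness + 0.5 * discovery
    factor = 1.0 + (max_factor - 1.0) * score
    return int(round(base * factor))


def favored_queue(seeds, k):
    """Pick the k seeds with the most landmark hits, breaking ties by distance."""
    return heapq.nsmallest(k, seeds, key=lambda s: (-s.landmark_hits, s.distance))


if __name__ == "__main__":
    corpus = [Seed("a", 10.0, 1), Seed("b", 2.0, 3), Seed("c", 2.0, 1)]
    for s in corpus:
        print(s.name, seed_energy(s, max_distance=10.0, total_landmarks=4))
    print([s.name for s in favored_queue(corpus, 2)])
```

In this sketch a seed that is both close to the target and reaches many landmarks receives the most energy, while the favored queue orders seeds by landmark hits first and distance second. A real directed fuzzer would likely compute these metrics from instrumentation feedback rather than from fixed fields.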

Theorems & Definitions (1)

  • Remark