Table of Contents
Fetching ...

Malware analysis assisted by AI with R2AI

Axelle Apvrille, Daniel Nakov

TL;DR

The paper evaluates AI-assisted malware analysis using r2ai, a Radare2 plugin, on Linux and IoT malware samples from 2024–2025. It compares multiple LLMs, finding Claude 3.5/3.7 Sonnet to deliver the best decompilation quality and explanatory output, while human analysts are still essential to guide and verify AI results. AI assistance accelerates analysis and often matches or surpasses human-only analyses, but introduces challenges such as hallucinations, omissions, and context-length limits that require vigilant oversight. Costs depend on the chosen mode and model, with auto mode offering faster results at higher expense, yet overall often remaining cheaper than hiring dedicated analysts when properly managed. The work demonstrates practical AI-assisted reverse engineering workflows, highlighting both the speedups and safety considerations necessary for deployment in malware analysis practice.

Abstract

This research studies the quality, speed and cost of malware analysis assisted by artificial intelligence. It focuses on Linux and IoT malware of 2024-2025, and uses r2ai, the AI extension of Radare2's disassembler. Not all malware and not all LLMs are equivalent but the study shows excellent results with Claude 3.5 and 3.7 Sonnet. Despite a few errors, the quality of analysis is overall equal or better than without AI assistance. For good results, the AI cannot operate alone and must constantly be guided by an experienced analyst. The gain of speed is largely visible with AI assistance, even when taking account the time to understand AI's hallucinations, exaggerations and omissions. The cost is usually noticeably lower than the salary of a malware analyst, but attention and guidance is needed to keep it under control in cases where the AI would naturally loop without showing progress.

Malware analysis assisted by AI with R2AI

TL;DR

The paper evaluates AI-assisted malware analysis using r2ai, a Radare2 plugin, on Linux and IoT malware samples from 2024–2025. It compares multiple LLMs, finding Claude 3.5/3.7 Sonnet to deliver the best decompilation quality and explanatory output, while human analysts are still essential to guide and verify AI results. AI assistance accelerates analysis and often matches or surpasses human-only analyses, but introduces challenges such as hallucinations, omissions, and context-length limits that require vigilant oversight. Costs depend on the chosen mode and model, with auto mode offering faster results at higher expense, yet overall often remaining cheaper than hiring dedicated analysts when properly managed. The work demonstrates practical AI-assisted reverse engineering workflows, highlighting both the speedups and safety considerations necessary for deployment in malware analysis practice.

Abstract

This research studies the quality, speed and cost of malware analysis assisted by artificial intelligence. It focuses on Linux and IoT malware of 2024-2025, and uses r2ai, the AI extension of Radare2's disassembler. Not all malware and not all LLMs are equivalent but the study shows excellent results with Claude 3.5 and 3.7 Sonnet. Despite a few errors, the quality of analysis is overall equal or better than without AI assistance. For good results, the AI cannot operate alone and must constantly be guided by an experienced analyst. The gain of speed is largely visible with AI assistance, even when taking account the time to understand AI's hallucinations, exaggerations and omissions. The cost is usually noticeably lower than the salary of a malware analyst, but attention and guidance is needed to keep it under control in cases where the AI would naturally loop without showing progress.

Paper Structure

This paper contains 19 sections, 12 figures, 10 tables.

Figures (12)

  • Figure 1: Supported commands by r2ai plugin on April 8, 2025
  • Figure 2: Example of r2ai direct request
  • Figure 3: Example where the end-user issued command r2ai -d which decompiles a given function. In this particular case, the LLM was Mistral's codestral-latest. R2ai provides to the AI the function's pseudo code (output of r2 command pdc). An API key is used to access Mistral via the authorization header. Based on this context, the AI is expected to answer with corresponding code in C.
  • Figure 4: Questions pile in the context sent to the AI
  • Figure 5: Definition of the r2cmd tool, sent in a context to Anthropic Claude 3.7 Sonnet
  • ...and 7 more figures