LLM-Driven Kernel Evolution: Automating Driver Updates in Linux

Arina Kharlamova, Jiawen Liu, Tianyi Zhang, Xinrui Yang, Humaid Alqasimi, Youcheng Sun, Chun Jason Xue

TL;DR

This work tackles the persistent problem of Linux kernel–driver maintenance amid ongoing kernel evolution. It introduces DriveBench, a reproducible executable corpus of kernel–driver co-evolution cases, and AutoDriver, a closed-loop, multi-agent, LLM-driven system that localizes changes, synthesizes patches, and validates them through static, compile-time, and runtime checks. The approach hinges on taxonomy-guided prompting, dependency-aware localization using Linux LSP, and a staged refinement loop that integrates build and QEMU-based validation to preserve functional and security invariants. Empirical evaluation demonstrates that taxonomy-aware prompting and iterative refinement improve patch quality and compilation success, with DeepSeek outperforming a general-purpose model in structural alignment and runtime viability. Overall, the framework provides a practical, reproducible path toward continuous co-evolution of drivers with the Linux kernel, with potential applicability to other large evolving codebases and safety-critical software domains.

Abstract

Linux kernel evolution breaks drivers through API/ABI changes, semantic shifts, and security-hardening updates. We introduce DRIVEBENCH, an executable corpus of kernel$\rightarrow$driver co-evolution cases, and AUTODRIVER, a closed-loop, LLM-driven system for automating driver maintenance. The system integrates prompt engineering, multi-agent collaboration, static analysis, and iterative validation to ensure that generated patches are not only syntactically correct but also functionally and semantically consistent with kernel conventions. The corpus spans v5.10–v6.10 with 235 validated cases drawn from 612 candidates. In evaluation across 55 cases, AUTODRIVER achieves 56.4% compilation success; QEMU-based boot verification indicates that compiled patches preserve driver initialization in most instances. By releasing DRIVEBENCH and tooling, we enable reproducible research and a practical route to continuous, safe co-evolution of drivers with the Linux kernel.
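
As a quick sanity check on the figures quoted in the abstract, the short sketch below recomputes the corpus retention rate and the number of compiling cases implied by the reported percentage. The count of compiled cases is an inference from 56.4% of 55, not a number stated in the abstract, and the variable names are illustrative.

```python
# Figures taken from the abstract.
candidates, validated = 612, 235
evaluated, reported_rate = 55, 0.564

# Share of candidate cases that survived validation into the corpus.
retention = validated / candidates

# Integer number of compiling cases implied by the reported rate (inferred).
compiled = round(reported_rate * evaluated)

print(f"corpus retention: {retention:.1%}")
print(f"implied compiled cases: {compiled} of {evaluated} ({compiled / evaluated:.1%})")
```

Only one integer count rounds to 56.4% over 55 cases, so the implied figure is unambiguous under the assumption that each case either compiles or does not.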

Paper Structure

This paper contains 56 sections, 5 equations, 4 figures, 5 tables.

Figures (4)

  • Figure 1: LLM-assisted co-evolution of Linux kernel and drivers. Kernel evolution introduces API/ABI, semantic, and security changes that create a mismatch zone (e.g., breakages, regressions, and maintenance burden) between the mainline kernel and dependent drivers. DRIVEBENCH constructs an executable corpus from kernel updates and observed breakage episodes; AUTODRIVER consumes this corpus to run an LLM-driven adaptation and validation loop, returning validated patches / adapted drivers and thereby closing the gap.
  • Figure 2: Distribution of code churn across kernel subsystems (Linux v5.10–v6.10). Inner ring shows the share of commits per subsystem. Outer ring shows the breakdown by number of files modified per commit. Colors: blue — drivers, green — arch, red — include, orange — fs.
  • Figure 3: Conceptual architecture of DriveBench, aligning the semantic, analytical, and systemic layers into a unified causal data system.
  • Figure 4: Overview of the Multi-Agent LLM System for Kernel–Driver Adaptation. The framework orchestrates four cooperative agents—Prompt Engineering, Coding, Patch Fix, and Static Analysis—within a closed-loop refinement cycle. Starting from a Driver Case Pack (JSON) containing driver code, kernel metadata, and taxonomy labels, the agents iteratively synthesize, verify, and repair driver patches under the Iterative Error Correction and Self-Learning loop. Upon passing static validation, the pipeline advances through Docker Compilation, Linux QEMU Testing, and Runtime Validation, which collectively ensure functional and security correctness under execution. The resulting patch and runtime report constitute a fully validated output, ready for inclusion in DriveBench.
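
The closed loop described in the Figure 4 caption can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the `CasePack` fields, the `Agents` container, the stage names, the `max_rounds` limit, and the toy `old_api`/`new_api` symbols are all hypothetical, and the agents are plain stub functions standing in for LLM calls, Docker builds, and QEMU runs.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Tuple


@dataclass
class CasePack:
    """Hypothetical stand-in for the Driver Case Pack (JSON)."""
    driver_code: str
    kernel_metadata: Dict[str, str]
    taxonomy_labels: List[str]


@dataclass
class Check:
    ok: bool
    log: str = ""


@dataclass
class Agents:
    prompt: Callable[[CasePack], str]     # Prompt Engineering agent
    code: Callable[[str], str]            # Coding agent
    static_check: Callable[[str], Check]  # Static Analysis agent
    fix: Callable[[str, str], str]        # Patch Fix agent


Stage = Tuple[str, Callable[[str], Check]]


def refine(case: CasePack, agents: Agents, max_rounds: int = 3) -> Optional[str]:
    """Synthesize a patch, then repair it until static analysis passes."""
    patch = agents.code(agents.prompt(case))
    for attempt in range(max_rounds + 1):
        report = agents.static_check(patch)
        if report.ok:
            return patch
        if attempt < max_rounds:
            # Feed the analyzer log back to the repair agent.
            patch = agents.fix(patch, report.log)
    return None


def run_pipeline(case: CasePack, agents: Agents, stages: List[Stage],
                 max_rounds: int = 3) -> Dict[str, str]:
    """Run refinement, then the ordered validation stages; stop at first failure."""
    patch = refine(case, agents, max_rounds)
    if patch is None:
        return {"status": "static_failed", "patch": ""}
    for name, stage in stages:
        if not stage(patch).ok:
            return {"status": f"{name}_failed", "patch": patch}
    return {"status": "validated", "patch": patch}


# Toy stubs (assumptions): a "removed" symbol old_api and its replacement new_api.
def toy_static_check(patch: str) -> Check:
    if "old_api(" in patch:
        return Check(False, "call to removed symbol old_api")
    return Check(True)


toy_agents = Agents(
    prompt=lambda c: f"labels={c.taxonomy_labels}\n{c.driver_code}",
    code=lambda prompt: prompt.split("\n", 1)[1],  # first draft: unchanged driver code
    static_check=toy_static_check,
    fix=lambda patch, log: patch.replace("old_api(", "new_api("),
)

# Stand-ins for Docker compilation, QEMU boot testing, and runtime validation.
toy_stages: List[Stage] = [
    ("compile", lambda p: Check(True)),
    ("qemu_boot", lambda p: Check(True)),
    ("runtime", lambda p: Check(True)),
]

case = CasePack("ret = old_api(dev);", {"from": "v5.10", "to": "v6.10"}, ["api_change"])
result = run_pipeline(case, toy_agents, toy_stages)
print(result["status"], "|", result["patch"])
```

Keeping static analysis inside the repair loop and running the build and emulation stages only afterwards mirrors the ordering in the caption: cheap checks gate the expensive ones, and each failure is reported with the stage at which it occurred.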