RippleGUItester: Change-Aware Exploratory Testing

Yanqi Su, Michael Pradel, Chunyang Chen

TL;DR

RippleGUItester is a change-driven testing system that treats a code change as the epicenter of a ripple effect, explores its broader, user-visible impacts via the GUI, and employs multimodal bug detection to distinguish unintended bugs from intended behavioral updates.

Abstract

Software systems evolve continuously through frequent code changes, yet such changes often introduce unintended bugs despite extensive testing and code review. Existing testing approaches are largely constrained to predefined execution paths or rely on unguided exploration, leaving many change-induced issues undetected. To address this challenge, we present RippleGUItester, a change-driven testing system that treats a code change as the epicenter of a ripple effect and explores its broader, user-visible impacts via the GUI. Given a code change, RippleGUItester performs LLM-based change-impact analysis to generate and enrich realistic test scenarios, executes these scenarios on both pre-change and post-change versions of the system, and applies differential analysis to identify behavioral differences. Crucially, RippleGUItester employs multimodal bug detection, comparing visual GUI changes and interpreting them in the context of natural-language change intents to distinguish unintended bugs from intended behavioral updates. We evaluate our approach on hundreds of real-world code changes across four widely used software systems: Firefox, Zettlr, JabRef, and Godot. Our results show that the proposed approach uncovers bugs introduced by code changes that were missed by existing test suites, CI pipelines, and code review. In total, we identify 26 previously unknown bugs that still exist in the latest versions of the evaluated systems. After reporting, 16 bugs have been fixed, 2 have been confirmed, 6 are still under discussion, and 2 were marked as intended. We envision RippleGUItester being applied before or shortly after a code change is merged, enabling earlier detection of regressions.
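
The differential step described in the abstract (run the same scenario on the pre-change and post-change versions, then separate intended updates from candidate bugs) can be illustrated with a minimal sketch. This is not the authors' implementation: the `Observation` and `Difference` types, the text summaries of GUI state, and the keyword-based judge are all hypothetical stand-ins, with the keyword check taking the place of the LLM-based multimodal comparison against the change intent.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence, Tuple


@dataclass(frozen=True)
class Observation:
    """GUI state seen after one scenario step (hypothetical text summary)."""
    step: str
    gui_state: str


@dataclass(frozen=True)
class Difference:
    """A step whose GUI state differs between the two versions."""
    step: str
    before: str
    after: str


def diff_runs(pre: List[Observation], post: List[Observation]) -> List[Difference]:
    """Differential analysis: keep only steps whose GUI state changed."""
    if [o.step for o in pre] != [o.step for o in post]:
        raise ValueError("both runs must execute the same scenario steps")
    return [
        Difference(a.step, a.gui_state, b.gui_state)
        for a, b in zip(pre, post)
        if a.gui_state != b.gui_state
    ]


def make_keyword_judge(intent_keywords: Sequence[str]) -> Callable[[Difference], bool]:
    """Assumed stand-in for the multimodal judge: a difference counts as
    intended if it newly shows a keyword taken from the change intent."""
    def is_intended(d: Difference) -> bool:
        return any(k in d.after and k not in d.before for k in intent_keywords)
    return is_intended


def triage(
    diffs: List[Difference], is_intended: Callable[[Difference], bool]
) -> Tuple[List[Difference], List[Difference]]:
    """Split differences into intended updates and candidate bugs."""
    intended = [d for d in diffs if is_intended(d)]
    suspicious = [d for d in diffs if not is_intended(d)]
    return intended, suspicious


# Made-up example data: the change intent is "apply dark theme to dialogs".
pre_run = [
    Observation("open dialog", "dialog visible"),
    Observation("press Save", "file saved; toolbar visible"),
]
post_run = [
    Observation("open dialog", "dialog visible, dark theme"),
    Observation("press Save", "file saved; toolbar missing"),
]

intended, suspicious = triage(
    diff_runs(pre_run, post_run), make_keyword_judge(["dark theme"])
)
for d in suspicious:
    print(f"candidate bug at step '{d.step}': {d.before!r} -> {d.after!r}")
```

In this toy run the dark-theme difference matches the stated intent, while the missing toolbar does not and is surfaced as a candidate bug. The sketch only shows why a behavioral difference alone is not treated as a bug; the actual system interprets visual GUI changes rather than text summaries.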

Paper Structure (36 sections, 9 figures, 1 table)

Figures (9)

  • Figure 1: Fixing https://bugzilla.mozilla.org/show_bug.cgi?id=1858633 introduces https://bugzilla.mozilla.org/show_bug.cgi?id=1937085. Only the final event is shown.
  • Figure 2: Approach Overview. Icons denote types: LLM-based (robot), algorithmic (gear), and hybrid (handshake).
  • Figure 3: Input Description in Prompt
  • Figure 4: Test Scenario Generation and Enrichment for the PR fixing https://bugzilla.mozilla.org/show_bug.cgi?id=1858633.
  • Figure 5: Test Scenario Execution for the PR fixing https://bugzilla.mozilla.org/show_bug.cgi?id=1858633.
  • ...and 4 more figures