Testing Updated Apps by Adapting Learned Models

Chanh-Duc Ngo; Fabrizio Pastore; Lionel Briand

TL;DR

This work tackles the inefficiency of regression testing for frequently updated mobile Apps by reusing models learned while testing prior App versions. The authors introduce CALM, which incrementally adapts an App model across versions by combining static and dynamic analysis, RCVDiff-based detection of GUI differences, and runtime DSTG adaptation; it further relies on layout guards, probabilistic action sequences, backward-equivalent state detection, and online and offline model refinement. Empirical results on 52 App versions show that CALM generally achieves higher coverage of updated methods and instructions and substantially reduces test oracle cost (fewer outputs to inspect) compared with ATUA and several other state-of-the-art tools, especially for small updates. The findings indicate CALM’s practical impact: faster and more reliable testing of updated features under a bounded manual verification burden, while still identifying functional faults more effectively than key baselines. Overall, CALM demonstrates that reusing and adapting learned models across App versions can significantly improve the efficiency and effectiveness of update testing for real-world Android Apps.

Abstract

Although App updates are frequent and software engineers would like to verify only the updated features, automated testing techniques verify entire Apps and thus waste resources. We present Continuous Adaptation of Learned Models (CALM), an automated App testing approach that efficiently tests App updates by adapting App models learned when automatically testing previous App versions. CALM focuses on functional testing. Since functional correctness can be mainly verified through the visual inspection of App screens, CALM minimizes the number of App screens to be visualized by software testers while maximizing the percentage of updated methods and instructions exercised. Our empirical evaluation shows that CALM exercises a significantly higher proportion of updated methods and instructions than six state-of-the-art approaches, for the same maximum number of App screens to be visually inspected. Further, in common update scenarios, where only a small fraction of methods are updated, CALM outperforms all competing approaches sooner and by a larger margin.

Paper Structure (34 sections, 10 equations, 18 figures, 7 tables)

Introduction
Background
Model-based App Testing with ATUA
RCVDiff
Proposed Approach: CALM
Step 1: Detect EWTG differences
Step 2: Generate an Updated App model
Step 3: Automated testing with runtime DSTG adaptation
Layout-guarded abstract transitions mitigating non-deterministic AbstractTransitions
Probabilistic Action sequences to handle state explosion
Backward-equivalent abstract states detection
Online App model refinement to deal with obsolete abstract states
Step 4: Refine the App model offline
Empirical Evaluation
Subjects of the study
...and 19 more sections

Figures (18)

Figure 1: App Model Metamodel. Colors are used to group classes belonging to a specific metamodel component: GSTG (orange, top), DSTG (green, middle), EWTG (light blue, bottom). Classes in red are specific to CALM.
Figure 2: CALM App testing process
Figure 3: Example of action output provided to the end-user
Figure 4: An example of RCVDiff Model of EWTGs belonging to two App versions.
Figure 5: Illustration of how the DSTG of the Base App model is adapted in the Updated App model according to the RCVDiff model in Figure 4
...and 13 more figures
