Testing Updated Apps by Adapting Learned Models
Chanh-Duc Ngo, Fabrizio Pastore, Lionel Briand
TL;DR
This work tackles the inefficiency of regression testing for frequently updated mobile Apps by reusing models learned from prior App versions. The authors introduce CALM, which incrementally adapts an App model across versions using static-dynamic analysis, RCVDiff-based GUI differences, and runtime DSTG adaptation, along with layout guards, probabilistic action sequences, backward-equivalent state detection, and online/offline refinement. Empirical results on 52 App versions show that CALM generally achieves higher coverage of updated methods and instructions and substantially reduces test oracle cost (fewer outputs to inspect) compared with ATUA and several state-of-the-art tools, especially for small updates. In practice, CALM enables faster, more reliable testing of updated features with a bounded manual verification burden, while still identifying functional faults more effectively than key baselines. Overall, CALM demonstrates that reusing and adapting learned models across App versions can significantly improve the efficiency and effectiveness of update testing in real-world Android Apps.
Abstract
Although App updates are frequent and software engineers would like to verify only the updated features, automated testing techniques verify entire Apps and thus waste resources. We present Continuous Adaptation of Learned Models (CALM), an automated App testing approach that efficiently tests App updates by adapting App models learned when automatically testing previous App versions. CALM focuses on functional testing. Since functional correctness can be mainly verified through the visual inspection of App screens, CALM minimizes the number of App screens to be visually inspected by software testers while maximizing the percentage of updated methods and instructions exercised. Our empirical evaluation shows that CALM exercises a significantly higher proportion of updated methods and instructions than six state-of-the-art approaches, for the same maximum number of App screens to be visually inspected. Further, in common update scenarios, where only a small fraction of methods are updated, CALM outperforms all competing approaches even sooner and by a larger margin.
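The comparison criterion in the abstract (proportion of updated methods exercised for a fixed maximum number of App screens to inspect) can be illustrated with a minimal sketch. This is not CALM's implementation: the function name `updated_coverage_at_budget`, the trace format, and the screen and method identifiers are all hypothetical, chosen only to show how such a budget-constrained coverage figure could be computed.

```python
from typing import Sequence, Set, Tuple


def updated_coverage_at_budget(
    trace: Sequence[Tuple[str, Set[str]]],
    updated_methods: Set[str],
    max_screens: int,
) -> float:
    """Fraction of updated methods exercised before the screen budget runs out.

    trace: ordered (screen_id, methods_exercised) pairs from a test run
           (hypothetical format, assumed for this illustration).
    updated_methods: methods changed in the new App version.
    max_screens: maximum number of distinct screens a tester will inspect.
    """
    if not updated_methods:
        return 0.0

    seen_screens: Set[str] = set()
    covered: Set[str] = set()
    for screen_id, methods in trace:
        if screen_id not in seen_screens:
            # A new screen would need inspection; stop once the budget is spent.
            if len(seen_screens) >= max_screens:
                break
            seen_screens.add(screen_id)
        # Only methods that were actually updated count toward coverage.
        covered |= methods & updated_methods
    return len(covered) / len(updated_methods)


# Hypothetical test run: four steps across three distinct screens.
example_trace = [
    ("login", {"a", "b"}),
    ("home", {"c"}),
    ("login", {"d"}),
    ("settings", {"e"}),
]
example_updated = {"a", "c", "e", "z"}
```

With a budget of two screens, the run above reaches `login` and `home` and exercises two of the four updated methods; raising the budget to three also reaches `settings`. Comparing tools at the same `max_screens` value mirrors the "same maximum number of App screens" condition stated in the abstract.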
