Observation-based unit test generation at Meta
Nadia Alshahwan, Mark Harman, Alexandru Marginean, Rotem Tal, Eddy Wang
TL;DR
TestGen tackles industrial-scale regression test generation by carving unit tests from real runtime observations. It introduces DASAD, a depth-aware serialization and pointer-aware memory management framework, integrated into a four-component pipeline (Instrumenter, ObservationLogger, TestGenerator, TestPublisher) to produce runnable, diff-friendly tests. So far, TestGen has landed 518 tests, which have been executed more than 9.6 million times in continuous integration and have found 5,702 faults. When carving observations from 4,361 reliable end-to-end tests, it generated tests for at least 86% of the classes covered by those end-to-end tests, and its tests would have caught 13 of 16 (81%) Kotlin Instagram app-launch-blocking tasks before they became launch blocking. The work demonstrates that observation-based test generation is practical at Meta scale, and more widespread deployment at Meta is in progress.
Abstract
TestGen automatically generates unit tests, carved from serialized observations of complex objects made during app execution. We describe the development and deployment of TestGen at Meta. In particular, we focus on the scalability challenges overcome during development in order to deploy observation-based test carving at scale in industry. So far, TestGen has landed 518 tests into production, which have been executed 9,617,349 times in continuous integration, finding 5,702 faults. Meta is currently in the process of more widespread deployment. Our evaluation reveals that, when carving its observations from 4,361 reliable end-to-end tests, TestGen was able to generate tests for at least 86% of the classes covered by end-to-end tests. Testing on 16 Kotlin Instagram app-launch-blocking tasks demonstrated that the TestGen tests would have trapped 13 of these before they became launch blocking.
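To make the idea of carving a unit test from a serialized observation concrete, the following is a minimal Python sketch. It is not TestGen's implementation and does not reproduce DASAD: the names `serialize`, `observe`, `carve_test`, `OBSERVATIONS`, and `format_price` are hypothetical, the depth bound is a simplified stand-in for depth-aware serialization, and pointer-aware memory management is not modelled at all.

```python
import json

ELIDED = "<elided>"  # hypothetical marker for values cut off by the depth bound


def serialize(obj, max_depth=2):
    """Depth-bounded serialization: containers nested deeper than max_depth are elided."""
    if obj is None or isinstance(obj, (bool, int, float, str)):
        return obj
    if max_depth == 0:
        return ELIDED
    if isinstance(obj, dict):
        return {str(k): serialize(v, max_depth - 1) for k, v in obj.items()}
    if isinstance(obj, (list, tuple)):
        return [serialize(v, max_depth - 1) for v in obj]
    # Fall back to the object's attribute dictionary for plain objects.
    return {k: serialize(v, max_depth - 1) for k, v in vars(obj).items()}


OBSERVATIONS = []  # stand-in for an observation log


def observe(fn):
    """Instrumentation stand-in: record arguments and result of every call."""
    def wrapper(*args):
        result = fn(*args)
        OBSERVATIONS.append({
            "function": fn.__name__,
            "args": serialize(list(args)),
            "result": serialize(result),
        })
        return result
    return wrapper


def carve_test(observation):
    """Generator stand-in: emit test source that replays one observation.

    Returns None when the observation was truncated, because an elided
    value cannot be replayed faithfully.
    """
    if ELIDED in json.dumps(observation):
        return None
    name = observation["function"]
    args = ", ".join(repr(a) for a in observation["args"])
    return (
        f"def test_{name}():\n"
        f"    assert {name}({args}) == {observation['result']!r}\n"
    )


@observe
def format_price(cents, currency):
    # Hypothetical function under test.
    return f"{currency} {cents / 100:.2f}"


format_price(1999, "USD")           # an "app execution" that gets observed
print(carve_test(OBSERVATIONS[0]))  # source of the carved regression test
```

The carved test pins the behaviour observed at runtime, so a later change that alters the output of `format_price` for the same inputs would make it fail. A production system additionally has to decide what to observe, how deep to serialize, and how to rebuild complex objects, which this sketch leaves out.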
