Observation-based unit test generation at Meta

Nadia Alshahwan, Mark Harman, Alexandru Marginean, Rotem Tal, Eddy Wang

TL;DR

TestGen tackles industrial-scale regression test generation by carving unit tests from real runtime observations. It introduces DASAD, a depth-aware serialization and pointer-aware memory management framework, integrated into a four-component pipeline (Instrumenter, ObservationLogger, TestGenerator, TestPublisher) to produce runnable, diff-friendly tests. So far, TestGen has landed 518 tests, which have been executed about 9.6 million times in continuous integration and have found 5,702 faults. When carving observations from end-to-end tests, it generated tests for at least 86% of the classes those end-to-end tests cover, and its tests would have caught 13 of 16 (81%) Kotlin Instagram app-launch-blocking tasks before they became launch blocking. The work demonstrates the practicality of observation-based test generation at Meta scale and informs wider deployment across Meta platforms.

Abstract

TestGen automatically generates unit tests, carved from serialized observations of complex objects, observed during app execution. We describe the development and deployment of TestGen at Meta. In particular, we focus on the scalability challenges overcome during development in order to deploy observation-based test carving at scale in industry. So far, TestGen has landed 518 tests into production, which have been executed 9,617,349 times in continuous integration, finding 5,702 faults. Meta is currently in the process of more widespread deployment. Our evaluation reveals that, when carving its observations from 4,361 reliable end-to-end tests, TestGen was able to generate tests for at least 86% of the classes covered by end-to-end tests. Testing on 16 Kotlin Instagram app-launch-blocking tasks demonstrated that the TestGen tests would have trapped 13 of these before they became launch blocking.

Paper Structure (18 sections, 3 figures, 3 tables, 2 algorithms)

Figures (3)

  • Figure 1: The Architecture of TestGen. TestGen instruments the app source code to record observations at runtime using the ObservationLogger. The TestGenerator uses the observations and the app source code to produce fully-runnable unit tests. Finally, the TestPublisher integrates with the build system and publishes a diff with the tests.
  • Figure 2: Kotlin Intermediate Representation (IR) of an example instrumentation. is an instance of the observation logger. is its method that logs the observations to the app DB. The grayed code is code that the instrumenter has added.
  • Figure 3: An example generated test class. Some commercially sensitive details have been elided. The test method 'test for $METHOD_UNDER_TEST_NAME $TEST_UUID' asserts that, when called with the previously observed parameters on the previously observed object state, the method returns the previously observed value (lines 26-34).
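
The observe-then-assert idea described in the Figure 1 and Figure 3 captions can be illustrated with a minimal Python sketch. This is not TestGen's implementation: the names `observe`, `generate_test`, `OBSERVATIONS`, and `apply_discount` are hypothetical, an in-memory list stands in for the app DB, JSON stands in for the serialization format, and the sketch records only function arguments and return values, not object state.

```python
import functools
import json

# Hypothetical stand-in for the store that holds logged observations.
OBSERVATIONS = []


def observe(func):
    """Hypothetical instrumentation: log each call's arguments and return value."""
    @functools.wraps(func)
    def wrapper(*args):
        result = func(*args)
        OBSERVATIONS.append({
            "function": func.__name__,
            "args": json.dumps(args),       # serialized observed parameters
            "returned": json.dumps(result),  # serialized observed return value
        })
        return result
    return wrapper


def generate_test(observation, index):
    """Turn one serialized observation into the source text of a regression test."""
    name = observation["function"]
    return (
        f"def test_{name}_{index}():\n"
        f"    args = json.loads({observation['args']!r})\n"
        f"    expected = json.loads({observation['returned']!r})\n"
        f"    assert {name}(*args) == expected\n"
    )


# Example function under test (an assumption made for illustration only).
@observe
def apply_discount(price, percent):
    return price - price * percent // 100


# 1. Run the instrumented code once to record an observation.
apply_discount(200, 10)

# 2. Carve a test from the recorded observation.
test_source = generate_test(OBSERVATIONS[0], 0)

# 3. Execute the generated test against the current code.
namespace = {"json": json, "apply_discount": apply_discount}
exec(test_source, namespace)
namespace["test_apply_discount_0"]()
```

The generated test passes as long as `apply_discount` keeps returning the value that was observed, so a later behavior change to the function would make it fail. That is the regression-trapping role the abstract describes, shown here on a toy function.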