Table of Contents
Fetching ...

Mica: Automated Differential Testing for OCaml Modules

Ernest Ng, Harrison Goldstein, Benjamin C. Pierce

TL;DR

Mica addresses the challenge of verifying observational equivalence between OCaml modules implementing the same signature by automating differential testing through a PPX extension that derives specialized QuickCheck-style code from minimal annotations. Its design centers on an ADT-based symbolic-expression representation, a type-directed random expression generator, an interpretation functor, and a TestHarness that compares two modules’ observable behavior, with support for mutable state and polymorphism. Empirically, Mica detects real bugs across real libraries and student submissions, and its integration with Tyche provides visual insight into testing coverage, highlighting practical impact. The work situates itself among differential testing and model-based approaches while offering automatic generation of well-typed symbolic expressions, and outlines clear paths for extending support to functors, broader abstractions, and alternative PBT backends.

Abstract

Suppose we are given two OCaml modules implementing the same signature. How do we check that they are observationally equivalent -- that is, that they behave the same on all inputs? One established technique is to use a property-based testing (PBT) tool such as QuickCheck. Currently, however, this can require significant amounts of boilerplate code and ad-hoc test harnesses. To address this issue, we present Mica, an automated tool for testing observational equivalence of OCaml modules. Mica is implemented as a PPX compiler extension, allowing users to supply minimal annotations to a module signature. These annotations guide Mica to automatically derive specialized PBT code that checks observational equivalence. We discuss the design of Mica and demonstrate its efficacy as a testing tool on various modules taken from real-world OCaml libraries.

Mica: Automated Differential Testing for OCaml Modules

TL;DR

Mica addresses the challenge of verifying observational equivalence between OCaml modules implementing the same signature by automating differential testing through a PPX extension that derives specialized QuickCheck-style code from minimal annotations. Its design centers on an ADT-based symbolic-expression representation, a type-directed random expression generator, an interpretation functor, and a TestHarness that compares two modules’ observable behavior, with support for mutable state and polymorphism. Empirically, Mica detects real bugs across real libraries and student submissions, and its integration with Tyche provides visual insight into testing coverage, highlighting practical impact. The work situates itself among differential testing and model-based approaches while offering automatic generation of well-typed symbolic expressions, and outlines clear paths for extending support to functors, broader abstractions, and alternative PBT backends.

Abstract

Suppose we are given two OCaml modules implementing the same signature. How do we check that they are observationally equivalent -- that is, that they behave the same on all inputs? One established technique is to use a property-based testing (PBT) tool such as QuickCheck. Currently, however, this can require significant amounts of boilerplate code and ad-hoc test harnesses. To address this issue, we present Mica, an automated tool for testing observational equivalence of OCaml modules. Mica is implemented as a PPX compiler extension, allowing users to supply minimal annotations to a module signature. These annotations guide Mica to automatically derive specialized PBT code that checks observational equivalence. We discuss the design of Mica and demonstrate its efficacy as a testing tool on various modules taken from real-world OCaml libraries.
Paper Structure (6 sections, 3 figures)

This paper contains 6 sections, 3 figures.

Figures (3)

  • Figure 1: Left: User code (note the annotation on signature S). Right: PBT code automatically derived by Mica.
  • Figure 2: The Tyche user interface, displaying Mica's test results
  • Figure 3: Average mean no. of trials required to provoke failure in an observational equivalence test