Mend the gap: A smart repair algorithm for noisy polygonal tilings
Jeanne N. Clelland
TL;DR
This work tackles the problem of recovering an unknown exact tiling $T^*=\bigl\{P^*_1,\ldots,P^*_N\bigr\}$ from a noisy polygonal tiling $T=\bigl\{P_1,\ldots,P_N\bigr\}$ by constructing a repaired tiling $\widetilde{T}=\bigl\{\widetilde{P}_1,\ldots,\widetilde{P}_N\bigr\}$ that closely matches $T^*$. The proposed smart_repair algorithm proceeds in four steps: refined tiling construction, overlap assignment, convexity-driven gap closing, and cleanup. Throughout, it emphasizes preservation of adjacency relations and convexity, with optional region-aware nesting and optional conversion of rook adjacencies to queen adjacencies. The paper introduces a set of geometric tools (outward convexity, strong mutual visibility, shortest-path subdivisions) to subdivide gaps and reassign pieces, and proves that the repaired tiling adheres to the desired adjacency principles. The method is implemented in the Maup Python package and is demonstrated on large state maps (e.g., Colorado, Wisconsin, New York), where it produces more faithful adjacencies than quick_repair and supports GIS analyses and redistricting workflows. The work thus provides a principled, geometry-driven repair framework that applies directly to geospatial data integrity and legal contiguity constraints.
Abstract
Let $T^* = \{P^*_1, \ldots, P^*_N\}$ be a polygonal tiling of a simply connected region in the plane, and let $T = \{P_1, \ldots, P_N\}$ be a noisy version of $T^*$ obtained by making small perturbations to the coordinates of the vertices of the polygons in $T^*$. In general, $T$ will only be an approximate tiling, due to the presence of gaps and overlaps between the perturbed polygons in $T$. The areas of these gaps and overlaps are typically small relative to the areas of the polygons themselves. Suppose that we are given the approximate tiling $T$ and we wish to recover the tiling $T^*$. To address this problem, we introduce a new algorithm, called {\tt smart\_repair}, to modify the polygons in $T$ to produce a tiling $\widetilde{T} = \{\widetilde{P}_1, \ldots, \widetilde{P}_N\}$ that closely approximates $T^*$, with special attention given to reproducing the {\em adjacency relations} between the polygons in $T^*$ as closely as possible. The motivation for this algorithm comes from computational redistricting, where algorithms are used to build districts from smaller geographic units. Because districts in most U.S. states are required to be contiguous, these algorithms are fundamentally based on adjacency relations between units. Unfortunately, the best available map data for unit boundaries is often noisy, containing gaps and overlaps between units that can lead to substantial inaccuracies in the adjacency relations. Simple repair algorithms can exacerbate these inaccuracies, with the result that algorithmically drawn districts based on the ``repaired'' units may be discontiguous, and hence not legally compliant. The algorithm presented here is specifically designed to avoid such problems. A Python implementation is publicly available as part of the MGGG Redistricting Lab's {\tt Maup} package at https://github.com/mggg/maup.
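To make concrete how small coordinate perturbations can corrupt adjacency relations, the following toy sketch compares an exact tiling with a noisy copy. It is not the {\tt smart\_repair} algorithm and does not use {\tt Maup}. It assumes, purely for illustration, that every tile is an axis-aligned rectangle and that two tiles are adjacent only when they share a boundary segment of positive length with disjoint interiors. The helper names (`overlap_area`, `rook_adjacent`, `adjacency`) are hypothetical.

```python
from itertools import combinations

# Illustrative assumption: each tile is an axis-aligned rectangle
# given as (xmin, ymin, xmax, ymax). Real map units are general polygons.


def _extent(a, b):
    """Signed width and height of the intersection box of rectangles a and b."""
    w = min(a[2], b[2]) - max(a[0], b[0])
    h = min(a[3], b[3]) - max(a[1], b[1])
    return w, h


def overlap_area(a, b):
    """Area of the overlap between two rectangles (0.0 if interiors are disjoint)."""
    w, h = _extent(a, b)
    return w * h if w > 0 and h > 0 else 0.0


def rook_adjacent(a, b):
    """True if a and b share a boundary segment of positive length and do not overlap."""
    w, h = _extent(a, b)
    return (w == 0 and h > 0) or (h == 0 and w > 0)


def adjacency(tiles):
    """Set of index pairs (i, j), i < j, of rook-adjacent tiles."""
    return {
        (i, j)
        for (i, a), (j, b) in combinations(enumerate(tiles), 2)
        if rook_adjacent(a, b)
    }


# Exact tiling T* of the rectangle [0, 2] x [0, 2]: two unit squares below a 2 x 1 strip.
exact = [(0, 0, 1, 1), (1, 0, 2, 1), (0, 1, 2, 2)]

# Noisy tiling T: tile 0 shrinks slightly (gap with tile 1),
# tile 1 grows slightly upward (overlap with tile 2).
noisy = [(0, 0, 0.998, 1), (1, 0, 2, 1.003), (0, 1, 2, 2)]

print("exact adjacencies:", sorted(adjacency(exact)))
print("noisy adjacencies:", sorted(adjacency(noisy)))
print("overlap between tiles 1 and 2:", overlap_area(noisy[1], noisy[2]))
```

In this toy case the exact tiling has three adjacent pairs, while the noisy copy retains only one: the gap removes the adjacency between tiles 0 and 1, and the overlap invalidates the one between tiles 1 and 2, even though both defects have area on the order of $10^{-3}$. An adjacency graph built from the noisy tiles would therefore misrepresent which units can be combined into a contiguous district.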
