An $O(n \log n)$-Time Approximation Scheme for Geometric Many-to-Many Matching
Sayan Bandyapadhyay, Jie Xue
TL;DR
This work establishes a near-linear time $(1+\varepsilon)$-approximation scheme for geometric many-to-many matching on colored point sets in fixed dimension, under any $L_p$ norm. The authors introduce a penalized formulation with per-point penalties $\phi(p)$ based on approximate nearest foreign neighbors, and develop two key reductions, built on grid shifting and partitioning, that break the problem into well-structured subproblems of bounded width. A central subroutine solves small, well-structured instances by modeling each one as a compact integer linear program and applying a fixed-parameter tractable ILP solver, which yields an overall $(1+\varepsilon)$-approximation in $O_\varepsilon(n \log n)$ time after pre-processing. The scheme combines Baker's shifting technique, grid techniques, approximate nearest neighbor search, and an ILP-FPT approach, and it generalizes to any fixed $L_p$ norm. The result is the first near-linear approximation scheme for geometric many-to-many matching in dimensions $d \ge 2$; open directions include extensions to related variants and improving the dependence on $\varepsilon$.
Abstract
Geometric matching is an important topic in computational geometry and has been studied extensively for decades. In this paper, we study a geometric-matching problem known as geometric many-to-many matching. In this problem, the input is a set $S$ of $n$ colored points in $\mathbb{R}^d$, which implicitly defines a graph $G = (S,E(S))$ where $E(S) = \{(p,q): p,q \in S \text{ have different colors}\}$, and the goal is to compute a minimum-cost subset $E^* \subseteq E(S)$ of edges that covers all points in $S$. Here the cost of $E^*$ is the sum of the costs of all edges in $E^*$, where the cost of a single edge $e$ is the Euclidean distance (or more generally, the $L_p$-distance) between the two endpoints of $e$. Our main result is a $(1+\varepsilon)$-approximation algorithm with an optimal running time $O_\varepsilon(n \log n)$ for geometric many-to-many matching in any fixed dimension, which works under any $L_p$-norm. This is the first near-linear approximation scheme for the problem in any $d \geq 2$. Prior to this work, only the bipartite case of geometric many-to-many matching was considered in $\mathbb{R}^1$ and $\mathbb{R}^2$, and the best known approximation scheme in $\mathbb{R}^2$ takes $O_\varepsilon(n^{1.5} \cdot \mathsf{poly}(\log n))$ time.
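
To make the problem definition concrete, the following is a minimal Python sketch, not the algorithm of this paper. It shows two quadratic-or-worse baselines: a cover in which every point picks its nearest foreign (differently colored) neighbor, and an exponential brute-force optimum for tiny inputs. The function names `lp_dist`, `nearest_foreign_cover`, and `brute_force_optimum` are hypothetical, and the sketch assumes the input contains at least two colors. Since each edge of an optimal cover can be charged by at most its two endpoints, the nearest-foreign-neighbor cover costs at most twice the optimum, which makes it a convenient sanity check.

```python
import math
from itertools import combinations


def lp_dist(a, b, p=2.0):
    """L_p distance between two points given as coordinate tuples."""
    if math.isinf(p):
        return max(abs(x - y) for x, y in zip(a, b))
    return sum(abs(x - y) ** p for x, y in zip(a, b)) ** (1.0 / p)


def nearest_foreign_cover(points, colors, p=2.0):
    """Baseline cover: each point is joined to its nearest foreign neighbor.

    Assumes at least two distinct colors. Runs in O(n^2) time; this is only
    an illustration of the problem, not the paper's near-linear scheme.
    """
    n = len(points)
    edges = set()
    for i in range(n):
        foreign = [j for j in range(n) if colors[j] != colors[i]]
        j = min(foreign, key=lambda k: lp_dist(points[i], points[k], p))
        edges.add((min(i, j), max(i, j)))  # store undirected edges once
    cost = sum(lp_dist(points[i], points[j], p) for i, j in edges)
    return edges, cost


def brute_force_optimum(points, colors, p=2.0):
    """Exact optimum by enumerating every edge subset (tiny inputs only)."""
    n = len(points)
    cand = [(i, j) for i, j in combinations(range(n), 2)
            if colors[i] != colors[j]]
    best = math.inf
    for mask in range(1, 1 << len(cand)):
        chosen = [cand[k] for k in range(len(cand)) if (mask >> k) & 1]
        covered = {v for e in chosen for v in e}
        if len(covered) == n:  # every point has an incident chosen edge
            cost = sum(lp_dist(points[i], points[j], p) for i, j in chosen)
            best = min(best, cost)
    return best


if __name__ == "__main__":
    pts = [(0.0, 0.0), (1.0, 0.0), (3.0, 0.0), (4.0, 0.0)]
    cols = ["red", "blue", "red", "blue"]
    print(nearest_foreign_cover(pts, cols))
    print(brute_force_optimum(pts, cols))
```

On inputs small enough for the brute-force routine, comparing the two costs checks that the baseline stays between the optimum and twice the optimum; the `p` parameter switches between $L_p$ norms, with `math.inf` selecting $L_\infty$.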
