On the representation and learning of monotone triangular transport maps
Ricardo Baptista, Youssef Marzouk, Olivier Zahm
TL;DR
This work develops a rectification-based, semi-parametric framework for representing and learning monotone triangular transport maps, rooted in Knothe–Rosenblatt rearrangements. By transforming non-monotone function components via a bijective rectification operator, the authors convert the original constrained learning problem into an unconstrained one with a differentiable objective and, under suitable tail conditions, establish that local minima are global and the KR map is the unique global minimizer. They propose an adaptive algorithm (ATM) that builds sparse, interpretable map representations using polynomial or wavelet bases and cross-validation to adapt model complexity to data size, enabling effective density estimation, conditional density estimation, and structure learning of DAGs. The method demonstrates strong empirical performance across one- and two-dimensional targets, stochastic volatility, and tabular datasets, while uncovering conditional independence and sparsity patterns in learned maps. Overall, the rectification framework provides a principled, tractable path to learning informative, sparse transport maps with theoretical guarantees and practical scalability for complex distributions.
Abstract
Transportation of measure provides a versatile approach for modeling complex probability distributions, with applications in density estimation, Bayesian inference, generative modeling, and beyond. Monotone triangular transport maps, which are approximations of the Knothe–Rosenblatt (KR) rearrangement, are a canonical choice for these tasks. Yet the representation and parameterization of such maps have a significant impact on their generality and expressiveness, and on properties of the optimization problem that arises in learning a map from data (e.g., via maximum likelihood estimation). We present a general framework for representing monotone triangular maps via invertible transformations of smooth functions. We establish conditions on the transformation such that the associated infinite-dimensional minimization problem has no spurious local minima, i.e., all local minima are global minima; and we show for target distributions satisfying certain tail conditions that the unique global minimizer corresponds to the KR map. Given a sample from the target, we then propose an adaptive algorithm that estimates a sparse semi-parametric approximation of the underlying KR map. We demonstrate how this framework can be applied to joint and conditional density estimation, likelihood-free inference, and structure learning of directed graphical models, with stable generalization performance across a range of sample sizes.
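To make the idea of obtaining a monotone map component from an unconstrained smooth function concrete, the following is a minimal sketch and not the authors' implementation. It assumes one plausible form of rectification that is not spelled out in this abstract: the component is defined as f(x_prev, 0) plus the integral from 0 to x_last of a positive function (here softplus) applied to the partial derivative of f in its last argument. The names `rectify`, `f`, and `df_dx2`, the example function, and the choice of 32-point Gauss-Legendre quadrature are all hypothetical.

```python
import numpy as np


def softplus(z):
    # Numerically stable log(1 + exp(z)); strictly positive for every real z.
    return np.logaddexp(0.0, z)


# Gauss-Legendre nodes and weights on [-1, 1], reused for every integral.
_NODES, _WEIGHTS = np.polynomial.legendre.leggauss(32)


def rectify(f, df_dlast, x_prev, x_last):
    """Evaluate f(x_prev, 0) + int_0^{x_last} softplus(df_dlast(x_prev, t)) dt.

    The integrand is positive, so the result is strictly increasing in x_last
    even when f itself is not monotone (assumed form, see lead-in).
    """
    # Map the quadrature nodes from [-1, 1] to the interval between 0 and x_last.
    t = 0.5 * x_last * (_NODES + 1.0)
    integral = 0.5 * x_last * np.sum(_WEIGHTS * softplus(df_dlast(x_prev, t)))
    return f(x_prev, 0.0) + integral


# Hypothetical non-monotone function of (x1, x2) and its derivative in x2.
def f(x1, x2):
    return np.sin(3.0 * x2) + x1 * x2


def df_dx2(x1, x2):
    return 3.0 * np.cos(3.0 * x2) + x1


grid = np.linspace(-2.0, 2.0, 41)
raw_values = np.array([f(0.5, x2) for x2 in grid])
rectified_values = np.array([rectify(f, df_dx2, 0.5, x2) for x2 in grid])

print("f monotone in x2:        ", bool(np.all(np.diff(raw_values) > 0)))
print("rectified monotone in x2:", bool(np.all(np.diff(rectified_values) > 0)))
```

In this sketch the function f can be parameterized freely (for instance by polynomial or wavelet expansions) because monotonicity is enforced by the construction rather than by constraints on the coefficients, which is the sense in which the learning problem becomes unconstrained.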
