On the Information Processing of One-Dimensional Wasserstein Distances with Finite Samples

Cheongjae Jang; Jonghyun Won; Soyeon Jun; Chun Kee Chung; Keehyoung Joo; Yung-Kyun Noh

TL;DR

The paper analyzes how the one-dimensional Wasserstein distance $W_1$ between finite samples encodes pointwise density differences (rates) and support changes. Using Poisson processes, it derives analytic expressions for expected spike distances that reveal rate-difference encoding and the integration of rate and shift information, with asymptotic behavior clarified as sample size grows. The authors validate these insights through synthetic data and real-world neural spike-train and amino-acid contact datasets, showing that Wasserstein-based features improve classification and representation tasks and offer complementary perspectives to KL-based measures. Overall, the work provides a rigorous finite-sample interpretation of $W_1$ as a mixture of rate and support information, with practical implications for neuroscience and molecular biology and potential extensions to sliced Wasserstein distances.

Abstract

The Wasserstein distance, a summation of sample-wise transport distances in data space, is advantageous in many applications for measuring support differences between two underlying density functions. However, when the supports overlap significantly while the densities exhibit substantial pointwise differences, it remains unclear whether and how this transport information can accurately identify those differences, and in particular how it can be characterized analytically in finite-sample settings. We address this issue by analyzing the information processing capabilities of the one-dimensional Wasserstein distance with finite samples. By using the Poisson process and isolating the rate factor, we demonstrate that Wasserstein distances capture the pointwise density difference and show how this information harmonizes with support differences. The analyzed properties are confirmed using neural spike train decoding and amino acid contact frequency data. The results reveal that the one-dimensional Wasserstein distance highlights meaningful density differences related to both rate and support.
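To make the quantity under study concrete, the sketch below computes the standard empirical one-dimensional $W_1$ between two finite samples as the integral of the absolute difference between their empirical CDFs, $W_1 = \int |F_X(t) - F_Y(t)|\,dt$. For equal sample sizes this reduces to the mean gap between sorted samples. It is an illustration only: the function names `w1_empirical` and `poisson_spikes`, the rates, and the window length are assumptions chosen here, and the normalization may differ from the spike distance analyzed in the paper.

```python
import random


def w1_empirical(xs, ys):
    """Empirical 1D W1: integral of |F_X(t) - F_Y(t)| over t.

    Each sample is treated as a uniform distribution over its points
    (a normalization assumed for this sketch).
    """
    if not xs or not ys:
        raise ValueError("both samples must be non-empty")
    xs, ys = sorted(xs), sorted(ys)
    points = sorted(set(xs) | set(ys))
    total = 0.0
    i = j = 0  # number of points of xs / ys that are <= the current left edge
    for left, right in zip(points, points[1:]):
        while i < len(xs) and xs[i] <= left:
            i += 1
        while j < len(ys) and ys[j] <= left:
            j += 1
        # Both empirical CDFs are constant on [left, right).
        total += abs(i / len(xs) - j / len(ys)) * (right - left)
    return total


def poisson_spikes(rate, duration, rng):
    """Spike times of a homogeneous Poisson process on [0, duration)."""
    t, spikes = 0.0, []
    while True:
        t += rng.expovariate(rate)  # exponential inter-spike intervals
        if t >= duration:
            return spikes
        spikes.append(t)


if __name__ == "__main__":
    rng = random.Random(0)
    # Same support [0, 1), different rates: only the rate differs.
    slow = poisson_spikes(rate=20.0, duration=1.0, rng=rng)
    fast = poisson_spikes(rate=60.0, duration=1.0, rng=rng)
    if slow and fast:
        print(len(slow), len(fast), w1_empirical(slow, fast))
```

Because this version normalizes each sample to unit mass, two homogeneous spike trains on the same window differ only through finite-sample fluctuations. That is one way to see why the abstract isolates the rate factor explicitly when asking how rate information enters the distance.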
