Subsampling, aligning, and averaging to find circular coordinates in recurrent time series
Andrew J. Blumberg, Mathieu Carrière, Jun Hou Fung, Michael A. Mandell
TL;DR
The paper introduces a three-stage pipeline for extracting robust circular coordinates from recurrent time-series data: (1) density uniformization via rejection sampling to mitigate uneven sampling, (2) persistent cohomology to obtain circular coordinates on subsamples, and (3) alignment and averaging of the subsample coordinates via Procrustes analysis to produce a global coordinate. The resulting coordinates are more robust to noise and outliers, and the pipeline is substantially more efficient than applying persistent cohomology to the full dataset. Validation on synthetic data and C. elegans neuronal recordings reveals a topological model of worm brain-state trajectories in which loops map to interpretable behavioral states, and a mutual-information comparison indicates that the corrected coordinates are more informative. The method thus enables scalable, unsupervised discovery of circular, recurrent structure in high-dimensional neural time series and can be extended to broader dynamical-system contexts.
Abstract
We introduce a new algorithm for finding robust circular coordinates on data that is expected to exhibit recurrence, such as neuronal recordings of C. elegans. Techniques exist to create circular coordinates on a simplicial complex from a dimension 1 cohomology class, and these can be applied to the Rips complex of a dataset when it has a prominent class in its dimension 1 cohomology. However, this approach is known to be extremely sensitive to uneven sampling density. Our algorithm includes a new method to correct for uneven sampling density, adapting our prior work on averaging coordinates in manifold learning. We use rejection sampling to correct for inhomogeneous sampling and then apply Procrustes matching to align and average the circular coordinates computed on the subsamples. In addition to providing a more robust coordinate than other approaches, this subsampling and averaging approach is more efficient. We validate our technique on both synthetic data sets and neuronal activity recordings. Our results reveal a topological model of neuronal trajectories for C. elegans that is constructed from loops in which different regions of the brain state space can be mapped to specific and interpretable macroscopic behaviors in the worm.
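The two steps that surround the cohomology computation can be illustrated with a minimal NumPy sketch. It is not the authors' implementation, and every name in it is hypothetical. It assumes a k-nearest-neighbor density estimate for the rejection step (the `k` and `intrinsic_dim` parameters are illustrative choices), and it assumes each subsample's circular coordinate has already been evaluated, as angles in radians, on a common set of points. Under that assumption, Procrustes matching on the circle reduces to choosing a rotation and an optional reflection. The persistent cohomology step itself is omitted.

```python
import numpy as np

TWO_PI = 2.0 * np.pi

def uniformize_by_rejection(points, k=10, intrinsic_dim=1, seed=None):
    """Return indices of points kept after density-based rejection sampling.

    Assumption: density is estimated as 1 / r_k**intrinsic_dim, where r_k is
    the distance to the k-th nearest neighbor. Points in dense regions are
    kept with lower probability, so the kept subsample is closer to uniform.
    """
    rng = np.random.default_rng(seed)
    # Dense pairwise distance matrix: O(n^2) memory, fine for a small sketch.
    diffs = points[:, None, :] - points[None, :, :]
    dists = np.sqrt((diffs ** 2).sum(axis=-1))
    # Column 0 of each sorted row is the point itself (distance 0).
    r_k = np.sort(dists, axis=1)[:, k]
    # Acceptance probability = (minimum density) / (density at the point).
    accept_prob = (r_k / r_k.max()) ** intrinsic_dim
    keep = rng.random(len(points)) < accept_prob
    return np.flatnonzero(keep)

def align_circular(theta_ref, theta):
    """Align theta to theta_ref by the best rotation and optional reflection.

    For unit-circle points this is the orthogonal Procrustes problem in O(2):
    for each orientation s in {+1, -1}, the optimal rotation is the argument
    of mean(exp(i * (theta_ref - s * theta))), and the modulus of that mean
    scores the fit.
    """
    best = None
    for s in (1.0, -1.0):
        z = np.mean(np.exp(1j * (theta_ref - s * theta)))
        if best is None or abs(z) > best[0]:
            best = (abs(z), s, np.angle(z))
    _, s, phi = best
    return np.mod(s * theta + phi, TWO_PI)

def circular_average(aligned_thetas):
    """Pointwise circular mean of a list of aligned angle arrays."""
    stacked = np.exp(1j * np.vstack(aligned_thetas))
    return np.mod(np.angle(stacked.mean(axis=0)), TWO_PI)
```

A typical use would draw several subsamples with `uniformize_by_rejection`, compute a circular coordinate on each with a persistent cohomology tool, align each result to a reference with `align_circular`, and combine them with `circular_average`. Because a circular coordinate is only defined up to rotation and orientation, the averaging step needs the circular mean rather than an ordinary arithmetic mean of angles.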
