Data-driven assessment of optimal spatiotemporal resolutions for information extraction in noisy time series data
Domiziano Doria, Simone Martino, Matteo Becchi, Giovanni M. Pavan
TL;DR
An unsupervised approach that learns, starting solely from the data of a system, the characteristic length scales of its dominant key events/processes and the optimal spatiotemporal resolutions to characterize them. The optimal resolution proves to be related to the characteristic spatiotemporal length scales of the local/collective physical events dominating the system.
Abstract
In general, the comprehension of any type of complex system depends on the resolution used to examine the phenomena occurring within it. However, identifying a priori, for example, the best time frequencies/scales at which to study a certain system over time, or the spatial distances at which to look for correlations, symmetries, and fluctuations, is most often non-trivial. Here we describe an unsupervised approach that, starting solely from the data of a system, allows learning the characteristic length scales of the dominant key events/processes and the optimal spatiotemporal resolutions to characterize them. We tested this approach on time series data obtained from simulation or experimental trajectories of various example many-body complex systems, ranging from the atomic to the macroscopic scale and having diverse internal dynamic complexities. Our method automatically analyzes the system's data by evaluating correlations at all relevant inter-particle distances and at all possible inter-frame intervals into which the time series can be subdivided, namely, at all space and time resolutions. The optimal spatiotemporal resolution for studying a certain system is thus the one that maximizes information extraction and classification from the system's data, and we prove it to be related to the characteristic spatiotemporal length scales of the local/collective physical events dominating the system. This approach is broadly applicable and can be used to optimize the study of different types of data (static distributions, time series, or signals). The concept of 'optimal resolution' has a general character and provides a robust basis for characterizing any type of system based on its data, as well as for guiding data analysis in general.
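The idea of scanning time resolutions can be illustrated with a toy sketch. It is not the method of the paper: the synthetic two-state signal, the non-overlapping window averaging, the 1D two-means split, and the `separation_score` criterion are all assumptions chosen for illustration only. The sketch averages a noisy signal over windows of different lengths and asks at which window length the two underlying states are best separated.

```python
import numpy as np

def make_signal(n_frames=40000, dwell=200, noise=1.5, seed=0):
    """Hypothetical toy data: a signal switching between levels 0 and 1
    every `dwell` frames, buried in Gaussian noise."""
    rng = np.random.default_rng(seed)
    states = (np.arange(n_frames) // dwell) % 2
    return states + rng.normal(0.0, noise, n_frames)

def window_means(signal, w):
    """Average the signal over consecutive non-overlapping windows of w frames."""
    n = len(signal) // w
    return signal[: n * w].reshape(n, w).mean(axis=1)

def separation_score(values, n_iter=100):
    """Illustrative criterion (an assumption, not the paper's): split the
    values into two groups with 1D two-means, then return the distance
    between the group means divided by the pooled within-group spread."""
    lo, hi = values.min(), values.max()
    if lo == hi:
        return 0.0
    for _ in range(n_iter):
        upper = np.abs(values - hi) < np.abs(values - lo)
        new_lo, new_hi = values[~upper].mean(), values[upper].mean()
        if new_lo == lo and new_hi == hi:
            break
        lo, hi = new_lo, new_hi
    resid = np.where(upper, values - hi, values - lo)
    return (hi - lo) / resid.std()

signal = make_signal()
candidate_windows = (1, 5, 20, 50, 100, 200, 400)

# Scan all candidate time resolutions and score each one.
scores = {w: separation_score(window_means(signal, w)) for w in candidate_windows}
best_w = max(scores, key=scores.get)

for w in candidate_windows:
    print(f"window = {w:4d} frames   score = {scores[w]:.2f}")
print("best window:", best_w)
```

In this toy setting, very short windows leave the two states hidden in the noise, while windows longer than the dwell time average the two states together, so the score is expected to peak at an intermediate resolution tied to the time scale of the underlying switching events.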
