Unsupervised learning of anomalous diffusion data

Gorka Muñoz-Gil, Guillem Guigó i Corominas, Maciej Lewenstein

TL;DR

This work tackles the challenge of characterizing anomalous diffusion from short, noisy single trajectories by proposing an unsupervised framework based on convolutional autoencoders. By using reconstruction error as an anomaly score, the method can discriminate between diffusion models, infer the anomalous exponent $\alpha$, and identify composite diffusion dynamics without labelled data. The approach is validated on simulated trajectories across several diffusion models and extended to experimental single-particle trajectories, including very short ones, demonstrating robust, label-free insights where supervised methods struggle. This data-driven, model-agnostic framework offers a practical path to uncover diffusion mechanisms in complex systems and can guide the development of more accurate physical models.

Abstract

The characterization of diffusion processes is a keystone in our understanding of a variety of physical phenomena. Many of these deviate from Brownian motion, giving rise to anomalous diffusion. Various theoretical models exist nowadays to describe such processes, but their application to experimental setups is often challenging, due to the stochastic nature of the phenomena and the difficulty of obtaining reliable data. The latter often consist of short and noisy trajectories, which are hard to characterize with the usual statistical approaches. In recent years, we have witnessed an impressive effort to bridge theory and experiments by means of supervised machine learning techniques, with astonishing results. In this work, we explore the use of unsupervised methods on anomalous diffusion data. We show that the main diffusion characteristics can be learnt without the need for any labelling of the data. We use this method to discriminate between anomalous diffusion models and extract their physical parameters. Moreover, we explore the feasibility of finding novel types of diffusion, in this case represented by compositions of existing diffusion models. Finally, we showcase the use of the method on experimental data and demonstrate its advantages for cases where supervised learning is not applicable.
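
For reference, the two quantities this summary relies on can be written in standard notation. This is a minimal sketch, not a reproduction of the paper's own equations: the symbols $K_\alpha$, $x_t$, $\hat{x}_t$ and $T$ are assumptions introduced here, and the paper's normalization of the error may differ. The anomalous exponent $\alpha$ is defined through the scaling of the mean squared displacement,

$$\langle x^2(t) \rangle \simeq K_\alpha \, t^{\alpha},$$

where $\alpha = 1$ corresponds to normal (Brownian) diffusion, $\alpha < 1$ to subdiffusion and $\alpha > 1$ to superdiffusion. For an autoencoder that maps an input trajectory $x_1, \dots, x_T$ to a reconstruction $\hat{x}_1, \dots, \hat{x}_T$, the reconstruction error used as an anomaly score can be taken as the mean squared error

$$\mathrm{MSE} = \frac{1}{T} \sum_{t=1}^{T} \left( x_t - \hat{x}_t \right)^2 .$$

A trajectory resembling the training data is expected to give a low MSE, while a trajectory from a different diffusion model or exponent is expected to give a higher one.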

Paper Structure

This paper contains 12 sections, 3 equations, 7 figures, 1 table.

Figures (7)

  • Figure 1: (a) Representative two-dimensional trajectories of the diffusion models considered in this work. All trajectories have anomalous exponent $\alpha = 0.7$. (b) Schematic representation of the autoencoders used, showcasing some of the layers composing them, as well as the skip connections. (c) Input and output of an autoencoder trained with FBM at $\alpha=0.5$ (highlighted in gray), predicting trajectories of ATTM, FBM and SBM with $\alpha=0.5$, respectively. (d) Input and output of the same autoencoder, now with trajectories of FBM at three different exponents, $\alpha = [0.3, 0.5, 0.9]$. In both (c) and (d), the best reconstruction (lowest MSE) occurs for trajectories of the model/exponent with which the autoencoder was trained.
  • Figure 2: (a) MSE vs. the diffusion model of the input trajectories. Each line corresponds to an autoencoder (AE) trained with the indicated model. (b-e) MSE vs. anomalous exponent $\alpha$. Each panel corresponds to an AE trained with the indicated diffusion model, while the color of the lines indicates the model of the input trajectories, following the legend of (a).
  • Figure 3: MSE vs. anomalous exponent $\alpha$ for AEs trained with FBM (left) or SBM (right) trajectories in different ranges of $\alpha$ (as indicated by the shaded regions): (a-b) $\alpha \in [0.1, 2)$; (c-d) $\alpha \in [0.1, 1]$; (e-f) $\alpha \in [1, 2)$.
  • Figure 4: (a-c) Fitted anomalous exponent $\alpha_f$ of the reconstructed trajectories vs. the ground-truth exponent $\alpha$ of the input trajectory. The gray line corresponds to $\alpha_f = \alpha$. (d-f) MSE vs. anomalous exponent $\alpha$. For all plots, we consider AEs trained with FBM (blue) or ATTM (orange). Each row corresponds to AEs trained at a different $\alpha$, indicated in the title and by the vertical dashed line.
  • Figure 5: (a) Schematic representation of the diffusion models governing a trajectory with given $\gamma$ and $\beta$, as defined by Eq. \ref{eq:composite}. (c-d) MSE as a function of the trajectories' $\gamma$ and $\beta$ for AEs trained with CTRW, ATTM, and SBM trajectories in the subdiffusive range $\alpha\in[0.1,1]$ and $T=20$. Each point averages the results of 10 AEs over 5000 trajectories.
  • ...and 2 more figures
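
The caption of Figure 4 refers to a fitted exponent $\alpha_f$ of the reconstructed trajectories, but this summary does not state how the fit is performed. One common estimator, shown here only as a hedged sketch and not necessarily the authors' procedure, is the slope of the time-averaged mean squared displacement (TA-MSD) against the lag on a log-log scale. The function names `tamsd`, `fit_alpha` and `brownian_walk`, as well as the default choice of lags, are hypothetical and introduced for illustration.

```python
import math
import random


def tamsd(x, lag):
    """Time-averaged MSD of a 1D trajectory x at a single integer lag."""
    n = len(x) - lag
    if n <= 0:
        raise ValueError("lag must be smaller than the trajectory length")
    return sum((x[i + lag] - x[i]) ** 2 for i in range(n)) / n


def fit_alpha(x, max_lag=None):
    """Estimate alpha as the least-squares slope of log(TA-MSD) vs. log(lag)."""
    if max_lag is None:
        # Assumed default: use short lags only, where the time average
        # is computed over the largest number of displacement pairs.
        max_lag = max(2, len(x) // 4)
    points = []
    for lag in range(1, max_lag + 1):
        msd = tamsd(x, lag)
        if msd > 0:  # the logarithm is undefined for a vanishing MSD
            points.append((math.log(lag), math.log(msd)))
    if len(points) < 2:
        raise ValueError("need at least two lags with non-zero MSD")
    mean_u = sum(u for u, _ in points) / len(points)
    mean_v = sum(v for _, v in points) / len(points)
    num = sum((u - mean_u) * (v - mean_v) for u, v in points)
    den = sum((u - mean_u) ** 2 for u, _ in points)
    return num / den


def brownian_walk(n_steps, seed=0):
    """Discrete Brownian motion: cumulative sum of unit Gaussian steps."""
    rng = random.Random(seed)
    x = [0.0]
    for _ in range(n_steps):
        x.append(x[-1] + rng.gauss(0.0, 1.0))
    return x


# Ballistic motion x_t = t has TA-MSD equal to lag**2, so the fitted
# slope is 2 up to floating-point rounding.
ballistic = [float(t) for t in range(50)]
alpha_ballistic = fit_alpha(ballistic)

# A short trajectory (T = 20, the length quoted in the Figure 5 caption)
# gives a much noisier estimate, which is the regime the paper targets.
alpha_short = fit_alpha(brownian_walk(20, seed=0))
```

The estimate from a single short trajectory can deviate strongly from the true exponent, which illustrates why the usual statistical approaches struggle with the short, noisy trajectories discussed in the abstract.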