Deep Dynamic Probabilistic Canonical Correlation Analysis

Shiqin Tang, Shujian Yu, Yining Dong, S. Joe Qin

TL;DR

The paper addresses the problem of extracting nonlinear latent dynamics from sequential data by extending probabilistic Canonical Correlation Analysis (CCA) into a deep dynamic framework. It introduces D$^2$PCCA, a nonlinear dynamic model that retains a probabilistic structure and is trained with amortized variational inference, optionally combined with KL annealing and normalizing flows to obtain a more flexible posterior approximation. On financial data, the model achieves substantially higher ELBO values than the linear DPCCA baseline, with KL annealing and autoregressive flows giving the strongest posterior approximations; reconstruction error, however, varies by dataset. The approach extends to multiple observed variables and can encode prior temporal knowledge, which makes it applicable to a range of complex sequential systems.

Abstract

This paper presents Deep Dynamic Probabilistic Canonical Correlation Analysis (D2PCCA), a model that integrates deep learning with probabilistic modeling to analyze nonlinear dynamical systems. Building on the probabilistic extensions of Canonical Correlation Analysis (CCA), D2PCCA captures nonlinear latent dynamics and supports enhancements such as KL annealing for improved convergence and normalizing flows for a more flexible posterior approximation. D2PCCA naturally extends to multiple observed variables, making it a versatile tool for encoding prior knowledge about sequential datasets and providing a probabilistic understanding of the system's dynamics. Experimental validation on real financial datasets demonstrates the effectiveness of D2PCCA and its extensions in capturing latent dynamics.
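
The KL annealing mentioned in the abstract can be illustrated with a small sketch. The summary does not state which schedule the authors use, so the linear warm-up below is an assumption, and the names `kl_weight`, `annealed_neg_elbo`, and `warmup_steps` are hypothetical. The idea is that the KL term of the negative ELBO is scaled by a weight that grows from 0 to 1 early in training, so the model first learns to reconstruct the data before the posterior is pulled toward the prior.

```python
def kl_weight(step: int, warmup_steps: int, min_weight: float = 0.0) -> float:
    """Linear KL annealing weight (assumed schedule, not taken from the paper).

    Ramps from min_weight at step 0 to 1.0 at warmup_steps, then stays at 1.0.
    """
    if warmup_steps <= 0:
        return 1.0  # no warm-up requested: use the full KL term from the start
    frac = min(max(step / warmup_steps, 0.0), 1.0)
    return min_weight + (1.0 - min_weight) * frac


def annealed_neg_elbo(recon_nll: float, kl: float, step: int, warmup_steps: int) -> float:
    """Negative ELBO with the KL term scaled by the annealing weight."""
    return recon_nll + kl_weight(step, warmup_steps) * kl


if __name__ == "__main__":
    # Halfway through a 100-step warm-up, the KL term counts at half weight.
    print(annealed_neg_elbo(recon_nll=10.0, kl=4.0, step=50, warmup_steps=100))
```

Once `step` reaches `warmup_steps`, the objective coincides with the standard negative ELBO, so annealing only changes the early optimization path, not the final objective.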

Paper Structure

This paper contains 6 sections, 22 equations, 4 figures, 1 table.

Figures (4)

  • Figure 1: Graphical models for DPCCA and D$^2$PCCA. The shaded nodes denote observed variables, while the unshaded ones denote latent variables. The arrows represent transition and emission models, and arrows with solid squares denote the usage of neural networks.
  • Figure 2: Graphical representations for (a) Multiset DPCCA, (b) DPPLS, and (c) Factorial HMM.
  • Figure 3: Convergence of ELBO during (left) training and (right) testing.
  • Figure 4: Comparison of model predictions against the ground truth over time. The shaded region denotes the confidence interval, $\mu \pm 1.96 \sigma$.
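
The shaded band described in the Figure 4 caption, $\mu \pm 1.96 \sigma$, corresponds to an approximate 95% interval when the predictive distribution is Gaussian. A minimal sketch of how such a band could be computed from per-time-step means and standard deviations follows; the function name `confidence_band` and the example numbers are illustrative assumptions, not values from the paper.

```python
from typing import List, Sequence, Tuple


def confidence_band(
    mu: Sequence[float], sigma: Sequence[float], z: float = 1.96
) -> Tuple[List[float], List[float]]:
    """Return (lower, upper) bounds mu -/+ z * sigma for each time step."""
    if len(mu) != len(sigma):
        raise ValueError("mu and sigma must have the same length")
    lower = [m - z * s for m, s in zip(mu, sigma)]
    upper = [m + z * s for m, s in zip(mu, sigma)]
    return lower, upper


if __name__ == "__main__":
    # Made-up predictive means and standard deviations for three time steps.
    lower, upper = confidence_band([0.0, 1.0, 2.0], [1.0, 0.5, 0.0])
    print(lower, upper)
```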