
Simultaneous Dimensionality Reduction for Extracting Useful Representations of Large Empirical Multimodal Datasets

Eslam Abdelaleem

TL;DR

This Dissertation addresses the challenges posed by real-world data that defy conventional assumptions, such as complex interactions within neural systems or high-dimensional dynamical systems, and unifies diverse reduction methods under a comprehensive framework, the Deep Variational Multivariate Information Bottleneck.

Abstract

The quest for simplification in physics drives the exploration of concise mathematical representations for complex systems. This Dissertation focuses on dimensionality reduction as a means to obtain low-dimensional descriptions from high-dimensional data, facilitating comprehension and analysis. We address the challenges posed by real-world data that defy conventional assumptions, such as complex interactions within neural systems or high-dimensional dynamical systems. Leveraging insights from both theoretical physics and machine learning, this work unifies diverse reduction methods under a comprehensive framework, the Deep Variational Multivariate Information Bottleneck. This framework enables the design of tailored reduction algorithms based on specific research questions. We explore and assert the efficacy of simultaneous reduction approaches over their independent reduction counterparts, demonstrating their superiority in capturing covariation between multiple modalities while requiring less data. We also introduce novel techniques, such as the Deep Variational Symmetric Information Bottleneck, for general nonlinear simultaneous reduction. We show that the same principle of simultaneous reduction is the key to efficient estimation of mutual information, and that our new method is able to discover the coordinates of high-dimensional observations of dynamical systems. Through analytical investigations and empirical validations, we shed light on the intricacies of dimensionality reduction methods, paving the way for enhanced data analysis across various domains. We underscore the potential of these methodologies to extract meaningful insights from complex datasets, driving advancements in fundamental research and applied sciences. As these methods evolve, they promise to deepen our understanding of complex systems and inform more effective data analysis strategies.

Paper Structure

This paper contains 109 sections, 52 equations, 44 figures, 5 tables.

Figures (44)

  • Figure 1: The resulting correlations are averages over all the points in the phase space, then averaged over 10 different realizations of the matrices. The error bars span two standard deviations around the mean.
  • Figure 2: Performance of PCA, PLS, CCA, rCCA, and noise in recovery of the shared signal for $|Z_X| = |Z_Y| = 1 = m_\text{self}$. PCA struggles to detect shared signals when they are weaker than the self signals. PLS and rCCA demonstrate nearly perfect reconstruction. CCA displays no reconstruction in the undersampled regime $T\ll N_X$, and it is nearly perfect for large $T$.
  • Figure 3: Same as Fig. 2, but for $|Z_X| = |Z_Y| = 2 = m_\text{self} + m_\text{shared}$. Now there are enough compressed variables for PCA to detect the shared signal. Other methods perform similarly to Fig. 2, albeit with larger noise.
  • Figure 4: Reconstruction results for $m_\text{self}=30$, $m_\text{shared}=1$, and $|Z_X| = |Z_Y| = 1$. PCA struggles to detect any shared signals, even when they are comparable to the self signals. PLS performance also degrades. CCA fails, as usual, at small $T$. Finally, rCCA demonstrates nearly perfect reconstruction for all parameter values.
  • Figure 5: DR performance for $|Z_X| = |Z_Y| = m_\text{self} > m_\text{shared}$. PCA now detects shared signals even when they are weaker than the self signals. However, the quality of reconstruction is significantly lower than in Fig. 3. PLS detects signals in a larger part of the phase space, but also with a significant reduction in quality, which improves with sampling. CCA has its usual problem for $T\ll N_X$, and, like PLS, it has a significantly lower reconstruction quality than in the regime of Fig. 4. rCCA is able to detect the signal in the whole phase space, but again with worse quality. Finally, spurious correlations are high, though they decrease with better sampling.
  • ...and 39 more figures
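
The contrast drawn in the captions above between independent reduction (PCA on each modality) and simultaneous reduction (PLS on both) can be illustrated with a minimal NumPy sketch. It is not the dissertation's code: the generative model, the function names (`make_views`, `top_pc`, `corr`), and all parameter values are assumptions chosen for illustration, with one shared and one self latent per modality and the self signal twice as strong as the shared one. PLS is implemented here in its simplest form, as the leading singular vectors of the cross-covariance matrix.

```python
import numpy as np

def make_views(T=5000, n=20, shared=1.0, self_=2.0, noise=0.3, seed=0):
    """Two modalities with one shared latent p and one self latent each (assumed toy model)."""
    rng = np.random.default_rng(seed)
    p, qx, qy = rng.standard_normal((3, T))  # shared, self-X, self-Y latents
    # Orthonormal loading directions for the shared and self signals.
    ax, bx = np.linalg.qr(rng.standard_normal((n, 2)))[0].T
    ay, by = np.linalg.qr(rng.standard_normal((n, 2)))[0].T
    X = shared * np.outer(p, ax) + self_ * np.outer(qx, bx) + noise * rng.standard_normal((T, n))
    Y = shared * np.outer(p, ay) + self_ * np.outer(qy, by) + noise * rng.standard_normal((T, n))
    return X - X.mean(0), Y - Y.mean(0)

def top_pc(A):
    """Leading principal direction: first right singular vector of the centered data."""
    return np.linalg.svd(A, full_matrices=False)[2][0]

def corr(a, b):
    return float(np.corrcoef(a, b)[0, 1])

X, Y = make_views()

# Independent reduction: PCA on each modality separately, |Z_X| = |Z_Y| = 1.
pca_corr = corr(X @ top_pc(X), Y @ top_pc(Y))

# Simultaneous reduction: PLS via SVD of the cross-covariance matrix X^T Y / T.
U, s, Vt = np.linalg.svd(X.T @ Y / len(X))
pls_corr = corr(X @ U[:, 0], Y @ Vt[0])

print(f"correlation between reduced variables: PCA {pca_corr:.2f}, PLS {pls_corr:.2f}")
```

In this toy setting, the single PCA component of each modality aligns with the stronger self signal, so the two reduced variables are nearly uncorrelated, whereas the PLS pair aligns with the shared signal. The sketch covers only the well-sampled regime $T \gg N_X$ and makes no claim about the undersampled behavior shown in the figures.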