Interpretable Deep Learning Methods for Multiview Learning

Hengkang Wang; Han Lu; Ju Sun; Sandra E Safo

TL;DR

iDeepViewLearn is a deep learning model that captures nonlinear relationships between data from multiple views while performing feature selection.

Abstract

Technological advances have enabled the generation of unique and complementary types of data or views (e.g., genomics, proteomics, metabolomics) and opened up a new era in multiview learning research with the potential to lead to new biomedical discoveries. We propose iDeepViewLearn (Interpretable Deep Learning Method for Multiview Learning) for learning nonlinear relationships in data from multiple views while achieving feature selection. iDeepViewLearn combines the flexibility of deep learning with the statistical benefits of data- and knowledge-driven feature selection, giving interpretable results. Deep neural networks are used to learn a view-independent low-dimensional embedding through an optimization problem that minimizes the difference between observed and reconstructed data while imposing a regularization penalty on the reconstructed data. The normalized Laplacian of a graph is used to model bilateral relationships between variables in each view, thereby encouraging the selection of related variables. iDeepViewLearn is tested on simulated data and two real-world datasets, including breast cancer-related gene expression and methylation data. iDeepViewLearn achieved competitive classification results and identified genes and CpG sites that differentiated between individuals who died from breast cancer and those who did not. The results of our real data application and simulations with small to moderate sample sizes suggest that iDeepViewLearn may be a useful method for small-sample-size problems compared to other deep learning methods for multiview learning.
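
To make the two ingredients named in the abstract concrete, the following NumPy sketch computes a normalized graph Laplacian and evaluates a penalized reconstruction loss. It is an illustration under stated assumptions, not the authors' implementation: the function names, the single shared weight `lam`, the column-wise group penalty, the choice to multiply each reconstruction by the Laplacian before penalizing, and the zero-row convention for isolated variables are all assumptions made for this example.

```python
import numpy as np


def normalized_laplacian(adj):
    """Normalized Laplacian D^{-1/2} (D - A) D^{-1/2} of a symmetric adjacency matrix.

    Assumption: variables with no edges (degree 0) get an all-zero row and column.
    """
    adj = np.asarray(adj, dtype=float)
    deg = adj.sum(axis=1)
    connected = deg > 0
    inv_sqrt = np.zeros_like(deg)
    inv_sqrt[connected] = 1.0 / np.sqrt(deg[connected])
    # Diagonal is 1 for connected variables, 0 for isolated ones.
    return np.diag(connected.astype(float)) - inv_sqrt[:, None] * adj * inv_sqrt[None, :]


def penalized_reconstruction_loss(views, recons, lam, laplacians=None):
    """Sum over views of squared reconstruction error plus a column-wise penalty.

    views, recons: lists of (n_samples, n_features_d) arrays, one pair per view.
    laplacians: optional list of (n_features_d, n_features_d) matrices; when given,
    the penalty is applied to recon @ laplacian (an assumed way to use the graph).
    """
    total = 0.0
    for d, (x, g) in enumerate(zip(views, recons)):
        fit = np.sum((x - g) ** 2)  # squared Frobenius norm of the residual
        target = g if laplacians is None else g @ laplacians[d]
        # Sum of column L2 norms: pushes entire variables (columns) toward zero.
        penalty = np.sum(np.linalg.norm(target, axis=0))
        total += fit + lam * penalty
    return total
```

In a full model the reconstructions would come from neural networks applied to a shared embedding and the loss would be minimized by gradient descent; here they are plain arrays so that the loss itself can be inspected in isolation.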

Paper Structure

This paper contains 21 sections, 6 equations, 10 figures, and 13 tables.

Background
Existing Methods
Our Approach
Methods
Model Formulation
Network-based feature selection
Prediction of shared low-dimensional representation and downstream analyses
Simulation Experiments
Set-up when there is no prior information on variable-variable interactions
Nonlinear Simulations
Competing Methods and Results
Set-up when there is prior information on variable-variable interactions
Competing Methods and Results
Real-World Experiments
Evaluation of Data from Holm Breast Cancer Study
...and 6 more sections

Figures (10)

Figure 1: Feature Selection. We train a deep learning model that takes all the views, estimates a shared low-dimensional representation ${\bf Z}$ that drives the variation across the views, and obtains nonlinear reconstructions ($G_1({\bf Z})$,…,$G_D({\bf Z})$) of the original views. We impose sparsity constraints on the reconstructions, allowing us to identify a subset of variables for each view ($\mathbf I_1$,…,$\mathbf I_D$) that approximate the original data.
Figure 2: Reconstruction and Downstream Analysis. We train a deep learning model to obtain a common low-dimensional representation ${\bf Z}'$ that is based on the features selected in Algorithm 1, obtain nonlinear approximations ($\mathbf R_1({\bf Z})$,…,$\mathbf R_D({\bf Z})$), and perform downstream analyses using the estimated ${\bf Z}'$.
Figure 3: Structure of nonlinear relationships between (First left panel) signal variables in View 1; (Second left panel) signal variables in View 2; (Middle panel)-(Fifth panel) signal variables between Views 1 and 2. Black circle: Class 1; Red triangle: Class 2.
Figure 4: Network structure for the first 50 variables in $\mathbf X^{(1)}$ and $\mathbf X^{(2)}$. Left: Scale-free; Middle: Lattice; Right: Cluster. For the Scale-free network, we consider variable 2 as a hub variable. Variable 2 and the variables directly connected to it are considered signal variables. For the Lattice network, all variables except variable 50 are considered signals. For the Cluster network, the circled clusters are considered signals.
Figure 5: All genes except BIRC5 were consistently selected in the top $20\%$ of highly-ranked genes across the twenty resampled datasets. BIRC5 was selected 19 times (out of 20) in the top $20\%$ of highly-ranked genes. Genes PDGFRB and BIRC5 have mean expression levels that are statistically significantly different between individuals who died from breast cancer and those who survived.
...and 5 more figures
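
The variable-selection step described in the Figure 1 caption can be illustrated with a short sketch. It assumes, hypothetically, that variables in a view are ranked by the L2 norm of their column in the reconstruction and that a fixed fraction is kept; the function name `select_top_features` and the `frac` parameter are inventions for this example, with the default of 0.2 chosen only to mirror the top-20% cutoff mentioned in the Figure 5 caption.

```python
import numpy as np


def select_top_features(recon, frac=0.2):
    """Return sorted indices of the highest-scoring columns of one reconstructed view.

    recon: (n_samples, n_features) array, e.g. a sparse reconstruction of one view.
    frac: fraction of variables to keep (hypothetical tuning parameter).
    """
    recon = np.asarray(recon, dtype=float)
    scores = np.linalg.norm(recon, axis=0)  # one score per variable (column)
    k = max(1, int(np.ceil(frac * recon.shape[1])))  # keep at least one variable
    order = np.argsort(-scores, kind="stable")  # highest score first
    return np.sort(order[:k])


# Toy reconstruction in which columns 1 and 3 carry all the signal.
toy = np.array([[0.0, 3.0, 0.0, 1.0],
                [0.0, 4.0, 0.0, 1.0]])
selected = select_top_features(toy, frac=0.5)
```

Because a column-wise sparsity penalty shrinks uninformative columns toward zero, a norm-based ranking of this kind is one natural way to read a selected index set off the reconstruction of each view.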
