Personalized Coupled Tensor Decomposition for Multimodal Data Fusion: Uniqueness and Algorithms

Ricardo Augusto Borsoi, Konstantin Usevich, David Brie, Tülay Adali

TL;DR

This work introduces a personalized coupled tensor decomposition (CTD) framework for multimodal data fusion. Each observed tensor is modeled as the sum of a common component, linked to a shared tensor via a multilinear measurement model, and a dataset-specific distinct component; both components admit canonical polyadic decompositions (CPDs). The authors establish deterministic and generic uniqueness results that leverage uni-mode uniqueness and a flexible multilinear degradation model, and propose two computational strategies: a semi-algebraic method for principled initialization and an optimization method based on alternating least squares (ALS) that handles noise and flexible couplings. Experimental results on synthetic and real hyperspectral imaging data show that the approach accurately recovers shared and dataset-specific factors and outperforms state-of-the-art CTD methods, particularly under inter-image variability and higher cloud contamination, with favorable computation times. The framework generalizes several existing CTD models and provides a principled basis for subsequent work on rank estimation and extensions to higher-order tensors.

Abstract

Coupled tensor decompositions (CTDs) perform data fusion by linking factors from different datasets. Although many CTDs have already been proposed, current works do not address important challenges of data fusion, where: 1) the datasets are often heterogeneous, constituting different "views" of a given phenomenon (multimodality); and 2) each dataset can contain personalized or dataset-specific information, constituting distinct factors that are not coupled with other datasets. In this work, we introduce a personalized CTD framework tackling these challenges. A flexible model is proposed where each dataset is represented as the sum of two components, one related to a common tensor through a multilinear measurement model, and another specific to each dataset. Both the common and distinct components are assumed to admit a polyadic decomposition. This generalizes several existing CTD models. We provide conditions for specific and generic uniqueness of the decomposition that are easy to interpret. These conditions employ uni-mode uniqueness of different individual datasets and properties of the measurement model. Two algorithms are proposed to compute the common and distinct components: a semi-algebraic one and a coordinate-descent optimization method. Experimental results illustrate the advantage of the proposed framework compared with state-of-the-art approaches.
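The model described in the abstract can be illustrated with a small synthetic sketch. It is not the authors' implementation: it assumes order-3 tensors and assumes that the multilinear measurement model applies one matrix per mode (a mode product), which the abstract does not spell out. All names (`cpd_to_tensor`, `multilinear_measure`, `simulate_dataset`), the tensor sizes, and the ranks are hypothetical choices made for illustration.

```python
import numpy as np


def cpd_to_tensor(A, B, C):
    """Build an order-3 tensor from polyadic factors: sum_r a_r o b_r o c_r."""
    return np.einsum("ir,jr,kr->ijk", A, B, C)


def multilinear_measure(T, P1, P2, P3):
    """Apply one matrix per mode (assumed form of the measurement model)."""
    return np.einsum("ai,bj,ck,ijk->abc", P1, P2, P3, T)


def simulate_dataset(common, out_dims, distinct_rank, rng):
    """Return (Y_k, P_k, D_k) with Y_k = P_k(common) + D_k."""
    # Hypothetical random measurement matrices, one per mode.
    P = [rng.standard_normal((m, n)) for m, n in zip(out_dims, common.shape)]
    # Dataset-specific (distinct) component with its own low-rank factors.
    D = cpd_to_tensor(*(rng.standard_normal((m, distinct_rank)) for m in out_dims))
    return multilinear_measure(common, *P) + D, P, D


rng = np.random.default_rng(0)
R, L = 3, 2  # illustrative ranks of the common and distinct parts
A, B, C = (rng.standard_normal((n, R)) for n in (8, 8, 6))
common = cpd_to_tensor(A, B, C)

# Two datasets observing the same common tensor at different resolutions.
Y1, P1, D1 = simulate_dataset(common, (6, 6, 4), L, rng)
Y2, P2, D2 = simulate_dataset(common, (5, 7, 6), L, rng)

# Under this assumption the measured common part keeps a rank-R polyadic
# structure, with each factor multiplied by the matching measurement matrix.
measured = cpd_to_tensor(P1[0] @ A, P1[1] @ B, P1[2] @ C)
print(np.allclose(Y1 - D1, measured))
```

The final check shows why a mode-wise measurement model is convenient in this setting: it acts on the factors directly, so the coupling between datasets can be expressed at the factor level.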

Paper Structure

This paper contains 21 sections, 7 theorems, 51 equations, 6 figures, 2 tables, 2 algorithms.

Key Result

Theorem 1

\cite{kruskal1977uniquenessTensor, stegeman2007kruskalConditionAcessible} The CPD of an order-3 tensor $\boldsymbol{\mathcal{X}} = \ldbrack \boldsymbol{A}, \boldsymbol{B}, \boldsymbol{C} \rdbrack$ with rank $R$ is essentially unique if

Figures (6)

  • Figure 1: Illustration of the model in \ref{eq:meas_model}. A latent tensor $\boldsymbol{\mathcal{C}}$ is common to all measurements and acquired through an operator $\mathscr{P}_k$, while tensors $\boldsymbol{\mathcal{D}}_k$ are distinct to each measurement, leading to a personalized decomposition.
  • Figure 2: Illustration of the multilinear measurement model in \ref{eq:degrad_model}.
  • Figure 3: NRMSE for different SNRs for the example with synthetic data.
  • Figure 4: NRMSE (between $\boldsymbol{\mathcal{C}} _k(\alpha)$ and $\widehat{ \boldsymbol{\mathcal{C}} }$) of the solutions from the optimization algorithm (init. 2) as a function of $\alpha$.
  • Figure 5: NRMSE of the solution given by the proposed optimization-based algorithm (init. 2) as a function of the ranks of the decomposition $R$ and $L_k$. The true ranks of the tensors are given by $R^{\rm true}=5$ and $L_k^{\rm true}=5$.
  • ...and 1 more figure

Theorems & Definitions (24)

  • Definition 1
  • Definition 2
  • Definition 3
  • Definition 4
  • Theorem 1
  • Theorem 2
  • Example 1
  • Remark 1
  • Example 2
  • Example 3
  • ...and 14 more