DTR: A Unified Deep Tensor Representation Framework for Multimedia Data Recovery
Ting-Wei Zhou, Xi-Le Zhao, Jian-Li Wang, Yi-Si Luo, Min Wang, Xiao-Xuan Bai, Hong Yan
TL;DR
This paper introduces DTR, a unified deep tensor representation that couples a deep latent generative module $g_\theta(\cdot)$ with a deep transform module $f_\xi(\cdot)$ for multimedia data recovery. The untrained generator $g_\theta(\cdot)$ models both intra- and inter-slice relations of the latent tensor, while the untrained transform $f_\xi(\cdot)$ captures dependencies among frontal slices, so DTR goes beyond the shallow characterizations used by earlier transform-based methods. The authors formulate an unsupervised recovery objective and validate it on hyperspectral images (HSIs), multispectral images (MSIs), and videos, reporting improvements in PSNR/SSIM and better preservation of visual detail. The paper also examines network choices, layer depths, and latent and tensor sizes, highlighting the value of combining deep latent generation with a nonlinear transform in tensor-based recovery tasks.
Abstract
Transform-based tensor representation, which consists of two indispensable components (transform and characterization), has recently attracted increasing attention in multimedia data (e.g., image and video) recovery problems. Previous development of transform-based tensor representation has mainly focused on the transform aspect. Although several attempts use shallow matrix factorization (e.g., singular value decomposition and nonnegative matrix factorization) to characterize the frontal slices of the transformed tensor (termed the latent tensor), faithful characterization remains underexplored. To address this issue, we propose a unified Deep Tensor Representation (DTR) framework that synergistically combines a deep latent generative module and a deep transform module. In particular, the deep latent generative module can generate the latent tensor more faithfully than shallow matrix factorization. The new DTR framework not only allows us to better understand the classic shallow representations, but also leads us to explore new representations. To examine the representation ability of the proposed DTR, we consider the representative multi-dimensional data recovery task and suggest an unsupervised DTR-based multi-dimensional data recovery model. Extensive experiments demonstrate that DTR achieves superior performance compared to state-of-the-art methods, both quantitatively and qualitatively, especially in recovering fine details.
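
To make the composition described in the abstract concrete, the following is a minimal NumPy sketch of one forward pass $f_\xi(g_\theta(\mathcal{Z}))$ and a masked data-fitting loss on observed entries. It is an illustration under stated assumptions, not the authors' architecture: the layer types, the tensor sizes, the leaky ReLU nonlinearity, the squared-error loss, and all function and variable names (`g_theta`, `f_xi`, `masked_loss`, `theta`, `xi`) are hypothetical choices made here. In an actual unsupervised recovery setting, $\theta$ and $\xi$ would be optimized to reduce this loss, for example with an automatic-differentiation library, which this sketch omits.

```python
import numpy as np

def leaky_relu(x, slope=0.2):
    # Elementwise nonlinearity (an assumed choice for this sketch).
    return np.where(x > 0, x, slope * x)

def g_theta(z, theta):
    """Hypothetical latent generator: mixes modes 1 and 2 (intra-slice)
    and mode 3 (inter-slice) of the input z to produce a latent tensor."""
    a, b, w = theta
    h = leaky_relu(np.einsum("ai,ijk->ajk", a, z))   # mix along mode 1
    h = leaky_relu(np.einsum("bj,ajk->abk", b, h))   # mix along mode 2
    return np.einsum("abk,lk->abl", h, w)            # mix along mode 3

def f_xi(latent, xi):
    """Hypothetical deep transform: a two-layer nonlinear map applied
    along the third mode, i.e. across the frontal slices."""
    w1, w2 = xi
    h = leaky_relu(np.einsum("ijk,lk->ijl", latent, w1))
    return np.einsum("ijk,lk->ijl", h, w2)

def masked_loss(x_hat, observed, mask):
    # Squared error restricted to the observed entries (mask == True).
    return float(np.sum((mask * (x_hat - observed)) ** 2))

rng = np.random.default_rng(0)

# Assumed sizes: an 8 x 8 x 4 input mapped to a 16 x 16 x 10 output.
z = rng.standard_normal((8, 8, 4))
theta = (rng.standard_normal((16, 8)),
         rng.standard_normal((16, 8)),
         rng.standard_normal((4, 4)))
xi = (rng.standard_normal((6, 4)), rng.standard_normal((10, 6)))

observed = rng.standard_normal((16, 16, 10))   # stand-in for degraded data
mask = rng.random((16, 16, 10)) < 0.3          # roughly 30% entries observed

latent = g_theta(z, theta)       # latent tensor, shape (16, 16, 4)
x_hat = f_xi(latent, xi)         # recovered tensor, shape (16, 16, 10)
loss = masked_loss(x_hat, observed, mask)
```

With both modules reduced to single linear maps, this composition resembles a classic shallow transform-based factorization, which is one way to read the abstract's remark that DTR helps in understanding the shallow representations.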
