
Matrix Completion via Nonsmooth Regularization of Fully Connected Neural Networks

Sajad Faramarzi, Farzan Haddadi, Sajjad Amini, Masoud Ahookhosh

TL;DR

This paper regularizes a fully connected neural network (FCNN) for matrix completion with the $\ell_{1}$ norm of intermediate representations and the nuclear norm of weight matrices, and trains it with a variant of the proximal gradient method. Simulations indicate that the resulting algorithm outperforms existing linear and nonlinear algorithms.

Abstract

Conventional matrix completion methods approximate the missing values by assuming the matrix to be low-rank, which leads to a linear approximation of missing values. It has been shown that enhanced performance can be attained by using nonlinear estimators such as deep neural networks. Deep fully connected neural networks (FCNNs), one of the most suitable architectures for matrix completion, suffer from over-fitting due to their high capacity, which leads to low generalizability. In this paper, we control over-fitting by regularizing the FCNN model in terms of the $\ell_{1}$ norm of intermediate representations and the nuclear norm of weight matrices. The resulting regularized objective function is nonsmooth and nonconvex, so existing gradient-based methods cannot be applied to our model. We propose a variant of the proximal gradient method and investigate its convergence to a critical point. In the initial epochs of FCNN training, the regularization terms are ignored, and their effect increases over subsequent epochs. This gradual addition of nonsmooth regularization terms is the main reason for the better performance of the deep neural network with nonsmooth regularization terms (DNN-NSR) algorithm. Our simulations indicate the superiority of the proposed algorithm in comparison with existing linear and nonlinear algorithms.
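As a rough illustration of the training idea described in the abstract (proximal gradient steps whose nonsmooth penalty is switched off early and phased in over epochs), the following Python sketch applies it to a toy linear completion objective rather than to an FCNN. Everything named here is an assumption for illustration: the functions `svt`, `ramp`, and `prox_grad_complete`, the linear warm-up schedule, and the parameters `lam_max`, `warmup_frac`, and `step` are hypothetical and are not taken from the DNN-NSR algorithm. The one standard ingredient is that the proximal operator of the nuclear norm soft-thresholds the singular values.

```python
import numpy as np

def svt(W, tau):
    """Proximal operator of tau * nuclear norm: soft-threshold the singular values."""
    U, s, Vt = np.linalg.svd(W, full_matrices=False)
    return (U * np.maximum(s - tau, 0.0)) @ Vt

def ramp(epoch, n_epochs, lam_max, warmup_frac=0.2):
    """Hypothetical schedule: zero during warm-up, then linear growth to lam_max."""
    start = warmup_frac * n_epochs
    if epoch < start:
        return 0.0
    return lam_max * (epoch - start) / max(n_epochs - 1 - start, 1.0)

def prox_grad_complete(M, mask, lam_max=1.0, step=1.0, n_epochs=200):
    """Toy proximal gradient on 0.5 * ||mask * (X - M)||_F^2 + lam * ||X||_*.

    M holds the observed values, mask is 1 where an entry is observed and 0
    elsewhere. The penalty weight lam follows the ramp schedule above.
    """
    X = np.zeros_like(M, dtype=float)
    for epoch in range(n_epochs):
        grad = mask * (X - M)                  # gradient of the smooth data-fit term
        lam = ramp(epoch, n_epochs, lam_max)   # penalty is ignored in early epochs
        X = svt(X - step * grad, step * lam)   # gradient step, then proximal step
    return X
```

With `step=1.0` the gradient step is safe for this objective because the data-fit gradient is 1-Lipschitz. In a network setting the same pattern would apply the proximal step to each weight matrix after an ordinary gradient update, which this sketch does not attempt to reproduce.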

Paper Structure

This paper contains 12 sections, 14 theorems, 81 equations, 3 figures, 5 tables, 1 algorithm.

Key Result

Lemma 1

(Proximal operator for the $\ell_{1}$ norm [amini2018]): The proximal operator of $g({\bf x}) = \alpha \lVert {\bf x} \rVert_{1}$ is the soft-thresholding function.
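For reference, soft-thresholding has the standard elementwise closed form $[\mathrm{prox}_{g}({\bf x})]_{i} = \mathrm{sign}(x_{i}) \max(\lvert x_{i} \rvert - \alpha, 0)$; the paper's own notation for it is not shown here and may differ. A minimal NumPy sketch of that operator follows, where the function name `soft_threshold` is an illustrative choice rather than something from the paper.

```python
import numpy as np

def soft_threshold(x, alpha):
    """Proximal operator of alpha * ||x||_1, applied elementwise.

    Entries with magnitude at most alpha become zero; all others are
    shrunk toward zero by alpha.
    """
    x = np.asarray(x, dtype=float)
    return np.sign(x) * np.maximum(np.abs(x) - alpha, 0.0)

shrunk = soft_threshold([3.0, -0.5, -2.0], alpha=1.0)
```

Because the operator zeroes small entries exactly, applying it to intermediate representations encourages sparsity in a way that a plain gradient step on the $\ell_{1}$ term would not.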

Figures (3)

  • Figure 1: RGB image for the experiment (Image $\mathrm{I}$).
  • Figure 2: Masked and inpainted results for Image $\mathrm{I}$ in Fig. 1 with $50\%$ randomly masked pixels: (a) masked image, (b) inpainted image.
  • Figure 3: Variation of the loss function for (a) a $300 \times 200$ synthetic matrix with $\rho = 80\%$ and (b) Image $\mathrm{I}$ in Fig. 1 with $\rho = 50\%$.

Theorems & Definitions (26)

  • Definition 1
  • Lemma 1
  • Definition 2
  • Lemma 2
  • Lemma 3: Proximal operator for the $\ell_\infty$ norm [amini2019]
  • Definition 3
  • Lemma 4
  • Proof
  • Lemma 5
  • Lemma 6
  • ...and 16 more