Fast Deep Predictive Coding Networks for Videos Feature Extraction without Labels

Wenqian Xue; Chi Ding; Jose Principe

TL;DR

This paper proposes a DPCN with fast inference of its internal model variables (states and causes) that achieves high sparsity and accurate feature clustering. It outperforms previous versions of DPCNs on learning rate, sparsity ratio, and feature clustering accuracy.

Abstract

Brain-inspired deep predictive coding networks (DPCNs) effectively model and capture video features through a bi-directional information flow, even without labels. They are based on an overcomplete description of video scenes, and one of their bottlenecks has been the lack of effective sparsification techniques for finding discriminative and robust dictionaries. So far, FISTA has been the best alternative. This paper proposes a DPCN with fast inference of its internal model variables (states and causes) that achieves high sparsity and accurate feature clustering. The proposed unsupervised learning procedure is inspired by adaptive dynamic programming with a majorization-minimization (MM) framework, and its convergence is rigorously analyzed. Experiments on the CIFAR-10, Super Mario Bros video game, and Coil-100 data sets validate the approach, which outperforms previous versions of DPCNs on learning rate, sparsity ratio, and feature clustering accuracy. Because of the DPCN's solid foundation and explainability, this advance opens the door to general applications in object recognition in video without labels.
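For context on the FISTA baseline named in the abstract, the following is a minimal sketch of textbook FISTA applied to a generic sparse coding problem, $\min_x \tfrac12\|y - Dx\|_2^2 + \lambda\|x\|_1$. It is not the paper's inference procedure. The dictionary `D`, signal `y`, weight `lam`, and the synthetic data are all assumptions made up for illustration.

```python
import numpy as np

def soft_threshold(v, tau):
    """Proximal operator of tau * ||.||_1 (element-wise shrinkage)."""
    return np.sign(v) * np.maximum(np.abs(v) - tau, 0.0)

def objective(D, y, x, lam):
    """Sparse coding cost: 0.5 * ||y - D x||^2 + lam * ||x||_1."""
    return 0.5 * np.sum((y - D @ x) ** 2) + lam * np.sum(np.abs(x))

def fista(D, y, lam, n_iter=200):
    """Textbook FISTA for the cost above, started from x = 0."""
    L = np.linalg.norm(D, 2) ** 2      # Lipschitz constant of the smooth part
    x = np.zeros(D.shape[1])
    z = x.copy()                       # extrapolated (momentum) point
    t = 1.0
    for _ in range(n_iter):
        grad = D.T @ (D @ z - y)       # gradient of 0.5 * ||y - D z||^2
        x_new = soft_threshold(z - grad / L, lam / L)
        t_new = (1.0 + np.sqrt(1.0 + 4.0 * t * t)) / 2.0
        z = x_new + ((t - 1.0) / t_new) * (x_new - x)
        x, t = x_new, t_new
    return x

# Hypothetical toy problem: overcomplete random dictionary, 3-sparse code.
rng = np.random.default_rng(0)
D = rng.standard_normal((20, 50))
D /= np.linalg.norm(D, axis=0)         # unit-norm atoms
x_true = np.zeros(50)
x_true[[3, 17, 41]] = [1.5, -2.0, 1.0]
y = D @ x_true
lam = 0.05

x_hat = fista(D, y, lam)
print("cost at zero :", objective(D, y, np.zeros(50), lam))
print("cost at x_hat:", objective(D, y, x_hat, lam))
```

The soft-thresholding step is what produces exact zeros in the code, which is why shrinkage methods of this family are a natural baseline for sparsity comparisons.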

Paper Structure (23 sections, 4 theorems, 56 equations, 7 figures, 4 tables, 4 algorithms)

Introduction
Dynamic Networks for DPCNs
Learning For Model Inference and Variable Inference
MM-Based Model Inference
State Inference
Cause Inference
Model Parameters Inference
MM-Based Variable Inference with Top-Down Preference
Convergence Analysis of MM-Based Variable Inference
Convergence of State Inference
Convergence of Causes Inference
Experiments
Comparison on Image Sparse Coding
Comparison on Video Clustering
Super Mario Bros data set
...and 8 more sections

Key Result

Theorem 1

Consider the sequence $\{x_t^{i}\}\in \mathbb{R}^K$ for a patch generated by Algorithm \ref{al2}. Then, $F(x_t^i)$ converges, and for any $s\geq 1$ we have where $R=\operatorname{diag}\{1/(\tilde{1}|(x_t^{0})_k|+(1-\tilde{1}-\bar{1})|(x_t^{*})_k|+\bar{1}|(x_t^{i})_k|)\}$, $k\in \{1,2,\dots,K\}$, with $\tilde{1}=1$ if $|(x^*)_k| \geq |(x_t^{0})_k|>0$, $\tilde{1}=0$ if $0 \leq |(x^*)_k| < |(x_t^{0})_
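The diagonal matrix $R$ in the theorem, with entries of the form $1/|x_k|$, resembles the weights that arise when an $\ell_1$ penalty is majorized by a quadratic. As a rough illustration of that general idea only, the sketch below runs a generic majorization-minimization loop on a smoothed sparse coding cost, using the standard bound $\sqrt{x^2+\epsilon} \le \frac{x^2+\epsilon}{2\sqrt{(x^i)^2+\epsilon}} + \frac{\sqrt{(x^i)^2+\epsilon}}{2}$. This is a textbook reweighted least-squares scheme, not the paper's algorithm; the smoothing constant `eps`, the dictionary `D`, the signal `y`, and all other names are assumptions.

```python
import numpy as np

def smoothed_objective(D, y, x, lam, eps):
    """0.5 * ||y - D x||^2 + lam * sum_k sqrt(x_k^2 + eps)."""
    return 0.5 * np.sum((y - D @ x) ** 2) + lam * np.sum(np.sqrt(x ** 2 + eps))

def mm_sparse_code(D, y, lam, eps=1e-6, n_iter=50):
    """Generic MM loop: each step minimizes a quadratic majorizer exactly."""
    G = D.T @ D
    b = D.T @ y
    x = b.copy()                                   # arbitrary nonzero start
    history = [smoothed_objective(D, y, x, lam, eps)]
    for _ in range(n_iter):
        r = 1.0 / np.sqrt(x ** 2 + eps)            # diagonal weights at the current iterate
        # Minimizer of 0.5 * ||y - D x||^2 + 0.5 * lam * x^T diag(r) x
        x = np.linalg.solve(G + lam * np.diag(r), b)
        history.append(smoothed_objective(D, y, x, lam, eps))
    return x, history

# Hypothetical toy problem: overcomplete random dictionary, 3-sparse code.
rng = np.random.default_rng(0)
D = rng.standard_normal((20, 50))
D /= np.linalg.norm(D, axis=0)
x_true = np.zeros(50)
x_true[[3, 17, 41]] = [1.5, -2.0, 1.0]
y = D @ x_true

x_mm, history = mm_sparse_code(D, y, lam=0.05)
print("first cost:", history[0], "last cost:", history[-1])
```

Because each update minimizes an upper bound that touches the smoothed cost at the current iterate, the recorded cost cannot increase from one iteration to the next, which is the basic mechanism behind MM convergence arguments.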

Figures (7)

Figure 1: Two-layer DPCN structure. The video frame is decomposed into patches (green blocks). Every patch is mapped onto a state $x_t^1$ at layer 1, and the cause $u_t^1$ pools all the states within a group. The cause $u_t^1$ is the input to layer 2 and corresponds to state $x_t^2$ and cause $u_t^2$.
Figure 2: (a) Bi-directional inference flow, where feedforward (yellow), feedback (green), and recurrent (pink) connections convey the bottom-up and top-down predictions. (b) Connections for variable inference (solid lines) and for model inference (dashed lines).
Figure 3: (a) Convergence of MM Algorithm \ref{al2}, ISTA, FISTA, and ADAM, (b) sparsity level using MM Algorithm \ref{al2}, and (c) sparsity level using FISTA.
Figure 4: Clustering result for a Super Mario Bros video data set.
Figure 5: Qualitative video sequence reconstruction for Super Mario Bros and Coil-100 data sets.
...and 2 more figures

Theorems & Definitions (4)

Theorem 1
Theorem 2
Theorem 3
Lemma 1
