Feature Qualification by Deep Nets: A Constructive Approach

Feilong Cao, Shao-Bo Lin

TL;DR

A central problem is to quantify which data features deep nets extract while preserving approximation power. The authors construct a linear deep-net-operator (DNO) using deep sigmoid nets to achieve almost optimal approximation rates for functions that are both $\tau$-radial (in $\mathcal{R}_\tau$) and $\alpha$-smooth in the $\mathrm{Lip}^{\alpha,c,\nu}$ sense, with conditional qualification that binds the two features. The key result is an explicit operator $G_{n,f,\varepsilon}$ with an error bound $\|f-G_{n,f,\varepsilon}\| \le 2\tau + 2d\,\omega(g_f,\varepsilon) + \frac{10e^2-3}{e^2}\,\omega(g_f,1/n) + 3e^{-n}(\|f\| + \tau)$; under suitable choices of $\varepsilon$, $\tau$, $\nu$ the rate becomes $O(n^{-\alpha})$, making $G_{n,\varepsilon}$ an $\alpha$-almost optimal DNO for the class $\mathrm{Lip}^{\alpha,c,\nu} \cap \mathcal{R}_\tau$. The paper also proves two conditional qualification theorems showing that the same DNO can quantify the smoothness given radialness and reflect radialness given smoothness, and a Bernstein-type lower bound for shallow nets that underscores the necessity of depth. Altogether, the results provide a constructive, training-free pathway to certify and understand feature qualification in deep nets, with potential implications for interpretability and architecture design.
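To make the step from the error bound to the $O(n^{-\alpha})$ rate concrete, the following minimal Python sketch evaluates the right-hand side of the bound numerically. It rests on assumptions that are not taken from the paper: the modulus of continuity is replaced by the Lipschitz-type estimate $\omega(g_f,t) \le c\,t^{\alpha}$, and the parameters are set to $\tau = n^{-\alpha}$ and $\varepsilon = 1/n$, which is only one plausible reading of "suitable choices". The function names `error_bound` and `scaled_bounds` are hypothetical.

```python
import math


def error_bound(n, d, alpha, c, f_norm, tau=None, eps=None):
    """Right-hand side of the quoted bound.

    Assumption (not from the paper): omega(g_f, t) is replaced by c * t**alpha.
    Defaults tau = n**(-alpha) and eps = 1/n are illustrative choices only.
    """
    tau = n ** (-alpha) if tau is None else tau
    eps = 1.0 / n if eps is None else eps

    def omega(t):
        # Assumed Lipschitz-type estimate for the modulus of continuity.
        return c * t ** alpha

    const = (10 * math.e ** 2 - 3) / math.e ** 2
    return (
        2 * tau
        + 2 * d * omega(eps)
        + const * omega(1.0 / n)
        + 3 * math.exp(-n) * (f_norm + tau)
    )


def scaled_bounds(ns, d=3, alpha=0.5, c=1.0, f_norm=1.0):
    """Bound multiplied by n**alpha; staying bounded indicates an O(n**-alpha) rate."""
    return [error_bound(n, d, alpha, c, f_norm) * n ** alpha for n in ns]


if __name__ == "__main__":
    for n, s in zip([10, 100, 1000], scaled_bounds([10, 100, 1000])):
        print(f"n={n:5d}  n^alpha * bound = {s:.4f}")
```

Under these assumptions the first three terms each scale like $n^{-\alpha}$, while the term with $e^{-n}$ decays faster than any power of $n$, so the scaled values settle near the constant $2 + 2dc + (10 - 3/e^2)c$.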

Abstract

The great success of deep learning has stimulated avid research activities in verifying the power of depth in theory, a common consensus of which is that deep nets are versatile in approximating and learning numerous functions. Such versatility certainly enhances the understanding of the power of depth, but makes it difficult to judge which data features are crucial in a specific learning task. This paper proposes a constructive approach to equip deep nets for the feature qualification purpose. Using the product-gate nature and localized approximation property of deep nets with sigmoid activation (deep sigmoid nets), we succeed in constructing a linear deep net operator that possesses optimal approximation performance in approximating smooth and radial functions. Furthermore, we provide theoretical evidence that the constructed deep net operator is capable of qualifying multiple features such as the smoothness and radialness of the target functions.

Paper Structure

This paper contains 6 sections, 119 equations, 1 table.

Theorems & Definitions (10)

  • proof
  • proof
  • proof : Proof of Theorem \ref{theorem3}
  • proof : Proof of Lemma \ref{Lemma:lip}
  • proof : Proof of Theorem \ref{directTheorem}
  • proof : Proof of Theorem \ref{Theorem:qualification-class}
  • proof : Proof of Theorem \ref{Theorem:qualification-class-2}
  • proof : Proof of Lemma \ref{lemma7}
  • proof : Proof of Lemma \ref{lemma3-sec2}
  • proof : Proof of Lemma \ref{lemma4-sec2}