Tighter Risk Bounds for Mixtures of Experts

Wissam Akretche, Frédéric LeBlanc, Mario Marchand

TL;DR

Experimental results support the theory, demonstrating that the approach enhances the generalization ability of mixtures of experts and validating the feasibility of imposing LDP on the gating mechanism.

Abstract

In this work, we provide upper bounds on the risk of mixtures of experts by imposing local differential privacy (LDP) on their gating mechanism. These theoretical guarantees are tailored to mixtures of experts that utilize the one-out-of-$n$ gating mechanism, as opposed to the conventional $n$-out-of-$n$ mechanism. The bounds exhibit logarithmic dependence on the number of experts, and encapsulate the dependence on the gating mechanism in the LDP parameter, making them significantly tighter than existing bounds, under reasonable conditions. Experimental results support our theory, demonstrating that our approach enhances the generalization ability of mixtures of experts and validating the feasibility of imposing LDP on the gating mechanism.
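The difference between the two gating mechanisms named in the abstract can be sketched in a few lines. This is only an illustration under an assumed reading: one-out-of-$n$ gating samples a single expert from the gating distribution, while $n$-out-of-$n$ gating averages all expert outputs with the gating weights. The function names and the toy numbers are hypothetical and do not come from the paper.

```python
import random

def n_out_of_n(gate_probs, expert_outputs):
    """Conventional gating (assumed reading): weighted average of all experts."""
    return sum(p * y for p, y in zip(gate_probs, expert_outputs))

def one_out_of_n(gate_probs, expert_outputs, rng):
    """Stochastic gating (assumed reading): sample one expert index from the
    gating distribution and return only that expert's output."""
    idx = rng.choices(range(len(expert_outputs)), weights=gate_probs, k=1)[0]
    return expert_outputs[idx]

# Toy example: three experts with scalar predictions (illustrative values).
gate_probs = [0.7, 0.2, 0.1]
expert_outputs = [1.0, -1.0, 0.5]

mixed = n_out_of_n(gate_probs, expert_outputs)
picked = one_out_of_n(gate_probs, expert_outputs, random.Random(0))
print("n-out-of-n prediction:", mixed)
print("one-out-of-n prediction:", picked)
```

In the one-out-of-$n$ case the prediction is random, so the quantity of interest is its expected loss over the gating distribution rather than the loss of the averaged prediction.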

Paper Structure

This paper contains 11 sections, 10 theorems, 35 equations, 1 table.

Key Result

Theorem 2.2

Let $b > 0$ and $\beta \geq 0$ be real numbers, and suppose that ${\mathcal{F}}$ is a set of functions ${\mathbf f} \colon {\mathcal{X}} \to [-b, b]^n$. Let ${\mathcal{G}}$ be the set of functions ${\mathbf g} \colon {\mathcal{X}} \to [0, 1]^n$ defined by
$$g_i(x) = \frac{\exp(\beta f_i(x))}{\sum_{j=1}^{n} \exp(\beta f_j(x))}, \qquad i = 1, \dots, n, \quad {\mathbf f} \in {\mathcal{F}}.$$
Then, each ${\mathbf g} \in {\mathcal{G}}$ satisfies $4\beta b$-LDP.
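A quick numerical sanity check of this statement is sketched below. It rests on two assumptions that the text above does not spell out: the gating functions are a softmax of $\beta {\mathbf f}(x)$, and $\epsilon$-LDP means $g_i(x) \leq e^{\epsilon} g_i(x')$ for all inputs $x, x'$ and every expert $i$. Under those assumptions, the largest observed log-ratio between gating probabilities should never exceed $4\beta b$. All function names and parameter values are illustrative.

```python
import math
import random

def softmax_gate(scores, beta):
    """Gating probabilities proportional to exp(beta * score).
    Subtracting the max score keeps the exponentials numerically stable."""
    m = max(scores)
    weights = [math.exp(beta * (s - m)) for s in scores]
    total = sum(weights)
    return [w / total for w in weights]

def max_log_ratio(n, b, beta, trials=2000, seed=0):
    """Largest |log(g_i(x) / g_i(x'))| over random score vectors in [-b, b]^n.
    The score vectors stand in for f(x) and f(x') at two arbitrary inputs."""
    rng = random.Random(seed)
    worst = 0.0
    for _ in range(trials):
        fx = [rng.uniform(-b, b) for _ in range(n)]
        fx_prime = [rng.uniform(-b, b) for _ in range(n)]
        gx = softmax_gate(fx, beta)
        gx_prime = softmax_gate(fx_prime, beta)
        for p, q in zip(gx, gx_prime):
            worst = max(worst, abs(math.log(p / q)))
    return worst

n, b, beta = 5, 1.0, 0.5          # illustrative values
epsilon = 4 * beta * b            # privacy parameter stated in the theorem
observed = max_log_ratio(n, b, beta)
print(f"observed max log-ratio: {observed:.4f}, bound 4*beta*b: {epsilon:.4f}")
```

The bound follows from two factors of at most $e^{2\beta b}$: one from the ratio of the numerators $\exp(\beta f_i(x))$ and one from the ratio of the normalizing sums. Random sampling only probes the bound from below, so this check can falsify the claim but cannot prove it.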

Theorems & Definitions (21)

  • Definition 2.1
  • Theorem 2.2
  • proof
  • Lemma 3.1
  • proof
  • Theorem 3.2 (Theorem 2 in McAllester, 2013)
  • Theorem 3.3
  • proof
  • Theorem 3.4
  • proof
  • ...and 11 more