Deep Submodular Peripteral Networks

Gantavya Bhatt, Arnav Das, Jeff Bilmes

TL;DR

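This paper introduces deep submodular peripteral networks (DSPNs), a novel parametric family of submodular functions, together with a GPC-based training strategy that connects and tackles two challenges: the lack of practical methods for learning submodular functions, and the underexplored problem of learning a scaling from graded pairwise preferences (GPC).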

Abstract

Submodular functions, crucial for various applications, often lack practical learning methods for their acquisition. Seemingly unrelated, learning a scaling from oracles offering graded pairwise preferences (GPC) is underexplored, despite a rich history in psychometrics. In this paper, we introduce deep submodular peripteral networks (DSPNs), a novel parametric family of submodular functions, and methods for their training using a GPC-based strategy to connect and then tackle both of the above challenges. We introduce a newly devised GPC-style "peripteral" loss which leverages numerically graded relationships between pairs of objects (sets in our case). Unlike traditional contrastive learning, or RLHF preference ranking, our method utilizes graded comparisons, extracting more nuanced information than just binary-outcome comparisons, and contrasts sets of any size (not just two). We also define a novel suite of automatic sampling strategies for training, including active-learning inspired submodular feedback. We demonstrate DSPNs' efficacy in learning submodularity from a costly target submodular function and demonstrate their superiority both for experimental design and online streaming applications.

Paper Structure

This paper contains 30 sections, 11 theorems, 35 equations, 10 figures, 6 tables.

Key Result

Lemma 0

The weighted matroid rank function $\text{rank}_{\mathcal{M}, m}(\cdot)$ for matroid $\mathcal{M} = (V,\mathcal{I})$ with any non-negative vector $m \in \mathbb{R}_+^{|V|}$ is permutation invariant.
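As a rough illustration of this lemma, the sketch below computes a weighted matroid rank with the standard greedy algorithm and checks that the value does not depend on the order in which the elements are presented. The partition matroid, the weights, and the helper names (`weighted_matroid_rank`, `make_partition_oracle`) are assumptions chosen for the example; they are not taken from the paper.

```python
from itertools import permutations


def weighted_matroid_rank(elements, weight, is_independent):
    """Greedy evaluation of max { sum of weights of I : I subset of elements, I independent }.

    For a matroid with non-negative weights, scanning elements in order of
    decreasing weight and keeping each one that preserves independence
    yields a maximum-weight independent subset.
    """
    chosen = []
    total = 0.0
    # Ties are broken by element id so the scan order is deterministic.
    for e in sorted(set(elements), key=lambda e: (-weight[e], e)):
        if is_independent(chosen + [e]):
            chosen.append(e)
            total += weight[e]
    return total


def make_partition_oracle(block_of, capacity):
    """Independence oracle for a partition matroid (hypothetical example):
    a set is independent if it holds at most capacity[b] elements of block b."""
    def is_independent(items):
        counts = {}
        for e in items:
            b = block_of[e]
            counts[b] = counts.get(b, 0) + 1
            if counts[b] > capacity[b]:
                return False
        return True
    return is_independent


# Toy ground set V = {0, ..., 4} split into two blocks (illustrative values).
block_of = {0: "a", 1: "a", 2: "a", 3: "b", 4: "b"}
capacity = {"a": 2, "b": 1}
weight = {0: 3.0, 1: 1.0, 2: 2.0, 3: 5.0, 4: 4.0}
oracle = make_partition_oracle(block_of, capacity)

# Evaluate the rank on every ordering of the same five elements.
values = {
    weighted_matroid_rank(order, weight, oracle)
    for order in permutations([0, 1, 2, 3, 4])
}
print(values)  # a single value: the rank depends on the set, not the order
```

The invariance is structural here: the greedy scan re-sorts its input by weight, so any presentation order of the same set produces the same scan. That matches the intuition behind the lemma, namely that the weighted rank is a function of the set itself.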

Figures (10)

  • Figure 1: The structure of a DSPN and the control flow of how a DSPN is trained; parameters are shared between the DSPNs processing E and M sets.
  • Figure 2: Passive Sets. We consider a simple 2D ground set with 5 clusters/classes, as indicated by the colors. The various types of passively sampled sets are depicted (discussed in \ref{sec:what-subsets-train}). Type-I homogeneous sets are randomly sampled from a single class, while Type-I heterogeneous sets are sampled from the full ground set. Meanwhile, Type-II restricts the ground set to a subset of classes and samples "clumps" from each of the sampled classes to construct the homogeneous set, and diverse sets from each class to create the heterogeneous sets. Intuitively, using Type-I/II allows the learnt DSPN to model intraclass/interclass relationships, respectively.
  • Figure 3: Set Pairing. We actively sample sets by optimizing the DSPN as it is being learnt or, optionally, the target function. The depicted graph demonstrates how the actively sampled sets are integrated into $\mathcal{D} = \{(E_i, M_i)\}_{i=1}^N$. The red/blue vertices refer to passively/actively sampled sets. An edge between two vertices indicates that sets from the categories denoted by the vertices are used to create $(E,M)$ pairs to train the DSPN. Dashed edges represent non-critical pairs.
  • Figure 4: Transfer. We compare different loss functions in terms of their effectiveness at training a DSPN.
  • Figure 7: Peripteral loss as a function of ${\delta}$ for different values and signs of the margin ${\Delta}$, and the corresponding gradient.
  • ...and 5 more figures

Theorems & Definitions (22)

  • Lemma 0: Permutation Invariance of Weighted Matroid Rank
  • Theorem 1: A DSPN is monotone non-decreasing submodular
  • Corollary 1: Submodular Preservation of Weighted Matroid Rank Aggregators
  • Corollary 1: DSPN Family
  • Definition 2: Matroid
  • Definition 3: Weighted Matroid Rank Function
  • Definition 4: Permutation Invariance deepset
  • Lemma 4: Permutation Invariance of Weighted Matroid Rank
  • proof
  • Proposition 5
  • ...and 12 more
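
Theorem 1 in the list above states that a DSPN is monotone non-decreasing submodular. As a minimal sketch of what that property means, the brute-force check below tests monotonicity and diminishing returns on a small ground set. The function `concave_over_modular` (a sum of square roots of non-negative modular functions, a well-known submodular form) is a stand-in chosen for illustration and is not the DSPN architecture; the feature values and all names are assumptions.

```python
from itertools import combinations
import math


def powerset(ground):
    """Yield every subset of `ground` as a frozenset."""
    items = list(ground)
    for r in range(len(items) + 1):
        for combo in combinations(items, r):
            yield frozenset(combo)


def is_monotone_submodular(f, ground, tol=1e-9):
    """Exhaustively check, for all A subset of B and v outside B, that
    f(A) <= f(B) (monotone) and
    f(A + v) - f(A) >= f(B + v) - f(B) (diminishing returns).
    Cost is exponential in |ground|, so use tiny ground sets only."""
    ground = frozenset(ground)
    subsets = list(powerset(ground))
    for A in subsets:
        for B in subsets:
            if not A <= B:
                continue
            if f(A) > f(B) + tol:
                return False
            for v in ground - B:
                gain_A = f(A | {v}) - f(A)
                gain_B = f(B | {v}) - f(B)
                if gain_A + tol < gain_B:
                    return False
    return True


# Hypothetical non-negative 2D features for a 4-element ground set.
FEATURES = {0: (1.0, 0.0), 1: (0.5, 2.0), 2: (0.0, 1.5), 3: (2.0, 0.5)}


def concave_over_modular(A):
    """Sum over feature dimensions of sqrt(total feature mass in A)."""
    return sum(math.sqrt(sum(FEATURES[v][j] for v in A)) for j in range(2))


def squared_cardinality(A):
    """Monotone but supermodular: marginal gains grow with |A|."""
    return float(len(A)) ** 2


print(is_monotone_submodular(concave_over_modular, FEATURES))
print(is_monotone_submodular(squared_cardinality, FEATURES))
```

The second function is included as a contrast: it is monotone, yet its marginal gains increase as the set grows, so the diminishing-returns test rejects it.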