Deep Submodular Peripteral Networks

Gantavya Bhatt; Arnav Das; Jeff Bilmes

TL;DR

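This paper introduces deep submodular peripteral networks (DSPNs), a novel parametric family of submodular functions, together with methods for training them from graded pairwise preferences (GPC). The approach connects and tackles two challenges at once: the lack of practical methods for learning submodular functions, and the underexplored problem of learning a scaling from oracles that offer graded pairwise preferences.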

Abstract

Submodular functions, crucial for various applications, often lack practical learning methods for their acquisition. Seemingly unrelated, learning a scaling from oracles offering graded pairwise preferences (GPC) is underexplored, despite a rich history in psychometrics. In this paper, we introduce deep submodular peripteral networks (DSPNs), a novel parametric family of submodular functions, and methods for their training using a GPC-based strategy to connect and then tackle both of the above challenges. We introduce a newly devised GPC-style "peripteral" loss which leverages numerically graded relationships between pairs of objects (sets in our case). Unlike traditional contrastive learning or RLHF preference ranking, our method utilizes graded comparisons, extracting more nuanced information than binary-outcome comparisons alone, and contrasts sets of any size (not just two). We also define a novel suite of automatic sampling strategies for training, including active-learning inspired submodular feedback. We demonstrate DSPNs' efficacy in learning submodularity from a costly target submodular function and demonstrate their superiority in both experimental design and online streaming applications.

Paper Structure (30 sections, 11 theorems, 35 equations, 10 figures, 6 tables)

Introduction and Background
Deep Submodular Peripteral Networks
Peripteral Loss
Augmented Loss for Augmented Data
Final Loss
Sampling (E,M) Pairs
Experiments
Transfer from Target to Learner
Application to Experimental Design
Ablations
Conclusion and Future Work
Acknowledgments
Other Related Work
Relationship to contrastive learning
Learning Non-submodular Set functions
...and 15 more sections

Key Result

Lemma 0

The weighted matroid rank function $\text{rank}_{\mathcal{M}, m}(\cdot)$ for matroid $\mathcal{M} = (V,\mathcal{I})$ with any non-negative vector $m \in \mathbb{R}_+^{|V|}$ is permutation invariant.
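As a small illustration of the lemma, the sketch below evaluates a weighted matroid rank on every ordering of the same elements and checks that the value never changes. It assumes the standard definition (the largest total weight of an independent subset of the input) and, for simplicity, a uniform matroid in which any set of at most k elements is independent. The function name, the weights, and the choice of matroid are hypothetical and are not taken from the paper.

```python
from itertools import permutations


def weighted_uniform_matroid_rank(items, weights, k):
    """Weighted rank of `items` in a uniform matroid of rank k (illustrative).

    Assumption: independent sets are all subsets with at most k elements,
    so the maximum-weight independent subset is the k heaviest elements.
    """
    distinct = set(items)  # the input is treated as a set, not a sequence
    top_k = sorted((weights[v] for v in distinct), reverse=True)[:k]
    return sum(top_k)


# Hypothetical non-negative weight vector m over a ground set V = {a, b, c, d}.
weights = {"a": 3.0, "b": 1.0, "c": 2.0, "d": 0.5}
subset = ["a", "b", "c", "d"]

# Evaluate the rank on every ordering of the same elements.
values = {
    weighted_uniform_matroid_rank(order, weights, k=2)
    for order in permutations(subset)
}
print(values)  # a single value means the result ignores input order
```

The function depends only on which elements are present, so all 24 orderings collapse to one value. This is the property the lemma states for general matroids.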

Figures (10)

Figure 1: The structure of a DSPN and the control flow of how a DSPN is trained; parameters are shared between the DSPNs processing E and M sets.
Figure 2: Passive Sets. We consider a simple 2D ground set with 5 clusters/classes, as indicated by the colors. The various types of passively sampled sets are depicted (discussed in the section labeled sec:what-subsets-train). Type-I homogeneous sets are randomly sampled from a single class, while Type-I heterogeneous sets are sampled from the full ground set. Meanwhile, Type-II restricts the ground set to a subset of classes; it samples "clumps" from each of the sampled classes to construct the homogeneous sets, and diverse sets from each class to create the heterogeneous sets. Intuitively, using Type-I/II allows the learnt DSPN to model intraclass/interclass relationships, respectively.
Figure 3: Set Pairing. We actively sample sets by optimizing the DSPN as it is being learnt or, optionally, the target function. The depicted graph demonstrates how the actively sampled sets are integrated into $\mathcal{D} = \{(E_i, M_i)\}_{i=1}^N$. The red/blue vertices refer to passively/actively sampled sets. An edge between two vertices indicates that sets from the categories denoted by the vertices are used to create $(E,M)$ pairs to train the DSPN. Dashed edges represent non-critical pairs.
Figure 4: Transfer. We compare different loss functions in terms of their effectiveness at training a DSPN.
Figure 7: Peripteral loss of ${\delta}$ for different values and sign of margin ${\Delta}$ and the corresponding gradient.
...and 5 more figures

Theorems & Definitions (22)

Lemma 0: Permutation Invariance of Weighted Matroid Rank
Theorem 1: A DSPN is monotone non-decreasing submodular
Corollary 1: Submodular Preservation of Weighted Matroid Rank Aggregators
Corollary 1: DSPN Family
Definition 2: Matroid
Definition 3: Weighted Matroid Rank Function
Definition 4: Permutation Invariance (cf. Deep Sets)
Lemma 4: Permutation Invariance of Weighted Matroid Rank
proof
Proposition 5
...and 12 more
