Composable Coresets for Constrained Determinant Maximization and Beyond

Sepideh Mahabadi, Thuy-Duong Vuong

TL;DR

This work develops composable coresets for determinant maximization under partition and laminar matroid constraints, plus unconstrained and broader experimental design settings. It introduces peeling coresets for the without-repetition case and extends directional height analysis to the $k\ge d$ regime, achieving size $kd$ with $d^{O(d)}$-approximation for $k> d$ and size $sk$ with $k^{2k}$-approximation for $k\le d$. The results generalize to strongly Rayleigh distributions and to other design objectives via spectral spanners, enabling near-linear-time pipelines and practical speedups for MAP-inference in partition settings. Lower bounds show tightness of the size and approximation trade-offs, while the laminar- and partition-matroid constructions yield scalable, composable summaries applicable to large-scale data summarization and design tasks.

Abstract

We study algorithms for constructing composable coresets for the task of Determinant Maximization under a partition constraint. Given a point set $V\subset \mathbb{R}^d$ that is partitioned into $s$ groups $V_1,\ldots, V_s$, and integers $k_1,\ldots,k_s$ with $k=\sum_i k_i$, the goal is to pick $k_i$ points from group $V_i$ such that the overall determinant of the picked $k$ points is maximized. Determinant Maximization and its constrained variants have attracted considerable interest for modeling diversity and have found applications in data summarization. When the cardinality $k$ of the selected set is greater than the dimension $d$, we show a peeling algorithm that gives a composable coreset of size $kd$ with a provably optimal approximation factor of $d^{O(d)}$. When $k\leq d$, we show a simple coreset construction with optimal size and approximation factor. As a further application of our technique, we get a composable coreset for determinant maximization under the more general laminar matroid constraints, and a composable coreset for unconstrained determinant maximization in a previously unresolved regime. Our results generalize to all strongly Rayleigh distributions and to several other experimental design problems. As an application, we improve the runtime of the practical local-search-based algorithm of [Anari-Vuong--COLT'22] for determinant maximization under a partition constraint from $O(n^{2^s}k^{2^s})$ to $O(n k^{2^s})$, making it linear in the number of points $n$.
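To make the partition-constrained objective concrete, the following is a minimal brute-force sketch, not the paper's coreset or local-search algorithm. It assumes $k \le d$ and takes "the determinant of the picked points" to mean the determinant of their $k \times k$ Gram matrix (the squared volume they span); the function names `det`, `gram_det`, and `best_partition_subset` and the toy data are hypothetical.

```python
from itertools import combinations, product


def det(matrix):
    """Determinant via Gaussian elimination with partial pivoting."""
    a = [list(map(float, row)) for row in matrix]
    n = len(a)
    result = 1.0
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        if abs(a[pivot][col]) < 1e-12:
            return 0.0  # numerically singular
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            result = -result  # a row swap flips the sign
        result *= a[col][col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n):
                a[r][c] -= factor * a[col][c]
    return result


def gram_det(points):
    """Assumed objective: det of the Gram matrix of the picked points."""
    gram = [[sum(x * y for x, y in zip(p, q)) for q in points] for p in points]
    return det(gram)


def best_partition_subset(groups, quotas):
    """Exhaustively pick quotas[i] points from groups[i], maximizing gram_det."""
    per_group = [combinations(g, q) for g, q in zip(groups, quotas)]
    best_value, best_choice = -1.0, None
    for choice in product(*per_group):
        picked = [p for part in choice for p in part]
        value = gram_det(picked)
        if value > best_value:
            best_value, best_choice = value, picked
    return best_value, best_choice


# Toy instance: s = 2 groups in R^3 with quotas k_1 = 1, k_2 = 2 (so k = 3).
groups = [
    [(1.0, 0.0, 0.0), (0.9, 0.1, 0.0)],
    [(0.0, 2.0, 0.0), (0.0, 0.0, 1.0), (1.0, 1.0, 0.0)],
]
quotas = [1, 2]
value, chosen = best_partition_subset(groups, quotas)
print(value, chosen)
```

The exhaustive search enumerates every feasible selection, so it is only usable on tiny inputs; it serves here as a reference for what the constrained objective is.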

Paper Structure

This paper contains 16 sections, 19 theorems, 53 equations, and 1 table.

Key Result

Theorem 2.5

Let $k \leq d$ and $V\subseteq\mathbb{R}^d$. Then any size-$k$ local optimum $U$ with respect to $\det(\cdot)$ inside $V$ approximately preserves the $k$-directional height. That is, for any $(k-1)$-dimensional subspace $H$, $d(U,H)$ approximates $d(V,H)$, where for a point set $P$ we define $d(P,H)=\max_{p\in P} d(p,H)$.
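The quantities in this statement can be explored numerically. The sketch below rests on two assumptions: that $d(p,H)$ is the Euclidean distance from $p$ to $H$, and that a local optimum is a size-$k$ subset whose Gram determinant cannot be increased by a single swap. It finds such a subset $U$ by swap local search and compares $d(U,H)$ with $d(V,H)$ for one random subspace $H$. All function names are hypothetical, and the code asserts no specific approximation factor.

```python
import math
import random


def residual(p, orthonormal_basis):
    """Component of p orthogonal to the span of an orthonormal basis."""
    r = list(p)
    for b in orthonormal_basis:
        coeff = sum(x * y for x, y in zip(r, b))
        r = [x - coeff * y for x, y in zip(r, b)]
    return r


def orthonormalize(vectors):
    """Gram-Schmidt: orthonormal basis of the span of the given vectors."""
    basis = []
    for v in vectors:
        r = residual(v, basis)
        norm = math.sqrt(sum(x * x for x in r))
        if norm > 1e-12:
            basis.append([x / norm for x in r])
    return basis


def gram_det(points):
    """Gram determinant as a product of squared Gram-Schmidt residual norms."""
    basis, value = [], 1.0
    for p in points:
        r = residual(p, basis)
        norm_sq = sum(x * x for x in r)
        if norm_sq < 1e-18:
            return 0.0  # points are (numerically) linearly dependent
        value *= norm_sq
        basis.append([x / math.sqrt(norm_sq) for x in r])
    return value


def height(points, subspace_basis):
    """d(P, H): largest Euclidean distance from a point of P to H."""
    return max(
        math.sqrt(sum(x * x for x in residual(p, subspace_basis))) for p in points
    )


def local_optimum(points, k):
    """Swap local search: stop once no single exchange increases gram_det."""
    current = list(points[:k])
    improved = True
    while improved:
        improved = False
        for i in range(k):
            for q in points:
                if q in current:
                    continue
                candidate = current[:i] + [q] + current[i + 1:]
                if gram_det(candidate) > gram_det(current) * (1 + 1e-9):
                    current, improved = candidate, True
    return current


random.seed(0)
d, k, n = 5, 3, 40
V = [tuple(random.gauss(0, 1) for _ in range(d)) for _ in range(n)]
U = local_optimum(V, k)
# H: a random (k-1)-dimensional subspace, given by an orthonormal basis.
H = orthonormalize([[random.gauss(0, 1) for _ in range(d)] for _ in range(k - 1)])
ratio = height(U, H) / height(V, H)
print(f"d(U,H) / d(V,H) = {ratio:.3f}")
```

Because $U \subseteq V$, the ratio can never exceed 1; how far below 1 it can fall is what the theorem bounds.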

Theorems & Definitions (51)

  • Definition 1.1: $\det_k$
  • Definition 2.1: Local optima
  • Definition 2.2: Partition matroid
  • Definition 2.3: Laminar matroid
  • Definition 2.4: Directional height and $k$-directional height [mahabadi2019composable]
  • Theorem 2.5: Coreset for $k$-directional height [mahabadi2019composable]
  • Definition 2.6: Strongly Rayleigh
  • Lemma 2.7
  • Definition 3.1: Value-preserving set
  • Lemma 3.2
  • ...and 41 more