Hardness of Maximum Likelihood Learning of DPPs

Elena Grigorescu; Brendan Juba; Karl Wimmer; Ning Xie

TL;DR

This work proves Kulesza's conjecture and establishes the following stronger hardness of approximation result: even computing a $\left(1-O(\frac{1}{\log^9{N}})\right)$-approximation to the maximum log-likelihood of a DPP on a ground set of $N$ elements is NP-complete.

Abstract

Determinantal Point Processes (DPPs) are a widely used probabilistic model for negatively correlated sets. DPPs have been successfully employed in Machine Learning applications to select a diverse, yet representative subset of data. In these applications, a set of parameters that maximize the likelihood of the data is typically desirable. The algorithms used for this task to date either optimize over a limited family of DPPs, or use local improvement heuristics that do not provide theoretical guarantees of optimality. In his seminal work on DPPs in Machine Learning, Kulesza (2011) conjectured that the problem is NP-complete. The lack of a formal proof prompted Brunel et al. (COLT 2017) to suggest that, in opposition to Kulesza's conjecture, there might exist a polynomial-time algorithm for computing a maximum-likelihood DPP. They also presented some preliminary evidence supporting a conjecture that they suggested might lead to such an algorithm. In this work we prove Kulesza's conjecture. In fact, we prove the following stronger hardness of approximation result: even computing a $\left(1-O(\frac{1}{\log^9{N}})\right)$-approximation to the maximum log-likelihood of a DPP on a ground set of $N$ elements is NP-complete. From a technical perspective, we reduce the problem of approximating the maximum log-likelihood of a DPP to solving a gap instance of a $3$-Coloring problem on a hypergraph. This hypergraph is based on the bounded-degree construction of Bogdanov et al. (FOCS 2002), which we enhance using the strong expanders of Alon and Capalbo (FOCS 2007). We demonstrate that if a rank-$3$ DPP achieves near-optimal log-likelihood, its marginal kernel must encode an almost perfect "vector-coloring" of the hypergraph. Finally, we show that these continuous vectors can be decoded into a proper $3$-coloring after removing a small fraction of "noisy" edges.

Paper Structure (51 sections, 25 theorems, 107 equations, 7 figures)

Introduction
Maximum likelihood estimation.
Our results
Our approach and techniques
Algorithmic results.
Related work
Learning DPPs.
Vector coloring problems.
Matrix completion problem.
Organization of the paper
Maximum likelihood learning of DPP and our main hardness result
Preliminaries
Matrix analysis.
Discrete determinantal point processes.
Maximum Likelihood Learning of DPPs
...and 36 more sections

Key Result

Theorem 1

There is a ground set of size $N$ such that it is NP-hard to $\left(1-O(\frac{1}{\log^9{N}})\right)$-approximate the maximum DPP log-likelihood value of a sample set.
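
To make the quantity in Theorem 1 concrete, the following is a minimal sketch that evaluates the log-likelihood of a collection of sample sets under a DPP specified by its marginal kernel $K$, using the standard identity $\Pr[Y = S] = |\det(K - I_{\bar{S}})|$, where $I_{\bar{S}}$ is the diagonal indicator matrix of the complement of $S$. The function names, the choice to average over samples, and the pure-Python elimination routine are assumptions made for illustration and are not taken from the paper; with NumPy available, `numpy.linalg.slogdet` would be the more usual way to obtain the log-determinant.

```python
import math


def log_abs_det(matrix):
    """Return log|det(matrix)| using Gaussian elimination with partial pivoting."""
    a = [[float(x) for x in row] for row in matrix]
    n = len(a)
    total = 0.0
    for col in range(n):
        # Pick the row with the largest entry in this column as the pivot.
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        if a[pivot][col] == 0.0:
            return -math.inf  # singular matrix: the set has probability zero
        a[col], a[pivot] = a[pivot], a[col]  # a swap only flips the sign of det
        total += math.log(abs(a[col][col]))
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n):
                a[r][c] -= factor * a[col][c]
    return total


def dpp_log_prob(K, subset):
    """log Pr[Y = subset] = log|det(K - I_complement)| for marginal kernel K."""
    n = len(K)
    shifted = [
        [K[i][j] - (1.0 if i == j and i not in subset else 0.0) for j in range(n)]
        for i in range(n)
    ]
    return log_abs_det(shifted)


def avg_log_likelihood(K, samples):
    """Average log-likelihood of the sample sets (averaging is an assumption here)."""
    return sum(dpp_log_prob(K, s) for s in samples) / len(samples)


if __name__ == "__main__":
    # Hypothetical 2-element kernel: symmetric with eigenvalues in [0, 1].
    K = [[0.5, 0.2], [0.2, 0.3]]
    samples = [{0}, {1}, {0}, set()]
    print(avg_log_likelihood(K, samples))
```

Maximum likelihood learning asks for the kernel $K$ (symmetric, with eigenvalues in $[0,1]$) that maximizes this value for the given samples; the theorem says that even approximating the optimum closely is NP-hard.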

Figures (7)

Figure 1: High level overview of our reductions.
Figure 2: The overall argumentative structure of the proof of the soundness theorem ($G'_{\phi}$ is an intermediate graph between $G_{\phi}$ and $G''_{\phi}$, and $G''_{\phi}$ is $\epsilon'$-close to $G_{\phi}$).
Figure 3: Literal blocks. This gadget enforces that the $j^{\text{th}}$ copy of literal $x_i$ and $\bar{x}_i$ will always be assigned opposite truth values, as long as the assignments of True, False and Dummy nodes in $x_i^{(j)}$-block and $\bar{x}_i^{(j)}$-block are consistent with their corresponding nodes in the rest of the graph.
Figure 4: Two basic gadgets of BOT graphs.
Figure 5: The expander part of BOT graph.
...and 2 more figures

Theorems & Definitions (62)

Theorem 1: Informal Statement of the Main Theorem
Remark 1
Theorem 2: Informal Statement of the Approximation Algorithm
Theorem 3: Main
Lemma 1: Has00
Lemma 2: BOT02
Definition 1: Very strong expanders AC07
Theorem 4: AC07
Theorem 5
Remark 2
...and 52 more
