
Hardness of Maximum Likelihood Learning of DPPs

Elena Grigorescu, Brendan Juba, Karl Wimmer, Ning Xie

TL;DR

This work proves Kulesza's conjecture and establishes a stronger hardness-of-approximation result: even computing a $\left(1-O(\frac{1}{\log^9{N}})\right)$-approximation to the maximum log-likelihood of a DPP on a ground set of $N$ elements is NP-complete.

Abstract

Determinantal Point Processes (DPPs) are a widely used probabilistic model for negatively correlated sets. DPPs have been successfully employed in Machine Learning applications to select a diverse, yet representative subset of data. In these applications, a set of parameters that maximize the likelihood of the data is typically desirable. The algorithms used for this task to date either optimize over a limited family of DPPs, or use local improvement heuristics that do not provide theoretical guarantees of optimality. In his seminal work on DPPs in Machine Learning, Kulesza (2011) conjectured that the problem is NP-complete. The lack of a formal proof prompted Brunel et al. (COLT 2017) to suggest that, in opposition to Kulesza's conjecture, there might exist a polynomial-time algorithm for computing a maximum-likelihood DPP. They also presented some preliminary evidence supporting a conjecture that they suggested might lead to such an algorithm. In this work we prove Kulesza's conjecture. In fact, we prove the following stronger hardness of approximation result: even computing a $\left(1-O(\frac{1}{\log^9{N}})\right)$-approximation to the maximum log-likelihood of a DPP on a ground set of $N$ elements is NP-complete. From a technical perspective, we reduce the problem of approximating the maximum log-likelihood of a DPP to solving a gap instance of a $3$-Coloring problem on a hypergraph. This hypergraph is based on the bounded-degree construction of Bogdanov et al. (FOCS 2002), which we enhance using the strong expanders of Alon and Capalbo (FOCS 2007). We demonstrate that if a rank-$3$ DPP achieves near-optimal log-likelihood, its marginal kernel must encode an almost perfect "vector-coloring" of the hypergraph. Finally, we show that these continuous vectors can be decoded into a proper $3$-coloring after removing a small fraction of "noisy" edges.
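
To make the objective in the abstract concrete, the following is a minimal sketch of how the log-likelihood of a sample of subsets can be evaluated for a fixed DPP. It assumes the standard L-ensemble parameterization, in which $\Pr[Y = S] = \det(L_S)/\det(L + I)$ and the marginal kernel is $K = L(L+I)^{-1}$; the paper itself works with the marginal kernel and may normalize the objective differently. The function names `dpp_log_likelihood` and `marginal_kernel` are hypothetical and chosen for illustration only.

```python
import numpy as np


def dpp_log_likelihood(L, samples):
    """Average log-likelihood of observed subsets under an L-ensemble DPP.

    Assumes P(Y = S) = det(L_S) / det(L + I), where L is a symmetric
    positive semidefinite N x N matrix and each sample is a collection
    of indices into the ground set {0, ..., N-1}.
    """
    L = np.asarray(L, dtype=float)
    n = L.shape[0]
    # Log of the normalizing constant det(L + I).
    _, log_norm = np.linalg.slogdet(L + np.eye(n))
    total = 0.0
    for S in samples:
        idx = list(S)
        if idx:
            sign, logdet = np.linalg.slogdet(L[np.ix_(idx, idx)])
            if sign <= 0:
                # The subset has probability zero under this kernel.
                return float("-inf")
        else:
            logdet = 0.0  # The determinant of the empty matrix is 1.
        total += logdet - log_norm
    return total / len(samples)


def marginal_kernel(L):
    """Marginal kernel K = L (L + I)^{-1} of an L-ensemble."""
    L = np.asarray(L, dtype=float)
    return L @ np.linalg.inv(L + np.eye(L.shape[0]))
```

For example, with `L = np.eye(2)` every subset of a two-element ground set has probability 1/4, so any sample has average log-likelihood `log(1/4)`. Evaluating the likelihood of one given kernel in this way is straightforward; the hardness result concerns the search for the kernel that maximizes it.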

Paper Structure (51 sections, 25 theorems, 107 equations, 7 figures)

Key Result

Theorem 1

There is a ground set of size $N$ such that it is NP-hard to $\left(1-O(\frac{1}{\log^9{N}})\right)$-approximate the maximum DPP log-likelihood value of a sample set.

Figures (7)

  • Figure 1: High-level overview of our reductions.
  • Figure 2: The overall argumentative structure of the proof of the soundness theorem ($G'_{\phi}$ is an intermediate graph between $G_{\phi}$ and $G''_{\phi}$, and $G''_{\phi}$ is $\epsilon'$-close to $G_{\phi}$).
  • Figure 3: Literal blocks. This gadget enforces that the $j^{\text{th}}$ copies of the literals $x_i$ and $\bar{x}_i$ are always assigned opposite truth values, as long as the assignments of the True, False and Dummy nodes in the $x_i^{(j)}$-block and the $\bar{x}_i^{(j)}$-block are consistent with their corresponding nodes in the rest of the graph.
  • Figure 4: Two basic gadgets of BOT graphs.
  • Figure 5: The expander part of BOT graph.
  • ...and 2 more figures

Theorems & Definitions (62)

  • Theorem 1: Informal Statement of the Main Theorem
  • Remark 1
  • Theorem 2: Informal Statement of the Approximation Algorithm
  • Theorem 3: Main
  • Lemma 1: Has00
  • Lemma 2: BOT02
  • Definition 1: Very strong expanders (AC07)
  • Theorem 4: AC07
  • Theorem 5
  • Remark 2
  • ...and 52 more