A Family of Distributions of Random Subsets for Controlling Positive and Negative Dependence
Takahiro Kawashima, Hideitsu Hino
TL;DR
The paper introduces discrete kernel point processes (DKPPs), a flexible family of distributions over random subsets of a ground set $\mathcal{Y}$, that is, over the power set $2^{\mathcal{Y}}$. A DKPP is parameterized by a kernel matrix $\bm{L}$ and a scalar function $\phi$, which together control positive and negative dependence. DKPPs include DPPs (with $\phi=\log$) and Boltzmann machines (with quadratic $\phi$), and a Box–Cox-style transformation $\phi_{\beta,\lambda}$ gives a smooth transition between repulsive and attractive interactions. The paper also develops practical inference and learning tools: mean-field approximation with Rao–Blackwellization for the normalizing constant, ratio matching for kernel learning, sampling via MCMC and Langevin methods, and efficient computation of marginal and conditional probabilities. Experiments on datasets including MNIST and Amazon Baby Registry demonstrate controllable dependence, effective subset acquisition, and favorable learning behavior, supporting the method’s applicability and scalability. Overall, DKPPs bridge well-known discrete point processes and probabilistic models while keeping computation tractable for real-world tasks that require diverse subset selection.
Abstract
Positive and negative dependence are fundamental concepts that characterize the attractive and repulsive behavior of random subsets. Although some probabilistic models are known to exhibit positive or negative dependence, it is challenging to bridge them seamlessly with a practical probabilistic model. In this study, we introduce a new family of distributions, named the discrete kernel point process (DKPP), which includes determinantal point processes and a subclass of Boltzmann machines. We also develop computational methods for probabilistic operations and inference with DKPPs, such as calculating marginal and conditional probabilities and learning the parameters. Our numerical experiments demonstrate the controllability of positive and negative dependence and the effectiveness of the computational methods for DKPPs.
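
As a concrete check of the DPP special case, the following sketch enumerates all subsets of a small ground set. It assumes that a DKPP assigns each subset $A$ an unnormalized mass $\exp(\mathrm{tr}\,\phi(\bm{L}_A))$, with $\phi$ applied to the eigenvalues of the principal submatrix $\bm{L}_A$. This functional form is not stated in the text above and is an assumption of the sketch, as are all function names, the random kernel, and the choice $\phi(x)=x^2$ for the quadratic case. Under that assumption, $\phi=\log$ gives $\exp(\mathrm{tr}\log\bm{L}_A)=\det\bm{L}_A$, the L-ensemble DPP, whose normalizing constant is $\det(\bm{L}+\bm{I})$.

```python
import itertools
import numpy as np


def matrix_function(sym, phi):
    """Apply a scalar function phi to a symmetric matrix via its eigenvalues."""
    eigvals, eigvecs = np.linalg.eigh(sym)
    return (eigvecs * phi(eigvals)) @ eigvecs.T


def unnormalized_mass(L, subset, phi):
    """ASSUMED form exp(tr(phi(L_A))); the empty set is given mass 1."""
    idx = list(subset)
    if not idx:
        return 1.0
    L_A = L[np.ix_(idx, idx)]
    return float(np.exp(np.trace(matrix_function(L_A, phi))))


def subset_distribution(L, phi):
    """Normalize by brute force over all 2^n subsets (tiny n only)."""
    n = L.shape[0]
    subsets = [s for r in range(n + 1)
               for s in itertools.combinations(range(n), r)]
    masses = np.array([unnormalized_mass(L, s, phi) for s in subsets])
    return subsets, masses / masses.sum()


# Hypothetical positive-definite kernel on a ground set of 4 items.
rng = np.random.default_rng(0)
B = rng.normal(size=(4, 4))
L = B @ B.T + 0.5 * np.eye(4)

# phi = log: the assumed DKPP mass reduces to det(L_A), i.e. an L-ensemble DPP.
subsets, probs = subset_distribution(L, np.log)
dets = np.array([np.linalg.det(L[np.ix_(s, s)]) if s else 1.0 for s in subsets])
dpp_probs = dets / np.linalg.det(L + np.eye(4))
print(np.allclose(probs, dpp_probs))

# Illustrative quadratic phi(x) = x**2 (an assumed choice): tr(L_A^2) equals the
# sum of squared entries of L_A, a pairwise energy over the selected items.
A = (0, 2, 3)
L_A = L[np.ix_(A, A)]
quad_energy = np.trace(matrix_function(L_A, np.square))
pairwise_energy = np.sum(L_A ** 2)
print(np.isclose(quad_energy, pairwise_energy))
```

Brute-force enumeration costs $2^{|\mathcal{Y}|}$ evaluations, so this is only a sanity check for tiny ground sets. It is not one of the approximate inference methods that the paper develops.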
