Determinantal Point Process as an alternative to NMS
Samik Some, Mithun Das Gupta, Vinay P. Namboodiri
TL;DR
The paper presents a determinantal point process (DPP) based replacement for the first NMS step after the Region Proposal Network (RPN) in object detection. It constructs an L-ensemble $\mathbf{L}=\alpha [e^{\mathbf{s}} e^{\mathbf{s}^T}] \odot \mathrm{IoU}$ from the RPN scores $\mathbf{s}$ and the pairwise overlaps $\mathrm{IoU}$, then selects a diverse subset of proposals $Y$ by greedily maximizing the submodular objective $\log\det \mathbf{L}_Y$, where $\mathbf{L}_Y$ is the submatrix of $\mathbf{L}$ indexed by $Y$. Experiments on MS-COCO and PASCAL VOC show that the DPP-based method achieves competitive or improved AP metrics compared to GreedyNMS and other baselines, with notable gains in IoU-aware recall and in crowded scenes. The approach preserves the post-processing nature of NMS, requires no additional training, and can be plugged into existing pipelines with modest changes, offering a practical route to more diverse and accurate detections. The $\log\det$ objective and the positive semidefinite (PSD) property of $\mathbf{L}$ underpin the theoretical guarantees and stable behavior of the selection process.
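To make the selection step concrete, the following is a minimal NumPy sketch of greedy $\log\det$ maximization over the L-ensemble described above. It is an illustration under stated assumptions, not the authors' implementation: the function names (`pairwise_iou`, `build_kernel`, `greedy_logdet`), the `(x1, y1, x2, y2)` box format, the fixed selection budget `k`, and the naive recomputation of each determinant are all choices made here for clarity.

```python
import numpy as np


def pairwise_iou(boxes):
    """IoU matrix for boxes given as rows of (x1, y1, x2, y2).

    Assumes every box has positive area (no zero-division guard).
    """
    x1, y1, x2, y2 = boxes.T
    area = (x2 - x1) * (y2 - y1)
    # Width and height of each pairwise intersection, clipped at zero.
    iw = np.clip(np.minimum(x2[:, None], x2[None, :])
                 - np.maximum(x1[:, None], x1[None, :]), 0.0, None)
    ih = np.clip(np.minimum(y2[:, None], y2[None, :])
                 - np.maximum(y1[:, None], y1[None, :]), 0.0, None)
    inter = iw * ih
    return inter / (area[:, None] + area[None, :] - inter)


def build_kernel(scores, iou, alpha=1.0):
    """L = alpha * (e^s e^s^T) ⊙ IoU, with ⊙ the elementwise product."""
    q = np.exp(scores)
    return alpha * np.outer(q, q) * iou


def greedy_logdet(L, k):
    """Greedily add the index that maximizes log det of the selected submatrix.

    Stops after k picks (an assumed budget) or when no remaining candidate
    keeps the determinant strictly positive.
    """
    selected = []
    remaining = list(range(L.shape[0]))
    while remaining and len(selected) < k:
        best_i, best_val = None, -np.inf
        for i in remaining:
            idx = selected + [i]
            sign, logdet = np.linalg.slogdet(L[np.ix_(idx, idx)])
            if sign > 0 and logdet > best_val:
                best_i, best_val = i, logdet
        if best_i is None:
            break
        selected.append(best_i)
        remaining.remove(best_i)
    return selected


# Toy input: boxes 0 and 1 overlap heavily; box 2 is disjoint from both.
boxes = np.array([[0, 0, 10, 10],
                  [1, 1, 11, 11],
                  [20, 20, 30, 30]], dtype=float)
scores = np.array([2.0, 1.9, 1.8])

L = build_kernel(scores, pairwise_iou(boxes))
selected = greedy_logdet(L, k=2)
print(selected)  # [0, 2]
```

In this toy input the second pick is the disjoint box 2 rather than the higher-scoring box 1, because the large off-diagonal entry between boxes 0 and 1 shrinks the determinant of their joint submatrix. Recomputing `slogdet` for every candidate is wasteful for the thousands of proposals an RPN produces; an incremental Cholesky update would be the usual way to speed this up.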
Abstract
We present a determinantal point process (DPP) inspired alternative to non-maximum suppression (NMS), which has become an integral step in all state-of-the-art object detection frameworks. DPPs have been shown to encourage diversity in subset selection problems. We pose NMS as a subset selection problem and posit that directly incorporating a DPP-like framework can improve the overall performance of the object detection system. We propose an optimization problem that takes the same inputs as NMS but introduces a novel submodularity-based diverse subset selection functional. Our results strongly indicate that the modifications proposed in this paper can provide consistent improvements to state-of-the-art object detection pipelines.
