Tensor-Train Point Cloud Compression and Efficient Approximate Nearest-Neighbor Search
Georgii Novikov, Alexander Gneushev, Alexey Kadeishvili, Ivan Oseledets
TL;DR
This work addresses efficient storage and fast retrieval in large vector databases by compressing point clouds with tensor-train (TT) low-rank decompositions. It introduces a probabilistic compression framework that trains the TT cores with a Sliced Wasserstein loss and a Nearest-Neighbor Distance loss, yielding representations that are invariant to point ordering. A key insight is that the TT representation has an inherent hierarchical structure, which enables a beam-search-like approximate nearest-neighbor (ANN) method and makes TT suitable for out-of-distribution (OOD) detection via distance-based scores. Experiments on MVTec AD and a Deep1B subset show memory savings and improved pixel-level metrics (with competitive image-level metrics) relative to coreset-based approaches, and a proof-of-concept TT-based ANN index outperforms a baseline method in recall across several ranks.
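To make the order-invariance of the Sliced Wasserstein loss concrete, the following is a minimal NumPy sketch of a Monte Carlo estimate between two equally sized point clouds. It is an illustration under stated assumptions, not the authors' training code: the function name `sliced_wasserstein`, the number of projections, the squared 2-Wasserstein form, and the equal-size restriction are all choices made here for the example.

```python
import numpy as np

def sliced_wasserstein(x, y, n_projections=64, seed=0):
    """Estimate the squared sliced 2-Wasserstein distance between two
    point clouds x and y, both of shape (N, D) with the same N.
    (Hypothetical helper for illustration; not from the paper.)"""
    rng = np.random.default_rng(seed)
    d = x.shape[1]
    # Draw random directions and normalize them to unit length.
    dirs = rng.normal(size=(n_projections, d))
    dirs /= np.linalg.norm(dirs, axis=1, keepdims=True)
    # Project both clouds onto every direction and sort each projection.
    # In 1-D, optimal transport between equal-size samples pairs sorted values.
    px = np.sort(x @ dirs.T, axis=0)
    py = np.sort(y @ dirs.T, axis=0)
    # Average the squared differences over points and directions.
    return float(np.mean((px - py) ** 2))
```

Because each projection is sorted before comparison, shuffling the rows of either cloud leaves the value unchanged (up to floating-point rounding), which is the property that lets such a loss compare a generated cloud with a dataset without fixing a point-to-point correspondence.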
Abstract
Nearest-neighbor search in large vector databases is crucial for various machine learning applications. This paper introduces a novel method that uses tensor-train (TT) low-rank tensor decomposition to represent point clouds efficiently and enable fast approximate nearest-neighbor search. We propose a probabilistic interpretation and use density estimation losses such as Sliced Wasserstein to train TT decompositions, resulting in robust point cloud compression. We reveal an inherent hierarchical structure within TT point clouds that facilitates efficient approximate nearest-neighbor search. We provide detailed insights into the methodology and comprehensive comparisons with existing methods, and we demonstrate the method's effectiveness in scenarios including out-of-distribution (OOD) detection and approximate nearest-neighbor (ANN) search.
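As a rough illustration of how a TT decomposition can store a point cloud compactly, the sketch below indexes N = n_1 · ... · n_d points by a multi-index (i_1, ..., i_d) and reconstructs each point by multiplying one slice per core. The layout is an assumption made for this example (cores of shape (r_{k-1}, n_k, r_k) plus a final matrix `head` mapping the last rank to coordinates), and the names `random_tt_cloud`, `tt_point`, and `tt_full` are hypothetical; the paper's exact parameterization may differ.

```python
import numpy as np

def random_tt_cloud(mode_sizes, dim, rank, seed=0):
    """Create random TT cores for a cloud of prod(mode_sizes) points in R^dim.
    (Assumed layout for illustration; cores here are untrained.)"""
    rng = np.random.default_rng(seed)
    ranks = [1] + [rank] * len(mode_sizes)
    cores = [rng.normal(size=(ranks[k], n, ranks[k + 1]))
             for k, n in enumerate(mode_sizes)]
    head = rng.normal(size=(rank, dim))  # maps the last rank to coordinates
    return cores, head

def tt_point(cores, head, index):
    """Reconstruct the single point addressed by a multi-index."""
    v = np.ones(1)
    for core, i in zip(cores, index):
        v = v @ core[:, i, :]  # pick slice i of this core and contract
    return v @ head

def tt_full(cores, head):
    """Materialize all points as an (N, dim) array in C (row-major) order."""
    left = np.ones((1, 1))  # (points enumerated so far, current rank)
    for core in cores:
        # Extend every partial product by each index of the next mode.
        left = np.einsum("pr,rns->pns", left, core).reshape(-1, core.shape[2])
    return left @ head

# Usage: 8 * 8 * 8 = 512 points in 16 dimensions stored with rank-4 cores.
cores, head = random_tt_cloud((8, 8, 8), dim=16, rank=4)
n_params = sum(c.size for c in cores) + head.size
cloud = tt_full(cores, head)
print(n_params, cloud.size)  # stored parameters vs. entries of the full cloud
```

In this layout, all points that share a prefix (i_1, ..., i_k) also share the same partial product of the first k cores. That shared prefix is one plausible way to read the hierarchical structure mentioned above: a search can score prefixes level by level and keep only the best few candidates, in the manner of a beam search.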
