Trading Vector Data in Vector Databases
Jin Cheng, Xiangxiang Dai, Ningning Ding, John C. S. Lui, Jianwei Huang
TL;DR
This work studies vector data trading in vector databases as an online learning problem in which a seller must jointly learn retrieval configurations and prices under uncertain retrieval costs and stochastic buyer feedback. It introduces a two-stage hierarchical bandit framework (VTHB): Stage I (CCB) learns retrieval configurations adaptively, and Stage II (CPB with LAB) learns the pricing strategy. The framework comes with regret guarantees, logarithmic for configuration learning and sublinear for pricing, and runs in polynomial time. Experiments on four real-world datasets show consistently higher cumulative reward and lower regret than the baselines. The framework generalizes to various indexing methods and supports cross-domain data trading, which makes it a practical basis for scalable, cost-aware vector data marketplaces.
Abstract
Vector data trading is essential for cross-domain learning with vector databases, yet it remains largely unexplored. We study this problem under online learning, where sellers face uncertain retrieval costs and buyers provide stochastic feedback to posted prices. Three main challenges arise: (1) heterogeneous and partial feedback in configuration learning, (2) variable and complex feedback in pricing learning, and (3) inherent coupling between configuration and pricing decisions. We propose a hierarchical bandit framework that jointly optimizes retrieval configurations and pricing. Stage I employs contextual clustering with confidence-based exploration to learn effective configurations with logarithmic regret. Stage II adopts interval-based price selection with local Taylor approximation to estimate buyer responses and achieve sublinear regret. We establish theoretical guarantees with polynomial time complexity and validate the framework on four real-world datasets, demonstrating consistent improvements in cumulative reward and regret reduction compared with existing methods.
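To make the two-stage structure concrete, the following is a minimal sketch of a hierarchical bandit loop, not the authors' algorithm. It swaps in plain UCB1 for both stages: it omits the contextual clustering of Stage I and the local Taylor approximation of Stage II, and only illustrates how a configuration learner and a per-configuration price learner can be coupled through a shared reward. The configuration list, the price grid, the buyer model, and all names (`UCB1`, `simulate`) are assumptions made for illustration.

```python
import math
import random


class UCB1:
    """Plain UCB1 over a finite set of arms (rewards assumed to lie in [0, 1])."""

    def __init__(self, n_arms):
        self.counts = [0] * n_arms
        self.means = [0.0] * n_arms
        self.t = 0

    def select(self):
        self.t += 1
        # Play every arm once before relying on confidence bounds.
        for arm, n in enumerate(self.counts):
            if n == 0:
                return arm
        return max(
            range(len(self.counts)),
            key=lambda a: self.means[a]
            + math.sqrt(2.0 * math.log(self.t) / self.counts[a]),
        )

    def update(self, arm, reward):
        # Incremental update of the empirical mean reward of the played arm.
        self.counts[arm] += 1
        self.means[arm] += (reward - self.means[arm]) / self.counts[arm]


def simulate(rounds=5000, seed=0):
    rng = random.Random(seed)
    # Hypothetical retrieval configurations: (result quality, retrieval cost).
    configs = [(0.60, 0.02), (0.80, 0.05), (0.90, 0.15)]
    # Hypothetical price grid: midpoints of ten equal-width intervals on [0, 1].
    prices = [(i + 0.5) / 10 for i in range(10)]

    stage1 = UCB1(len(configs))                    # Stage I: choose a configuration
    stage2 = [UCB1(len(prices)) for _ in configs]  # Stage II: one price learner per configuration
    total_profit = 0.0
    for _ in range(rounds):
        c = stage1.select()
        p = stage2[c].select()
        quality, cost = configs[c]
        # Toy buyer (an assumption): valuation is uniform on [0, quality],
        # and the buyer accepts when the posted price does not exceed it.
        valuation = quality * rng.random()
        accepted = prices[p] <= valuation
        profit = (prices[p] if accepted else 0.0) - cost
        reward = (profit + 1.0) / 2.0  # rescale profit from [-1, 1] into [0, 1]
        stage1.update(c, reward)
        stage2[c].update(p, reward)
        total_profit += profit
    return total_profit, stage1, stage2


if __name__ == "__main__":
    total, s1, _ = simulate()
    print("total profit:", round(total, 2), "config pulls:", s1.counts)
```

Both learners are updated with the same rescaled profit, which is one simple way to reflect the coupling between configuration and pricing decisions that the abstract describes. A discretized price grid is only a stand-in for the interval-based price selection of Stage II.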
