Quantum contextual bandits and recommender systems for quantum data
Shrigyan Brahmachari, Josep Lumbreras, Marco Tomamichel
TL;DR
This work frames the recommendation of quantum data as a quantum contextual bandit (QCB) problem in which the contexts are Hamiltonians and the actions are unknown quantum states. It adapts the linear contextual bandit algorithm LinUCB to the quantum setting by expressing states and observables in a Pauli-like basis, which enables dimension-reduced, online recommendation of low-energy states. A lower bound shows that no strategy can achieve regret that scales better than $\Omega(\sqrt{kT} \cdot \min(d,\sqrt{c}))$, while the proposed Gram-Schmidt–augmented LinUCB achieves near-optimal performance with space complexity $O(k d_{\mathrm{eff}}^2)$. The authors demonstrate the method on Ising and generalized cluster Hamiltonians and find that the recommendations align with the phases of the Hamiltonians, so the strategy effectively classifies phases online. The framework offers a principled, scalable way to select quantum state preparations for energy-minimization tasks in NISQ-era workflows and provides a foundation for phase-aware recommender systems for quantum data.
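The reason a linear bandit algorithm applies at all is that an expected energy $\mathrm{Tr}(H\rho)$ becomes an inner product once $H$ and $\rho$ are written as vectors of Pauli coefficients. The following is a minimal single-qubit sketch of that identity, not code from the paper; the helper names (`pauli_vector`, `energy_direct`, `energy_linear`) and the example Hamiltonian and state are assumptions chosen for illustration.

```python
import numpy as np

# Single-qubit Pauli basis: identity, X, Y, Z.
PAULIS = [
    np.eye(2, dtype=complex),
    np.array([[0, 1], [1, 0]], dtype=complex),
    np.array([[0, -1j], [1j, 0]], dtype=complex),
    np.array([[1, 0], [0, -1]], dtype=complex),
]


def pauli_vector(op):
    """Coefficients Tr(P_i op) of a Hermitian 2x2 operator in the Pauli basis."""
    return np.array([np.trace(p @ op).real for p in PAULIS])


def energy_direct(h, rho):
    """Expected energy computed as a matrix trace, Tr(H rho)."""
    return np.trace(h @ rho).real


def energy_linear(h, rho):
    """Same quantity as an inner product of Pauli coefficient vectors.

    For one qubit, Tr(H rho) = (1/2) * sum_i Tr(P_i H) * Tr(P_i rho).
    """
    return 0.5 * pauli_vector(h) @ pauli_vector(rho)


# Hypothetical context (Hamiltonian) and action (state |+><+|).
h = 0.7 * PAULIS[3] + 0.3 * PAULIS[1]
plus = np.array([1, 1], dtype=complex) / np.sqrt(2)
rho = np.outer(plus, plus.conj())

print(energy_direct(h, rho), energy_linear(h, rho))
```

Both functions return the same number, so a learner that only sees the coefficient vector of the Hamiltonian can treat the unknown state's coefficient vector as the parameter of a linear reward model.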
Abstract
We study a recommender system for quantum data using the linear contextual bandit framework. In each round, a learner receives an observable (the context) and has to recommend which state to measure from a finite set of unknown quantum states (the actions). The learner's goal is to maximize the reward in each round, which is the outcome of the measurement on the unknown state. Using this model, we formulate the low-energy quantum state recommendation problem, where the context is a Hamiltonian and the goal is to recommend the state with the lowest energy. For this task, we study two families of contexts: the Ising model and a generalized cluster model. We observe that if we interpret the actions as different phases of the models, then the recommendation is made by classifying the correct phase of the given Hamiltonian, and the strategy can be interpreted as an online quantum phase classifier.
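The round-by-round protocol described above can be simulated on a toy instance. The sketch below is an illustration under stated assumptions rather than the authors' implementation: it uses the standard per-action ("disjoint") LinUCB update, not the Gram-Schmidt variant mentioned in the TL;DR, and the single-qubit contexts $H = h_x X + h_z Z$, the four hidden states, and all function names and parameter values are hypothetical. The reward is the negative of the measured energy, so maximizing reward means recommending low-energy states.

```python
import numpy as np

X = np.array([[0, 1], [1, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)


def ket_to_rho(ket):
    """Density matrix of a normalized pure state."""
    ket = np.asarray(ket, dtype=complex)
    ket = ket / np.linalg.norm(ket)
    return np.outer(ket, ket.conj())


def measure_energy(h, rho, rng):
    """Sample one outcome of measuring H on rho according to the Born rule."""
    evals, evecs = np.linalg.eigh(h)
    probs = np.real(np.einsum("ij,jk,ki->i", evecs.conj().T, rho, evecs))
    probs = np.clip(probs, 0.0, None)
    return rng.choice(evals, p=probs / probs.sum())


def run_linucb(states, n_rounds=2000, alpha=1.0, lam=1.0, seed=0):
    """Disjoint LinUCB on contexts H = hx*X + hz*Z (toy assumption).

    The learner never reads `states`; they are used only to simulate
    measurements and to compute the regret for evaluation.
    """
    rng = np.random.default_rng(seed)
    k, d = len(states), 2
    A = np.stack([lam * np.eye(d) for _ in range(k)])  # ridge Gram matrices
    b = np.zeros((k, d))                               # reward-weighted contexts
    regret = 0.0
    for _ in range(n_rounds):
        hx, hz = rng.uniform(-1.0, 1.0, size=2)
        h = hx * X + hz * Z
        x = np.array([hx, hz])                         # context feature vector
        A_inv = np.linalg.inv(A)                       # batched over actions
        theta = np.einsum("kij,kj->ki", A_inv, b)      # per-action estimates
        bonus = alpha * np.sqrt(np.einsum("i,kij,j->k", x, A_inv, x))
        a = int(np.argmax(theta @ x + bonus))          # optimistic choice
        reward = -measure_energy(h, states[a], rng)    # low energy = high reward
        A[a] += np.outer(x, x)
        b[a] += reward * x
        energies = [np.trace(h @ rho).real for rho in states]
        regret += energies[a] - min(energies)
    theta = np.einsum("kij,kj->ki", np.linalg.inv(A), b)
    return regret, theta


# Hidden actions: |0>, |1>, |+>, |-> (unknown to the learner).
STATES = [ket_to_rho(v) for v in ([1, 0], [0, 1], [1, 1], [1, -1])]

if __name__ == "__main__":
    total_regret, theta_hat = run_linucb(STATES)
    print("cumulative regret:", total_regret)
    print("estimated parameters:\n", theta_hat)
```

In this toy setup each state is optimal in one region of the $(h_x, h_z)$ plane, so a learner that recommends well is implicitly deciding which region the given Hamiltonian belongs to, a small-scale analogue of the phase-classification interpretation in the abstract.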
