Sequential learning on a Tensor Network Born machine with Trainable Token Embedding
Wanda Hou, Miao Li, Yi-Zhuang You
TL;DR
The paper addresses modeling discrete sequential data with quantum-inspired Born machines built on matrix product states (MPS). It introduces trainable POVM embeddings via QR-based factorization to replace fixed one-hot token indices, expanding the effective operator space and enabling higher physical dimensions. Using an isometric MPS backbone, the model yields tractable log-likelihood and allows autoregressive sampling in arbitrary orders, with the probability expressed as $p(x) = \langle \Psi_\theta | \bigotimes_i M_\gamma(x_i) | \Psi_\theta \rangle$. Empirical results on RNA sequences show that larger physical dimensions improve NLL, single-site probabilities, and local correlations, with the POVM-based model outperforming one-hot baselines and achieving competitive performance against GPT-2 on marginal statistics. The work highlights potential quantum hardware pathways, including mapping the MPS to quantum circuits for QCBM-style sampling, and outlines future directions to handle variable-length data and continuous extensions.
Abstract
Generative models aim to learn the probability distributions underlying data, enabling the generation of new, realistic samples. Quantum-inspired generative models, such as Born machines based on the matrix product state (MPS) framework, have demonstrated remarkable capabilities in unsupervised learning tasks. This study advances the Born machine paradigm by introducing trainable token embeddings through positive operator-valued measurements (POVMs), replacing the traditional approach of static tensor indices. Key technical innovations include encoding tokens as quantum measurement operators with trainable parameters and leveraging QR decomposition to adjust the physical dimensions of the MPS. This approach maximizes the utilization of operator space and enhances the model's expressiveness. Empirical results on RNA data demonstrate that the proposed method significantly reduces negative log-likelihood compared to one-hot embeddings, with higher physical dimensions further enhancing single-site probabilities and multi-site correlations. The model also outperforms GPT-2 in single-site estimation and achieves competitive correlation modeling, showcasing the potential of trainable POVM embeddings for complex data correlations in quantum-inspired sequence modeling.
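To make the idea of encoding tokens as measurement operators concrete, the following is a minimal NumPy sketch, not the authors' implementation. It assumes one plausible QR-based construction: a trainable real matrix of shape (V·d, d) is orthonormalized with QR, its V blocks A_x of shape (d, d) are treated as Kraus-like factors, and the operators M_x = A_xᵀ A_x then sum to the identity. The function names (`povm_from_qr`, `born_probability`), the toy sizes, and the random, non-isometric MPS are all hypothetical choices for illustration; the probability is evaluated as $p(x) = \langle \Psi | \bigotimes_i M(x_i) | \Psi \rangle$ by contracting site by site.

```python
import itertools
import numpy as np


def povm_from_qr(weights, vocab_size, phys_dim):
    """Build a POVM {M_x} from a (vocab_size * phys_dim, phys_dim) real matrix.

    Assumed construction (not necessarily the paper's): QR gives Q with
    orthonormal columns; splitting Q into vocab_size blocks A_x yields
    M_x = A_x^T A_x, which are positive semidefinite and sum to identity.
    """
    q, _ = np.linalg.qr(weights)  # q has shape (V * d, d), with Q^T Q = I
    blocks = q.reshape(vocab_size, phys_dim, phys_dim)  # blocks[x] = A_x
    return np.einsum("xki,xkj->xij", blocks, blocks)  # M_x = A_x^T A_x


def norm_squared(mps):
    """Return <Psi|Psi> for a real open-boundary MPS with tensors (Dl, d, Dr)."""
    env = np.ones((1, 1))
    for tensor in mps:
        env = np.einsum("ab,aic,bid->cd", env, tensor, tensor)
    return env[0, 0]


def born_probability(mps, povm, tokens):
    """Return <Psi| M(x_1) (x) ... (x) M(x_N) |Psi> for a real MPS."""
    env = np.ones((1, 1))
    for tensor, tok in zip(mps, tokens):
        # Contract bra tensor, the token's POVM element, and ket tensor.
        env = np.einsum("ab,aic,ij,bjd->cd", env, tensor, povm[tok], tensor)
    return env[0, 0]


# Toy sizes (hypothetical): 4 tokens as for RNA bases, physical dimension 3.
vocab_size, phys_dim, bond_dim, length = 4, 3, 2, 4
rng = np.random.default_rng(0)

povm = povm_from_qr(
    rng.normal(size=(vocab_size * phys_dim, phys_dim)), vocab_size, phys_dim
)

# Random open-boundary MPS, normalized so that <Psi|Psi> = 1.
bond_dims = [1] + [bond_dim] * (length - 1) + [1]
mps = [
    rng.normal(size=(bond_dims[i], phys_dim, bond_dims[i + 1]))
    for i in range(length)
]
mps[0] = mps[0] / np.sqrt(norm_squared(mps))

# Enumerate every sequence; the probabilities should sum to one.
probs = {
    seq: born_probability(mps, povm, seq)
    for seq in itertools.product(range(vocab_size), repeat=length)
}
print("sum of probabilities:", sum(probs.values()))
```

Because the POVM elements sum to the identity on every site, summing the probabilities over all sequences reduces to the norm of the state, which is why normalizing the MPS is enough to obtain a valid distribution. Note that the physical dimension (3 here) is decoupled from the vocabulary size (4), which a one-hot index would not allow.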
