
Multi-label Learning with Random Circular Vectors

Ken Nishida, Kojiro Machi, Kazuma Onishi, Katsuhiko Hayashi, Hidetaka Kamigaito

TL;DR

This work tackles extreme multi-label classification (XMC) by replacing high-dimensional real-valued HRR outputs with low-dimensional circular vectors, enabling compact label encoding in neural networks. The proposed CHRR framework uses complex-valued, angle-based vector elements that avoid projection normalization, improving label encoding capacity and retrieval while reducing output-layer size by up to 99%. The authors demonstrate, through theoretical retrieval and variance analyses and empirical results on four XMC datasets (with and without XLNet features), that CHRR outperforms real-valued HRR baselines and competes with or surpasses several traditional XMC methods, particularly for larger label sets. The work suggests a practical, scalable path to integrating circular vector representations into diverse neural architectures, with future extensions to LSTM/Transformer models and broader vector-symbolic systems.

Abstract

The extreme multi-label classification (XMC) task involves learning a classifier that can predict, from a large label set, the most relevant subset of labels for a data instance. While deep neural networks (DNNs) have demonstrated remarkable success in XMC problems, the task is still challenging because it must deal with a large number of output labels, which makes DNN training computationally expensive. This paper addresses the issue by exploring the use of random circular vectors, where each vector component is represented as a complex amplitude. In our framework, we can develop an output layer and loss function of DNNs for XMC by representing the final output layer as a fully connected layer that directly predicts a low-dimensional circular vector encoding a set of labels for a data instance. We conducted experiments on synthetic datasets to verify that circular vectors have better label encoding capacity and retrieval ability than normal real-valued vectors. Then, we conducted experiments on actual XMC datasets and found that these appealing properties of circular vectors contribute to significant improvements in task performance compared with a previous model using random real-valued vectors, while reducing the size of the output layers by up to 99%.
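
To make the idea of encoding a label set in one low-dimensional circular vector concrete, here is a minimal NumPy sketch. It follows the textbook circular-HRR operations (binding as angle addition, bundling as the angle of a sum of unit complex numbers, similarity as the mean cosine of angle differences), which we assume for illustration; the paper's exact encoding, output layer, and loss may differ. The dimension `d`, the label count, the `key` vector, and all function names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 512            # circular-vector dimension (hypothetical choice)
num_labels = 1000  # size of the label set (hypothetical choice)

def wrap(angles):
    """Map angles back into (-pi, pi]."""
    return np.angle(np.exp(1j * angles))

def bind(a, b):
    """Binding: element-wise angle addition (product of unit complex numbers)."""
    return wrap(a + b)

def unbind(s, a):
    """Unbinding: element-wise angle subtraction, the exact inverse of bind."""
    return wrap(s - a)

def bundle(angle_rows):
    """Bundling: sum the unit complex numbers and keep only the resulting angle."""
    return np.angle(np.exp(1j * angle_rows).sum(axis=0))

def similarity(a, b):
    """Mean cosine of the element-wise angle differences (1.0 for identical vectors)."""
    return float(np.mean(np.cos(a - b)))

# One random circular vector (d angles) per label, plus one random key vector.
label_angles = rng.uniform(-np.pi, np.pi, size=(num_labels, d))
key = rng.uniform(-np.pi, np.pi, size=d)

# Encode a set of labels as a single d-dimensional circular vector.
true_labels = [3, 42, 77, 500, 901]
encoded = bundle(bind(label_angles[true_labels], key))

# Retrieve: unbind the key, then rank every label by similarity.
decoded = unbind(encoded, key)
scores = np.array([similarity(decoded, v) for v in label_angles])
top5 = np.argsort(scores)[::-1][:5]
print(sorted(top5.tolist()))
```

In this toy setting the five highest-scoring labels are the encoded ones, because unrelated random vectors have similarity close to zero while members of the bundle score clearly higher. In an XMC model, a network would predict a vector like `encoded` from the input instead of building it from known labels.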


Paper Structure

This paper contains 18 sections, 6 equations, 27 figures, and 3 tables.

Figures (27)

  • Figure 1: The unit circle in the complex plane with coordinates. The angle $\phi$ represents an element of the circular vector $\bar{\phi}$.
  • Figure : (a) HRR(w/Proj)
  • Figure : (a) Variance
  • Figure : (a) CHRR
  • Figure : (a) Wiki10-31K P@5
  • ...and 22 more figures