Axis Tour: Word Tour Determines the Order of Axes in ICA-transformed Embeddings
Hiroaki Yamagiwa, Yusuke Takase, Hidetoshi Shimodaira
TL;DR
Axis Tour addresses the arbitrariness of axis ordering in ICA-transformed word embeddings by optimizing the axis order to maximize semantic continuity. It constructs an axis embedding from the top-$k$ words on each axis and solves a traveling salesman problem to order the axes, then reduces dimensionality by projecting consecutive axes with weights based on axis skewness. Empirical results across static and dynamic embeddings show improved axis continuity and competitive downstream performance on analogy, similarity, and categorization tasks, with GPT-model evaluations confirming more coherent relationships between adjacent axes. This work enhances the interpretability and usability of ICA-based word embeddings and offers a principled way of preserving axis similarities in low-dimensional representations.
Abstract
Word embedding is one of the most important components in natural language processing, but interpreting high-dimensional embeddings remains a challenging problem. Independent Component Analysis (ICA) has been identified as an effective solution to this problem. ICA-transformed word embeddings reveal interpretable semantic axes; however, the order of these axes is arbitrary. In this study, we focus on this property and propose a novel method, Axis Tour, which optimizes the order of the axes. Inspired by Word Tour, a one-dimensional word embedding method, we aim to improve the clarity of the word embedding space by maximizing the semantic continuity of the axes. Furthermore, we show through experiments on downstream tasks that Axis Tour yields low-dimensional embeddings that are better than or comparable to those of both PCA and ICA.
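The axis-ordering idea can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `axis_embeddings` and `greedy_tour`, the default `k=100`, the choice to unit-normalize word vectors before ranking, and the use of cosine similarity between axis embeddings are all assumptions made for illustration. A greedy nearest-neighbor heuristic also stands in for a proper traveling salesman solver, so the resulting tour is only an approximation of the optimal order.

```python
import numpy as np


def axis_embeddings(S, k=100):
    """Build one embedding per axis from its top-k words (illustrative).

    S is an (n_words, d) matrix of ICA-transformed embeddings.
    Assumption: word vectors are unit-normalized before ranking.
    """
    S_unit = S / np.linalg.norm(S, axis=1, keepdims=True)
    # top[i, j] is the index of the word with the (i+1)-th largest value on axis j
    top = np.argsort(-S_unit, axis=0)[:k]
    # S_unit[top] has shape (k, d, d); averaging over k gives row j = axis j
    return S_unit[top].mean(axis=0)


def greedy_tour(A):
    """Order axes so that consecutive axes are similar (illustrative).

    Greedy nearest-neighbor heuristic, used here in place of a real
    TSP solver. Similarity is cosine similarity between rows of A.
    """
    A_unit = A / np.linalg.norm(A, axis=1, keepdims=True)
    sim = A_unit @ A_unit.T
    tour = [0]
    unvisited = set(range(1, sim.shape[0]))
    while unvisited:
        last = tour[-1]
        # sorted() makes tie-breaking deterministic
        nxt = max(sorted(unvisited), key=lambda j: sim[last, j])
        tour.append(nxt)
        unvisited.remove(nxt)
    return tour


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    S = rng.normal(size=(1000, 20))   # stand-in for real ICA output
    order = greedy_tour(axis_embeddings(S, k=50))
    S_reordered = S[:, order]         # columns now follow the tour
    print(order)
```

Reordering the columns with `S[:, order]` leaves each word's representation unchanged up to a permutation of its coordinates; only the arrangement of the axes differs.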
