Labeled Interactive Topic Models
Kyle Seelman, Mozhi Zhang, Jordan Boyd-Graber
TL;DR
Neural topic models lack an intuitive way for users to guide them. This work introduces Interactive Neural Topic Modeling (i-NTM), which lets a user label a topic and updates that topic's embedding to match. Two mechanisms are covered: topic embeddings that are learned and adjusted during training, and post-training adjustments that reweight topic-word distributions with a label-aware formulation. The authors provide a user interface, automatic metrics, and a human study showing that labeling topics improves retrieval of relevant documents, which supports the method's practical value for time-sensitive information needs. The approach makes neural topics more interpretable and task-relevant, and it opens avenues for combining user feedback with neural representations, potentially extending to LLM-based topic modeling. The core embedding-shift idea behind label-driven topic refinement is illustrated by $\alpha_k^{new} = \lambda( w_k - \alpha_k^{old}) + (1-\lambda)\alpha_k^{old}$.
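To see what the update does term by term, the expression can be expanded exactly as printed in this summary. The reading of the symbols is an assumption, since the summary does not define them: $w_k$ is taken to be the embedding of the label word assigned to topic $k$, $\alpha_k^{old}$ and $\alpha_k^{new}$ the topic embedding before and after the update, and $\lambda \in [0, 1]$ a step size.

$$\alpha_k^{new} = \lambda\,(w_k - \alpha_k^{old}) + (1-\lambda)\,\alpha_k^{old} = \lambda\, w_k + (1 - 2\lambda)\,\alpha_k^{old}$$

Under this reading, $\lambda = 0$ leaves the topic embedding unchanged, and increasing $\lambda$ puts more weight on the label embedding and less on the previous topic embedding.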
Abstract
Topic models are valuable for understanding extensive document collections, but they don't always identify the most relevant topics. Classical probabilistic and anchor-based topic models offer interactive versions that allow users to guide the models towards more pertinent topics. However, such interactive features have been lacking in neural topic models. To correct this lacuna, we introduce a user-friendly interaction for neural topic models. This interaction permits users to assign a word label to a topic, leading to an update in the topic model where the words in the topic become closely aligned with the given label. Our approach encompasses two distinct kinds of neural topic models. The first includes models where topic embeddings are trainable and evolve during the training process. The second kind involves models where topic embeddings are integrated post-training, offering a different approach to topic refinement. To facilitate user interaction with these neural topic models, we have developed an interactive interface. This interface enables users to engage with and re-label topics as desired. We evaluate our method through a human study, where users can relabel topics to find relevant documents. Using our method, user labeling improves document rank scores, helping to find more relevant documents to a given query when compared to no user labeling.
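The labeling interaction described above can be sketched on a toy example. This is a minimal illustration, not the authors' implementation. It assumes an embedding-based topic model in which a word's weight in a topic grows with the dot product between the word embedding and the topic embedding, and it applies the update rule quoted in the TL;DR as printed. The names `shift_topic`, `top_words`, and `WORD_VECS`, the 2-D vectors, and the value `lam=0.4` are all hypothetical.

```python
from typing import Dict, List

Vector = List[float]


def shift_topic(old: Vector, label: Vector, lam: float) -> Vector:
    """Apply the TL;DR rule as printed: lam * (w - old) + (1 - lam) * old."""
    return [lam * (w - a) + (1.0 - lam) * a for w, a in zip(label, old)]


def top_words(topic: Vector, word_vecs: Dict[str, Vector], k: int = 3) -> List[str]:
    """Rank vocabulary words by dot product with the topic embedding (assumed scoring)."""
    def score(word: str) -> float:
        return sum(t * x for t, x in zip(topic, word_vecs[word]))
    return sorted(word_vecs, key=score, reverse=True)[:k]


# Hypothetical 2-D word embeddings, hand-picked for illustration only.
WORD_VECS: Dict[str, Vector] = {
    "storm": [0.8, 0.0],
    "flood": [0.9, 0.1],
    "rescue": [0.7, 0.4],
    "election": [0.0, 1.0],
    "vote": [0.1, 0.9],
}

# The topic starts close to the politics words; the user labels it "flood".
topic_before = [0.2, 0.8]
topic_after = shift_topic(topic_before, WORD_VECS["flood"], lam=0.4)

words_before = top_words(topic_before, WORD_VECS)
words_after = top_words(topic_after, WORD_VECS)
print(words_before)  # ['election', 'vote', 'rescue']
print(words_after)   # ['flood', 'rescue', 'storm']
```

In this toy setting the label word and its neighbors move to the top of the topic after the update, which is the qualitative behavior the abstract describes. A softmax over the same dot products would produce the same ranking, so the normalization step is omitted here.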
