GPTopic: Dynamic and Interactive Topic Representations
Arik Reuter, Bishnu Khadka, Anton Thielmann, Christoph Weisser, Sebastian Fischer, Benjamin Säfken
TL;DR
This work addresses the limited interpretability and static nature of conventional top-word topic representations by introducing GPTopic, an LLM-assisted framework for dynamic, interactive topic representations. GPTopic combines embedding-based topic extraction (UMAP for dimensionality reduction and HDBSCAN for clustering, with optional fixed-topic merges) with LLM-generated topic names and descriptions informed by large top-word sets. It provides a chat-based interface and uses Retrieval-Augmented Generation to support question answering, topic comparisons, and fine-grained topic refinements (splitting, merging, deleting) driven by user prompts. The approach aims to democratize topic analysis by making it more accessible and adaptable across domains, and a public implementation is available on GitHub.
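To make the step from clusters to LLM-generated names concrete, here is a minimal sketch in plain Python. It assumes that documents have already been embedded and clustered, so each document carries an integer cluster label, with -1 marking noise as HDBSCAN does. The function names `top_words_per_topic` and `build_naming_prompt`, the word-scoring rule, and the prompt wording are all hypothetical and chosen for illustration; they are not taken from the GPTopic implementation, and no LLM is actually called.

```python
import re
from collections import Counter, defaultdict


def top_words_per_topic(docs, labels, k=5):
    """Return the k most characteristic words for each cluster label.

    Illustrative scoring only (an assumption, not GPTopic's method):
    score = (count in topic)^2 / (count in corpus), which favors words
    that are both frequent in a topic and concentrated in it.
    """
    topic_counts = defaultdict(Counter)
    corpus_counts = Counter()
    for doc, label in zip(docs, labels):
        tokens = re.findall(r"[a-z]+", doc.lower())
        corpus_counts.update(tokens)
        if label != -1:  # -1 is the noise label; such documents get no topic
            topic_counts[label].update(tokens)

    result = {}
    for label, counts in topic_counts.items():
        scored = {w: c * c / corpus_counts[w] for w, c in counts.items()}
        # Sort by descending score, breaking ties alphabetically.
        ranked = sorted(scored, key=lambda w: (-scored[w], w))
        result[label] = ranked[:k]
    return result


def build_naming_prompt(words):
    """Build a hypothetical prompt asking an LLM to name and describe a topic."""
    return (
        "The following words characterize one topic in a text corpus: "
        + ", ".join(words)
        + ". Suggest a short topic name and a one-sentence description."
    )


if __name__ == "__main__":
    docs = [
        "cats purr and cats sleep",
        "dogs bark and dogs run",
        "cats chase mice",
        "random noise text",
    ]
    labels = [0, 1, 0, -1]
    for topic, words in top_words_per_topic(docs, labels, k=3).items():
        print(topic, build_naming_prompt(words))
```

In a real pipeline the resulting prompt would be sent to an LLM, and a much larger top-word set than in this toy example would be passed per topic.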
Abstract
Topic modeling seems to be almost synonymous with generating lists of top words to represent topics within large text corpora. However, deducing a topic from such a list of individual terms can require substantial expertise and experience, making topic modeling less accessible to people unfamiliar with the particularities and pitfalls of top-word interpretation. A topic representation limited to top words might further fall short of offering a comprehensive and easily accessible characterization of the various aspects, facets, and nuances a topic might have. To address these challenges, we introduce GPTopic, a software package that leverages Large Language Models (LLMs) to create dynamic, interactive topic representations. GPTopic provides an intuitive chat interface for users to explore, analyze, and refine topics interactively, making topic modeling more accessible and comprehensive. The corresponding code is available here: https://github.com/ArikReuter/TopicGPT.
