A Multiscale Geometric Method for Capturing Relational Topic Alignment
Conrad D. Hougen, Karl T. Pazdernik, Alfred O. Hero
TL;DR
The paper addresses the challenge of tracking niche, time-evolving topics within co-authorship networks using interpretable models. It introduces MSTML, a multiscale geometric framework that fuses time-sliced LDA ensembles with a topic-space dendrogram guided by Hellinger distances and Ward's linkage, visualized via PHATE embeddings. The approach yields smooth temporal topic alignment and interpretable visualizations, identifying rare-topic structure that transformer-based methods often overlook, albeit with some trade-offs in topic coherence. Overall, MSTML offers a principled, scalable alternative for monitoring scientific novelty and topic drift across time and collaboration networks.
Abstract
Interpretable topic modeling is essential for tracking how research interests evolve within co-author communities. In scientific corpora, where novelty is prized, identifying underrepresented niche topics is particularly important. However, contemporary models built from dense transformer embeddings tend to miss rare topics and, as a result, fail to capture smooth temporal alignment. We propose a geometric method that integrates multimodal text and co-author network data, using Hellinger distances and Ward's linkage to construct a hierarchical topic dendrogram. This approach captures both local and global structure, supporting multiscale learning across semantic and temporal dimensions. Our method identifies rare-topic structure and visualizes smooth topic drift over time. Experiments highlight the strength of interpretable bag-of-words models when paired with principled geometric alignment.
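
To make the dendrogram construction concrete, the following is a minimal sketch, not the authors' implementation. It assumes each topic is a row of a topic-word probability matrix (for example, from an LDA model); the function names `hellinger_embedding` and `topic_dendrogram` and the toy data are hypothetical. It relies on the fact that the Hellinger distance between two distributions p and q equals the Euclidean distance between sqrt(p)/sqrt(2) and sqrt(q)/sqrt(2), so SciPy's Ward linkage, which assumes Euclidean geometry, can be applied directly to the square-root embedding.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from scipy.spatial.distance import pdist, squareform


def hellinger_embedding(topics):
    """Map each probability row p to sqrt(p) / sqrt(2).

    Euclidean distance between embedded rows equals the Hellinger
    distance between the original distributions.
    """
    topics = np.asarray(topics, dtype=float)
    # Renormalize defensively so every row sums to one.
    topics = topics / topics.sum(axis=1, keepdims=True)
    return np.sqrt(topics) / np.sqrt(2.0)


def topic_dendrogram(topics):
    """Ward linkage over topics under the Hellinger geometry (assumed setup)."""
    return linkage(hellinger_embedding(topics), method="ward")


# Toy topic-word distributions (hypothetical): two pairs of similar topics.
topics = np.array([
    [0.70, 0.20, 0.05, 0.05],
    [0.65, 0.25, 0.05, 0.05],
    [0.05, 0.05, 0.20, 0.70],
    [0.05, 0.05, 0.25, 0.65],
])

Z = topic_dendrogram(topics)                      # (n_topics - 1) x 4 linkage matrix
labels = fcluster(Z, t=2, criterion="maxclust")   # cut the dendrogram into two clusters
hellinger = squareform(pdist(hellinger_embedding(topics)))  # pairwise Hellinger distances
```

Cutting the resulting linkage matrix at different heights gives coarser or finer topic groupings, which is one way to read the "multiscale" aspect described above. In a time-sliced setting, the rows of `topics` would presumably be pooled from models fit on different time windows; that pooling step is an assumption here and is not shown.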
