Dynamic Topic Analysis in Academic Journals using Convex Non-negative Matrix Factorization Method

Yang Yang, Tong Zhang, Jian Wu, Lijie Su

TL;DR

This work tackles dynamic topic analysis in scholarly journals, aiming to identify and track topic evolution robustly and interpretably. It introduces a two-stage CDNMF framework. Stage 1 uses dynamic NMF to extract window topics via matrices $(X^t,Y^t)$ and forms a baseline dynamic representation $(F^t,H)$. Stage 2 applies convex NMF, minimizing $L_3(G,\tilde{H})=\frac{1}{2}\|V-VG\tilde{H}\|_F^2$ subject to $G,\tilde{H}\ge0$, to refine and stabilize the topic dynamics, producing $\tilde{W}=VG$. On IEEE T-ASE abstracts from 2004–2022, the method uncovers emergent topics such as COVID-19 and digital twins and improves topic ranking stability by $24.51\%$, $56.60\%$, and $36.93\%$ at sparsity levels $l=0.4$, $0.6$, and $0.9$, respectively. The results indicate improved topic coherence and robustness, and the authors plan to release the source code after publication to support reproducibility and practical deployment for dynamic topic monitoring in scholarly corpora.
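To make the Stage 2 objective concrete, the following is a minimal NumPy sketch of convex NMF for $\frac{1}{2}\|V-VG\tilde{H}\|_F^2$ with $G,\tilde{H}\ge0$. It is not the authors' implementation. It assumes a non-negative $V$ and uses standard Lee-Seung-style multiplicative updates, which may differ from the optimizer used in the paper. The function name `convex_nmf`, its parameters, and the random toy matrix are all hypothetical.

```python
import numpy as np


def convex_nmf(V, k, n_iter=200, eps=1e-10, seed=0):
    """Minimise 0.5 * ||V - V G H||_F^2 subject to G >= 0, H >= 0.

    V : (m, n) non-negative matrix (assumed here, e.g. terms x documents).
    G : (n, k) non-negative weights, so the basis W = V @ G is built
        from columns of V.
    H : (k, n) non-negative encoding.
    """
    rng = np.random.default_rng(seed)
    n = V.shape[1]
    G = rng.random((n, k))
    H = rng.random((k, n))
    K = V.T @ V  # (n, n) Gram matrix; non-negative because V is
    losses = []
    for _ in range(n_iter):
        # Multiplicative update for H with the basis V @ G held fixed.
        H *= (G.T @ K) / (G.T @ K @ G @ H + eps)
        # Multiplicative update for G with H held fixed.
        G *= (K @ H.T) / (K @ G @ H @ H.T + eps)
        losses.append(0.5 * np.linalg.norm(V - V @ G @ H, "fro") ** 2)
    return G, H, losses


# Toy usage on random non-negative data (illustrative only, not the paper's corpus).
rng = np.random.default_rng(42)
V = rng.random((30, 12))
G, H, losses = convex_nmf(V, k=3)
W_tilde = V @ G  # refined basis, corresponding to W-tilde = V G in the TL;DR
```

Because the basis is constrained to $VG$, each refined topic is a non-negative combination of columns of $V$, which is what ties the Stage 2 output back to the input data. Working with the Gram matrix $V^\top V$ keeps both updates independent of the number of rows of $V$.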

Abstract

With the rapid advancement of large language models, academic topic identification and topic evolution analysis are crucial for enhancing AI's understanding capabilities. Dynamic topic analysis provides a powerful approach to capturing and understanding the temporal evolution of topics in large-scale datasets. This paper presents a two-stage dynamic topic analysis framework that incorporates convex optimization to improve topic consistency, sparsity, and interpretability. In Stage 1, a two-layer non-negative matrix factorization (NMF) model is employed to extract annual topics and identify key terms. In Stage 2, a convex optimization algorithm refines the dynamic topic structure using the convex NMF (cNMF) model, further enhancing topic integration and stability. Applying the proposed method to IEEE journal abstracts from 2004 to 2022 effectively identifies and quantifies emerging research topics, such as COVID-19 and digital twins. By optimizing sparsity differences in the clustering feature space between traditional and emerging research topics, the framework provides deeper insights into topic evolution and ranking analysis. Moreover, the NMF-cNMF model demonstrates superior stability in topic consistency. At sparsity levels of 0.4, 0.6, and 0.9, the proposed approach improves topic ranking stability by 24.51%, 56.60%, and 36.93%, respectively. The source code will be released after publication at https://github.com/meetyangyang/CDNMF.
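The two-layer NMF of Stage 1 can be illustrated with a small sketch. It assumes the common dynamic-NMF layout: each yearly window is factorized on its own, and the resulting window topics are then stacked and factorized again into shared dynamic topics. The abstract does not specify the matrix orientation, the number of topics, or the solver, so the helper `nmf`, the topic counts, and the random data below are hypothetical choices, not the paper's settings.

```python
import numpy as np


def nmf(A, k, n_iter=200, eps=1e-10, seed=0):
    """Basic Lee-Seung multiplicative-update NMF: A ~= X @ Y with X, Y >= 0."""
    rng = np.random.default_rng(seed)
    X = rng.random((A.shape[0], k))
    Y = rng.random((k, A.shape[1]))
    for _ in range(n_iter):
        Y *= (X.T @ A) / (X.T @ X @ Y + eps)
        X *= (A @ Y.T) / (X @ Y @ Y.T + eps)
    return X, Y


# Hypothetical toy corpus: three yearly windows of document-term weights
# over a shared 20-term vocabulary (random data, not the paper's abstracts).
rng = np.random.default_rng(1)
windows = {year: rng.random((15, 20)) for year in (2004, 2005, 2006)}

# Layer 1: factorise each window separately into k_window topics.
k_window, k_dynamic = 4, 3
window_topics = {year: nmf(A, k_window)[1] for year, A in windows.items()}

# Layer 2: stack all window topic-term rows and factorise them again, so each
# dynamic topic (row of H) groups related window topics across years.
B = np.vstack([window_topics[year] for year in windows])
F, H = nmf(B, k_dynamic)

# Slice F back into one block per window; each block weights that window's
# topics against the shared dynamic topics.
F_by_year = {
    year: F[i * k_window:(i + 1) * k_window]
    for i, year in enumerate(windows)
}
```

Under these assumptions, the largest entries in each row of `H` would give a dynamic topic's key terms, and the rows of `F_by_year[year]` show how strongly that year's window topics load on each dynamic topic.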

Paper Structure

This paper contains 20 sections, 10 equations, 7 figures, 6 tables, 1 algorithm.

Figures (7)

  • Figure 1: Number of papers published in IEEE journals
  • Figure 2: Some representative topic modeling methods in dynamic topic analysis
  • Figure 3: An overview of dynamic topic models.
  • Figure 4: Research Framework
  • Figure 5: Dynamic convex NMF
  • ...and 2 more figures