ResearchArcade: Graph Interface for Academic Tasks

Jingjun Xu; Chongshan Lin; Haofei Yu; Tao Feng; Jiaxuan You

TL;DR

ResearchArcade introduces a graph-based interface that unifies multi-source (ArXiv/OpenReview), multi-modal (text, figures, tables), and temporally evolving academic data into a coherent heterogeneous graph. It defines six predictive and generative academic tasks within a two-step target-and-neighborhood framework and demonstrates compatibility with Embedding, GNN, LLM, and Graph World Model backbones. Across tasks, the use of graph structures and cross-source, multi-modal information yields consistent performance gains, and the framework models intra-paper revisions and macro research trends. Ablation studies highlight the critical role of multi-modal data and reveal nuanced effects of review information, while the system’s exportability supports integration with standard modeling pipelines.

Abstract

Academic research generates diverse data sources, and as researchers increasingly use machine learning to assist research tasks, a crucial question arises: Can we build a unified data interface to support the development of machine learning models for various academic tasks? Models trained on such a unified interface can better support human researchers throughout the research process, eventually accelerating knowledge discovery. In this work, we introduce ResearchArcade, a graph-based interface that connects multiple academic data sources, unifies task definitions, and supports a wide range of base models to address key academic challenges. ResearchArcade utilizes a coherent multi-table format with graph structures to organize data from different sources, including academic corpora from ArXiv and peer reviews from OpenReview, while capturing information with multiple modalities, such as text, figures, and tables. ResearchArcade also preserves temporal evolution at both the manuscript and community levels, supporting the study of paper revisions as well as broader research trends over time. Additionally, ResearchArcade unifies diverse academic task definitions and supports various models with distinct input requirements. Our experiments across six academic tasks demonstrate that combining cross-source and multi-modal information enables a broader range of tasks, while incorporating graph structures consistently improves performance over baseline methods. This highlights the effectiveness of ResearchArcade and its potential to advance research progress.
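As a rough illustration of the ideas above, the following minimal sketch shows how a multi-table layout with graph structure and a two-step target-and-neighborhood lookup might fit together. It is not ResearchArcade's actual schema or API: the table names, fields, relation names, and the helpers `build_adjacency` and `get_neighborhood` are all hypothetical, chosen only to make the concept concrete.

```python
from collections import defaultdict

# Hypothetical node tables, one per entity type (names and fields are illustrative only).
node_tables = {
    "paper": {"p1": {"title": "Paper A", "source": "arxiv"}},
    "figure": {"f1": {"caption": "Overview"}},
    "table": {"t1": {"caption": "Main results"}},
    "review": {"r1": {"text": "Clear writing.", "source": "openreview"}},
}

# Hypothetical edge table: (src_type, src_id, relation, dst_type, dst_id).
edge_table = [
    ("paper", "p1", "has_figure", "figure", "f1"),
    ("paper", "p1", "has_table", "table", "t1"),
    ("paper", "p1", "has_review", "review", "r1"),
]


def build_adjacency(edges):
    """Index the edge table by source node so neighbors can be looked up directly."""
    adj = defaultdict(list)
    for src_type, src_id, rel, dst_type, dst_id in edges:
        adj[(src_type, src_id)].append((rel, dst_type, dst_id))
    return adj


def get_neighborhood(target, adj, nodes, allowed_relations=None):
    """Collect the one-hop neighbors of a target node, optionally filtered by relation."""
    out = []
    for rel, dst_type, dst_id in adj.get(target, []):
        if allowed_relations is None or rel in allowed_relations:
            record = {"relation": rel, "type": dst_type, "id": dst_id}
            record.update(nodes[dst_type][dst_id])
            out.append(record)
    return out


adjacency = build_adjacency(edge_table)

# Step 1: choose the target entity for the task.
target = ("paper", "p1")

# Step 2: gather its neighborhood as context for a downstream model.
neighbors = get_neighborhood(target, adjacency, node_tables)

# An ablation-style variant that leaves out review information.
no_reviews = get_neighborhood(
    target, adjacency, node_tables, allowed_relations={"has_figure", "has_table"}
)
```

Filtering the neighborhood by relation type is one simple way such an interface could support ablations over modalities or sources, for example comparing model inputs with and without review nodes.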
