What Papers Don't Tell You: Recovering Tacit Knowledge for Automated Paper Reproduction

Lehui Li, Ruining Wang, Haochen Song, Yaoxin Mao, Tong Zhang, Yuyao Wang, Jiayi Fan, Yitong Zhang, Jieping Ye, Chengqi Zhang, Yongshun Gong

TL;DR

This work formalizes automated paper reproduction as the progressive recovery of three types of tacit knowledge -- relational, somatic, and collective -- and proposes PaperRepro, a graph-based agent framework with a dedicated mechanism for each.

Abstract

Automated paper reproduction -- generating executable code from academic papers -- is bottlenecked not by information retrieval but by the tacit knowledge that papers inevitably leave implicit. We formalize this challenge as the progressive recovery of three types of tacit knowledge -- relational, somatic, and collective -- and propose PaperRepro, a graph-based agent framework with a dedicated mechanism for each: node-level relation-aware aggregation recovers relational knowledge by analyzing implementation-unit-level reuse and adaptation relationships between the target paper and its citation neighbors; execution-feedback refinement recovers somatic knowledge through iterative debugging driven by runtime signals; and graph-level knowledge induction distills collective knowledge from clusters of papers sharing similar implementations. On an extended ReproduceBench spanning 3 domains, 10 tasks, and 40 recent papers, PaperRepro achieves an average performance gap of 10.04% against official implementations, improving over the strongest baseline by 24.68%. The code will be publicly released upon acceptance; the repository link will be provided in the final version.

Paper Structure (93 sections, 2 theorems, 18 equations, 10 figures, 3 tables)

Key Result

Proposition B.1

Fix a target paper $v_t$. For any candidate neighbor $u$, let $R(u)\in\mathbb R$ be its (random) ensemble rank (smaller is better), with $\mu(u):=\mathbb E[R(u)]$ and $\sigma^2(u):=\mathrm{Var}(R(u))<\infty$. For $\lambda>0$, define $s_\lambda(u):=\mu(u)+\lambda\sigma(u)$. (i) Single-neighbor control. (ii) Group-level control (no independence needed). For any pruned set $S=\{u_1,\dots,u_K\}$,
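The score $s_\lambda(u)=\mu(u)+\lambda\sigma(u)$ defined in Proposition B.1 can be illustrated numerically. The following is a minimal sketch, not the authors' implementation: it assumes that $\mu(u)$ and $\sigma(u)$ are estimated from a handful of observed ranks per neighbor (for example, one rank per ensemble member), and that pruning keeps the $K$ neighbors with the smallest scores. The function names, the sample data, and the top-$K$ rule are all hypothetical.

```python
from statistics import mean, pstdev


def risk_adjusted_score(ranks, lam):
    """Estimate s_lambda(u) = mu(u) + lam * sigma(u) from observed ranks.

    `ranks` is a list of rank observations for one neighbor (smaller is
    better). Using the population standard deviation is an assumption.
    """
    if lam <= 0:
        raise ValueError("lam must be positive")
    return mean(ranks) + lam * pstdev(ranks)


def prune_neighbors(rank_samples, lam, k):
    """Keep the k neighbors with the smallest score (assumed pruning rule).

    Ties are broken by neighbor name so the result is deterministic.
    """
    scores = {u: risk_adjusted_score(r, lam) for u, r in rank_samples.items()}
    return sorted(scores, key=lambda u: (scores[u], u))[:k]


# Hypothetical rank observations for three candidate neighbors.
rank_samples = {
    "paper_a": [1, 2, 1, 2],  # good and stable
    "paper_b": [1, 5, 1, 5],  # sometimes best, but volatile
    "paper_c": [3, 3, 3, 3],  # mediocre but perfectly stable
}
selected = prune_neighbors(rank_samples, lam=1.0, k=2)
```

In this toy data, `paper_b` and `paper_c` have the same mean rank, so the $\lambda\sigma(u)$ term is what separates them: the volatile neighbor is penalized and the stable one is kept.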

Figures (10)

  • Figure 1: Illustrative examples of the three types of tacit knowledge in automated paper reproduction. Left: Relational---implementation units are reused, adapted, or newly created relative to neighbor papers. Middle: Somatic---execution feedback from a sandbox environment guides iterative refinement. Right: Collective---common practices are induced from clusters of related implementations.
  • Figure 2: Overview of PaperRepro. The framework comprises three stages: SSGP prunes the citation graph to retain implementation-relevant neighbors; node-level relation-aware aggregation analyzes reuse/adapt/new relationships and assembles an initial implementation; execution-feedback refinement and graph-level knowledge induction iteratively improve it via runtime debugging and community-induced knowledge.
  • Figure 3: Node-Level and Graph-Level Graph Reasoning. Left (Node-Level Relation-Aware Aggregation): For each neighbor $v_i \in \mathcal{N}_{\text{prune}}(v_t)$, relation analysis produces a structured annotation $\Gamma_{t,i}$ that categorizes implementation units into directly reusable ($\mathcal{U}^{\mathrm{reuse}}$), adaptable ($\mathcal{U}^{\mathrm{adapt}}$), and new ($\mathcal{U}^{\mathrm{new}}$); neighborhood aggregation selects among competing candidates by priority $p = w(v_t, v_i) + \beta \cdot \mathbb{1}[\text{reuse}]$ to construct $\mathcal{C}_{v_t}^{(\mathrm{init})}$. Right (Graph-Level Knowledge Induction): The paper graph is partitioned into subgraphs $\{\mathcal{G}^{(t)}_j\}$ via Louvain clustering on SSGP edge weights; within each subgraph, the agent reproduces member papers, performs cross-paper induction over execution feedback, and writes recurring patterns into the subgraph knowledge base $\mathcal{K}^{(t)}_j$.
  • Figure 4: Cross-model transferability of $\mathcal{K}_{\mathrm{col}}$. The induced knowledge base yields consistent gains when transferred across different backbone LLMs.
  • Figure 5: Subgraph partitioning comparison. Our edge-weight-based partitioning vs. random partitioning across training epochs.
  • ...and 5 more figures
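The aggregation priority in the Figure 3 caption, $p = w(v_t, v_i) + \beta \cdot \mathbb{1}[\text{reuse}]$, can be sketched as a small selection routine. This is an illustrative sketch under stated assumptions rather than the paper's code: the `Candidate` record, the unit names, the edge weights, and the first-seen tie-breaking rule are all hypothetical. Units in $\mathcal{U}^{\mathrm{new}}$ have no neighbor candidate and are therefore outside the scope of this selection step.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    unit: str       # implementation unit, e.g. "data_loader" (hypothetical)
    neighbor: str   # neighbor paper v_i offering this unit
    weight: float   # edge weight w(v_t, v_i)
    relation: str   # "reuse" or "adapt"


def priority(c, beta):
    """p = w(v_t, v_i) + beta * 1[reuse], as given in the Figure 3 caption."""
    return c.weight + beta * (1.0 if c.relation == "reuse" else 0.0)


def aggregate(candidates, beta):
    """For each unit, keep the candidate with the highest priority.

    On a tie the first candidate seen is kept; this tie-breaking rule is
    an assumption, not something the caption specifies.
    """
    best = {}
    for c in candidates:
        current = best.get(c.unit)
        if current is None or priority(c, beta) > priority(current, beta):
            best[c.unit] = c
    return best


# Hypothetical competing candidates from two neighbors.
candidates = [
    Candidate("data_loader", "neighbor_1", 0.75, "adapt"),
    Candidate("data_loader", "neighbor_2", 0.5, "reuse"),
    Candidate("loss_fn", "neighbor_1", 0.75, "adapt"),
]
chosen = aggregate(candidates, beta=0.5)
```

With `beta=0.5`, the directly reusable `data_loader` from `neighbor_2` outranks the adaptable one from the more strongly connected `neighbor_1`; with `beta=0.0` the choice falls back to edge weight alone.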

Theorems & Definitions (2)

  • Proposition B.1
  • Theorem B.3