UniFAR: A Unified Facet-Aware Retrieval Framework for Scientific Documents

Zheng Dou, Zhao Zhang, Deqing Wang, Yikun Ban, Fuzhen Zhuang

TL;DR

UniFAR reconciles granularity differences through adaptive multi-granularity aggregation, aligns document structure with question intent via learnable facet anchors, and unifies doc-doc and q-doc supervision through joint training. Experiments across multiple retrieval tasks and base models confirm its effectiveness and generality.

Abstract

Existing scientific document retrieval (SDR) methods primarily rely on document-centric representations learned from inter-document relationships for document-document (doc-doc) retrieval. However, the rise of LLMs and RAG has shifted SDR toward question-driven retrieval, where documents are retrieved in response to natural-language questions (q-doc). This change has led to systematic mismatches between document-centric models and question-driven retrieval, including (1) input granularity (long documents vs. short questions), (2) semantic focus (scientific discourse structure vs. specific question intent), and (3) training signals (citation-based similarity vs. question-oriented relevance). To this end, we propose UniFAR, a Unified Facet-Aware Retrieval framework to jointly support doc-doc and q-doc SDR within a single architecture. UniFAR reconciles granularity differences through adaptive multi-granularity aggregation, aligns document structure with question intent via learnable facet anchors, and unifies doc-doc and q-doc supervision through joint training. Experimental results show that UniFAR consistently outperforms prior methods across multiple retrieval tasks and base models, confirming its effectiveness and generality.

Paper Structure (20 sections, 9 equations, 4 figures, 9 tables)

Figures (4)

  • Figure 1: The architecture of the UniFAR framework. Scientific documents and questions are formatted as a unified sequence of segmented sentences and encoded by the Replaceable Base Model (gray) into token-level and sentence-level embeddings. The Facet Aggregation Module (blue) builds on learnable facet anchors to produce input-specific facet queries, which guide aggregation through multi-granularity attention to generate facet-level embeddings. The facet embeddings are then used to compute a similarity matrix between query and candidate for each facet, supporting facet-aware retrieval.
  • Figure 2: The facet-aware joint training strategy of UniFAR. Facet-Aware Training Unit (FTU, left) is constructed via LLM-based facet labeling and question generation, containing a query document, its facet-specific positives and negatives, and facet-aware scientific questions. FTUs are then forwarded through UniFAR (middle) as described in Section \ref{sec:framework} to produce attention maps and facet-level document/question embeddings for Joint Optimization (right). Three objectives: (1) $\mathcal{L}_{\textit{DD}}$ for doc–doc alignment, (2) $\mathcal{L}_{\textit{QD}}$ for q–doc retrieval, and (3) $\mathcal{L}_{\textit{KL}}$ for facet-sentence attention consistency, are jointly optimized through backpropagation to update all trainable components of UniFAR.
  • Figure 3: Facet-level similarity matrices between question facets (rows: Q-bg, Q-mt, Q-rs) and document facets (columns: Doc-bg, Doc-mt, Doc-rs).
  • Figure 4: Facet-level attention visualizations for the scientific question and three retrieved documents in Table \ref{tab:case-sample}. Each heatmap shows the attention weights from the three facet queries $\mathbf{Q}_{facet}$ to question tokens or document sentences during multi-granularity aggregation. Higher weights (darker colors) indicate greater contribution of a sentence or token to the final facet embeddings.
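
The captions for Figures 1, 3, and 4 describe facet queries attending over sentence or token embeddings, with the resulting facet embeddings compared in a facet-by-facet similarity matrix. The following is a minimal sketch of that idea, not the authors' implementation. It assumes that the fixed anchors are used directly as facet queries (the paper derives input-specific queries from them), that a single granularity is pooled at a time, that cosine similarity is the comparison function, and that the facet labels bg, mt, and rs stand for background, method, and result. All function names and the toy vectors are hypothetical.

```python
import math

# Facet labels taken from Figure 3; their expansion (background, method,
# result) is an assumption.
FACETS = ["bg", "mt", "rs"]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def softmax(scores):
    # Subtract the maximum for numerical stability.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cosine(u, v):
    nu, nv = math.sqrt(dot(u, u)), math.sqrt(dot(v, v))
    return dot(u, v) / (nu * nv) if nu and nv else 0.0


def facet_embeddings(anchors, units):
    """Attention-pool unit embeddings (sentences or tokens), once per anchor.

    Assumption: each anchor acts directly as the facet query, with scaled
    dot-product attention over the units.
    """
    d = len(units[0])
    pooled = []
    for anchor in anchors:
        weights = softmax([dot(anchor, u) / math.sqrt(d) for u in units])
        pooled.append(
            [sum(w * u[i] for w, u in zip(weights, units)) for i in range(d)]
        )
    return pooled


def facet_similarity(question_facets, doc_facets):
    """Rows: question facets; columns: document facets (layout of Figure 3)."""
    return [[cosine(q, dv) for dv in doc_facets] for q in question_facets]


# Hypothetical toy inputs: three 3-d anchors, three document sentences,
# and two question tokens.
anchors = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
doc_sentences = [[2.0, 0.1, 0.0], [0.1, 2.0, 0.2], [0.0, 0.3, 2.0]]
question_tokens = [[0.2, 1.5, 0.1], [0.1, 1.8, 0.0]]

doc_facets = facet_embeddings(anchors, doc_sentences)
question_facets = facet_embeddings(anchors, question_tokens)
sim = facet_similarity(question_facets, doc_facets)

for name, row in zip(FACETS, sim):
    print(f"Q-{name}", [round(s, 3) for s in row])
```

Each row of `sim` corresponds to one question facet and each column to one document facet, so a retrieval score could be read off the matrix per facet rather than from a single pooled vector. How UniFAR combines these entries into a final ranking is not specified in the captions, so the sketch stops at the matrix.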