MODOC: A Modular Interface for Flexible Interlinking of Text Retrieval and Text Generation Functions
Yingqiang Gao, Jhony Prada, Nianlong Gu, Jessica Lam, Richard H. R. Hahnloser
TL;DR
MODOC integrates retrieval and generation in a modular UI to enable trustworthy scientific writing with verifiable content. It comprises five modules and supports retrieval tasks (discovery, text alignment, keyphrase extraction) and generation tasks (citation sentences and conclusion sentences), explicitly separating truth-seeking steps from creative ones. The paper outlines two structured workflows, Retrieve and Cite and Generate and Check, to promote ethical use and reduce confabulation. The platform aims to alleviate cognitive load while enabling real-time verification across millions of documents.
Abstract
Large Language Models (LLMs) produce eloquent text, but the content they generate often needs to be verified. Traditional information retrieval systems can assist with this task, but most have not been designed with LLM-generated queries in mind. There is therefore a compelling need for integrated systems that provide both retrieval and generation functionality within a single user interface. We present MODOC, a modular user interface that leverages the capabilities of LLMs and helps detect their confabulations, promoting integrity in scientific writing. MODOC represents a significant step forward in scientific writing assistance: its modular architecture supports flexible functions for retrieving information and for writing and generating text in a single, user-friendly interface.
