DSL-R1: From SQL to DSL for Training Retrieval Agents across Structured and Unstructured Data with Reinforcement Learning

Yunhai Hu; Junwei Zhou; Yumo Cao; Yitao Long; Yiwei Xu; Qiyi Jiang; Weiyao Wang; Xiaoyu Cao; Zhen Sun; Yiran Zou; Nan Du

Abstract

Effective retrieval in complex domains requires bridging the gap between structured metadata and unstructured content. Existing systems typically isolate these capabilities, relying on either symbolic filtering or vector similarity, and therefore fail to capture their interplay. In this work, we propose DSL-R1, a unified framework that combines logical reasoning with semantic matching via a novel Domain-Specific Language (DSL). By embedding vector primitives within SQL-style operators, our approach leverages the complementary strengths of symbolic precision and semantic coverage. We further introduce a reinforcement learning mechanism in which rule-based execution feedback and retrieval quality rewards jointly optimize DSL generation, balancing structural correctness and semantic alignment. Evaluations on a large-scale industrial email benchmark demonstrate that DSL-R1 achieves a +12.3% improvement in Hit@1/3, consistently outperforming decoupled baselines and establishing a robust paradigm for hybrid retrieval.
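To make the hybrid idea in the abstract concrete, the following is a minimal Python sketch, not the paper's DSL or reward. It assumes a toy `Email` record, a `hybrid_search` helper that applies a symbolic filter (analogous to a SQL `WHERE` clause) before ranking by cosine similarity, and a `toy_reward` that combines an execution check with a Hit@k term. All names, fields, and weights here are hypothetical and chosen only for illustration.

```python
import math
from dataclasses import dataclass
from typing import List


@dataclass
class Email:
    # Hypothetical record: structured metadata plus an embedding of the body.
    doc_id: str
    sender: str
    embedding: List[float]


def cosine(a: List[float], b: List[float]) -> float:
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def hybrid_search(emails: List[Email], sender: str,
                  query_vec: List[float], k: int = 3) -> List[str]:
    """Symbolic filter first, then semantic ranking of the survivors."""
    candidates = [e for e in emails if e.sender == sender]
    ranked = sorted(candidates,
                    key=lambda e: cosine(e.embedding, query_vec),
                    reverse=True)
    return [e.doc_id for e in ranked[:k]]


def toy_reward(executed_ok: bool, retrieved_ids: List[str],
               gold_id: str, k: int = 3) -> float:
    """Assumed reward shape: zero if the query failed to execute, otherwise
    a small validity bonus plus a Hit@k term. Weights are illustrative."""
    if not executed_ok:
        return 0.0
    hit = 1.0 if gold_id in retrieved_ids[:k] else 0.0
    return 0.2 + 0.8 * hit


emails = [
    Email("a", "alice", [1.0, 0.0]),
    Email("b", "alice", [0.0, 1.0]),
    Email("c", "bob", [1.0, 0.0]),
]
results = hybrid_search(emails, "alice", [0.9, 0.1], k=2)
print(results, toy_reward(True, results, gold_id="a"))
```

In this sketch the filter removes the email from `bob` even though its embedding matches the query well, which is the kind of interplay between symbolic precision and semantic coverage that the abstract describes.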

Paper Structure (24 sections, 8 equations, 1 figure, 4 tables)

Introduction
Related Work
Retrieval
DSLs in AI Systems
Method
Overview
DSL Agent Design
Data Preparation
Reinforcement Learning
Reward Function Design
Experiments
Setup
Main Results
Ablation Studies
Conclusion
...and 9 more sections

Figures (1)

Figure 1: Overview of the reinforcement learning framework for DSL-based retrieval
