Beyond Quacking: Deep Integration of Language Models and RAG into DuckDB
Anas Dorbani, Sunny Yasser, Jimmy Lin, Amine Mhedhbi
TL;DR
Knowledge-intensive analytics must combine structured data with unstructured context, but building end-to-end LLM-driven pipelines across separate data systems is labor-intensive. FlockMTL embeds LLM capabilities and retrieval-augmented generation (RAG) directly into DuckDB as an extension, providing MODEL and PROMPT resources and a library of scalar and aggregate functions for composing LLM-based predictions in SQL. Core contributions include resource-based abstractions with versioning, a rich function set with specialized wrappers for filtering and reranking, and cost-based optimizations such as meta-prompting and batching that reduce data movement and latency. The result is a unified platform for semantic analytics that simplifies deploying knowledge-intensive workloads across formats and federations, demonstrated with end-to-end NL-to-SQL workflows and full hybrid search inside a DBMS.
Abstract
Knowledge-intensive analytical applications retrieve context from both structured tabular data and unstructured, free-text documents for effective decision-making. Large language models (LLMs) have made it significantly easier to prototype such retrieval and reasoning data pipelines. However, implementing these pipelines efficiently still demands considerable effort: it often involves orchestrating heterogeneous data systems, managing data movement, and handling low-level implementation details, e.g., LLM context management. To address these challenges, we introduce FlockMTL: an extension for DBMSs that deeply integrates LLM capabilities and retrieval-augmented generation (RAG). FlockMTL includes model-driven scalar and aggregate functions, enabling chained predictions through tuple-level mappings and reductions. Drawing inspiration from the relational model, FlockMTL incorporates: (i) cost-based optimizations, which seamlessly apply techniques such as batching and caching; and (ii) resource independence, enabled through novel SQL DDL abstractions: PROMPT and MODEL, introduced as first-class schema objects alongside TABLE. FlockMTL streamlines the development of knowledge-intensive analytical applications, and its optimizations ease the implementation burden.
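To make the idea of batched tuple-level mappings concrete, here is a minimal Python sketch. It is not FlockMTL's implementation or API: the names batch_by_budget and map_rows, the use of character count as a stand-in for token cost, and the stub predictor are all assumptions made for illustration. It shows how rows could be packed into as few model calls as a context budget allows while still returning one prediction per row.

```python
from typing import Callable, List, Sequence


def batch_by_budget(
    rows: Sequence[str],
    budget: int,
    cost: Callable[[str], int] = len,  # assumption: characters approximate tokens
) -> List[List[str]]:
    """Greedily pack rows, in order, into batches whose total cost fits the budget."""
    batches: List[List[str]] = []
    current: List[str] = []
    used = 0
    for row in rows:
        c = cost(row)
        if c > budget:
            raise ValueError("a single row exceeds the context budget")
        # Start a new batch when adding this row would overflow the current one.
        if current and used + c > budget:
            batches.append(current)
            current, used = [], 0
        current.append(row)
        used += c
    if current:
        batches.append(current)
    return batches


def map_rows(
    rows: Sequence[str],
    predict_batch: Callable[[List[str]], List[str]],  # hypothetical model call
    budget: int,
) -> List[str]:
    """Scalar-style mapping: one output per input row, one model call per batch."""
    outputs: List[str] = []
    for batch in batch_by_budget(rows, budget):
        results = predict_batch(batch)
        if len(results) != len(batch):
            raise RuntimeError("model returned a different number of results than rows")
        outputs.extend(results)
    return outputs


if __name__ == "__main__":
    calls = []

    def fake_model(batch: List[str]) -> List[str]:
        # Stand-in for an LLM request; records each call so batching is visible.
        calls.append(list(batch))
        return [text.upper() for text in batch]

    print(map_rows(["aa", "bbb", "c", "dddd"], fake_model, budget=5))
    print(len(calls), "model calls")
```

In this sketch, four rows are served by two calls instead of four. A real system would measure cost in tokens and would also have to account for the prompt text itself when sizing batches.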
