Trustworthy and Efficient LLMs Meet Databases
Kyoungmin Kim, Anastasia Ailamaki
TL;DR
The tutorial addresses the dual challenges of making large language models trustworthy and efficient in database contexts, emphasizing hallucination reduction and lower inference latency. It presents a structured three-part agenda (Trustworthy LLMs, Efficient LLMs, and LLMs Meet Databases) with detailed subtopics ranging from background theory to retrieval, memory, and system integration. Key contributions include surveys of methods such as retrieval-augmented generation, memory and tool use, RLHF and DPO for alignment, KV caching and FlashAttention for efficiency, and adaptive scheduling for mixed relational-LLM workloads, as well as pathways toward integrated, hardware-aware DBMS–LLM systems. Its significance lies in guiding database researchers and practitioners to deploy scalable, reliable LLM-driven data management by bridging concepts from LLM research with database techniques, benchmarks, and system design for real-world workloads.
Abstract
In the rapidly evolving AI era with large language models (LLMs) at the core, making LLMs more trustworthy and efficient, especially in output generation (inference), has gained significant attention. The goals are to reduce plausible but faulty LLM outputs (a.k.a. hallucinations) and to meet sharply increased inference demands. This tutorial explores such efforts and makes them transparent to the database community. Understanding these efforts is essential for harnessing LLMs in database tasks and for adapting database techniques to LLMs. Furthermore, we delve into the synergy between LLMs and databases, highlighting new opportunities and challenges at their intersection. This tutorial aims to share essential concepts and strategies around LLMs with database researchers and practitioners, reduce their unfamiliarity with LLMs, and inspire them to work at the intersection of LLMs and databases.
