Trustworthy and Efficient LLMs Meet Databases

Kyoungmin Kim, Anastasia Ailamaki

TL;DR

The tutorial addresses the dual challenges of making large language models trustworthy and efficient in database contexts, emphasizing hallucination reduction and lower inference latency. It presents a structured three-part agenda (Trustworthy LLMs, Efficient LLMs, and LLMs Meet Databases), with subtopics ranging from background theory to retrieval, memory, and system integration. Key contributions include surveys of methods such as retrieval-augmented generation, memory and tool use, RLHF and DPO for alignment, KV caching and FlashAttention for efficiency, and adaptive scheduling for mixed relational-LLM workloads, as well as pathways toward integrated, hardware-aware DBMS–LLM systems. Its significance lies in guiding database researchers and practitioners toward deploying scalable, reliable LLM-driven data management by bridging concepts from LLM research with database techniques, benchmarks, and system design for real-world workloads.

Abstract

In the rapidly evolving AI era with large language models (LLMs) at the core, making LLMs more trustworthy and efficient, especially in output generation (inference), has gained significant attention. The goals are to reduce plausible but faulty LLM outputs (a.k.a. hallucinations) and to meet sharply increased inference demand. This tutorial explores such efforts and makes them transparent to the database community. Understanding these efforts is essential for harnessing LLMs in database tasks and for adapting database techniques to LLMs. Furthermore, we delve into the synergy between LLMs and databases, highlighting new opportunities and challenges at their intersection. This tutorial aims to share essential concepts and strategies around LLMs with database researchers and practitioners, make LLMs less unfamiliar to them, and inspire them to work at the intersection of LLMs and databases.

Paper Structure

This paper contains 24 sections and 1 figure.

Figures (1)

  • Figure 1: Tutorial outline (each subsection with keywords).