Directed Information $γ$-covering: An Information-Theoretic Framework for Context Engineering

Hai Huang

TL;DR

The paper tackles the challenge of selecting, compressing, and diversifying context for LLMs under budgets by introducing Directed Information γ-covering, a principled, query-agnostic framework that leverages directional predictive relationships among context chunks. It defines γ-covering via DI, develops a greedy submodular optimization with strong approximation guarantees, and establishes soundness and diversity properties, all while enabling offline precomputation that amortizes online cost. Empirically, the approach improves context compression, system prompt selection, and reranking on HotpotQA, with DIG-R diffusion-based reranking showing consistent gains when integrated with a strong retriever. The work demonstrates that self-organizing information-theoretic principles can stabilize and improve modern LLM pipelines, particularly under hard decision regimes and tight budgets, and points to future exploration in redundancy-rich and long-context settings.

Abstract

We introduce Directed Information $γ$-covering, a simple but general framework for redundancy-aware context engineering. Directed information (DI), a causal analogue of mutual information, measures asymmetric predictiveness between chunks. If $\operatorname{DI}_{i \to j} \ge H(C_j) - γ$, then $C_i$ suffices to represent $C_j$ up to $γ$ bits. Building on this criterion, we formulate context selection as a $γ$-cover problem and propose a greedy algorithm with provable guarantees: it preserves query information within bounded slack, inherits $(1+\ln n)$ and $(1-1/e)$ approximations from submodular set cover, and enforces a diversity margin. Importantly, building the $γ$-cover is query-agnostic: it incurs no online cost and can be computed once offline and amortized across all queries. Experiments on HotpotQA show that $γ$-covering consistently improves over BM25, a competitive baseline, and provides clear advantages in hard-decision regimes such as context compression and single-slot prompt selection. These results establish DI $γ$-covering as a principled, self-organizing backbone for modern LLM pipelines.
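The covering criterion and the greedy selection described in the abstract can be illustrated with a minimal sketch. It assumes a precomputed DI matrix and per-chunk entropies (how DI is estimated is not shown here), and the numbers below are made up for illustration. The function names `coverage_sets` and `greedy_gamma_cover`, the rule that every chunk covers itself, and the lowest-index tie-break are assumptions of this sketch, not details taken from the paper.

```python
from typing import List, Sequence, Set


def coverage_sets(di: Sequence[Sequence[float]],
                  entropy: Sequence[float],
                  gamma: float) -> List[Set[int]]:
    """covers[i] holds every chunk j that chunk i represents up to gamma bits."""
    n = len(entropy)
    covers = []
    for i in range(n):
        # Criterion from the abstract: DI_{i->j} >= H(C_j) - gamma.
        # Assumption of this sketch: a chunk always covers itself.
        covers.append({j for j in range(n)
                       if j == i or di[i][j] >= entropy[j] - gamma})
    return covers


def greedy_gamma_cover(di: Sequence[Sequence[float]],
                       entropy: Sequence[float],
                       gamma: float) -> List[int]:
    """Standard greedy set cover over the gamma-coverage sets."""
    covers = coverage_sets(di, entropy, gamma)
    uncovered = set(range(len(entropy)))
    selected: List[int] = []
    while uncovered:
        # Pick the chunk covering the most still-uncovered chunks;
        # ties go to the lowest index (an arbitrary choice).
        best = max(range(len(covers)),
                   key=lambda i: (len(covers[i] & uncovered), -i))
        selected.append(best)
        uncovered -= covers[best]
    return selected


# Hypothetical values in bits: chunk 0 predicts chunks 1 and 2 well,
# but not the other way around (DI is asymmetric).
H = [8.0, 8.0, 8.0, 8.0]
DI = [
    [0.0, 7.0, 6.5, 1.0],
    [3.0, 0.0, 2.0, 0.5],
    [2.5, 1.0, 0.0, 0.5],
    [0.5, 0.5, 0.5, 0.0],
]

print(greedy_gamma_cover(DI, H, gamma=2.0))  # [0, 3]
print(greedy_gamma_cover(DI, H, gamma=0.0))  # [0, 1, 2, 3]
```

With $γ = 2$, chunk 0 covers chunks 1 and 2, so only chunks 0 and 3 are kept; with $γ = 0$ no chunk covers another and nothing is pruned. Because the computation uses only the chunks and never a query, it can run once offline, which is the query-agnostic property the abstract highlights.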
