Semantic Decomposition and Selective Context Filtering -- Text Processing Techniques for Context-Aware NLP-Based Systems
Karl John Villardar
TL;DR
The paper addresses the challenge of integrating LLMs into context-rich, real-world workflows, identifying core limitations such as limited context windows, memory constraints, and hallucinations. It introduces two techniques, Semantic Decomposition and Selective Context Filtering, which impose structured prompt schemas and prune irrelevant context. Together they enable more coherent, context-aware interactions within LLM pipelines and build on existing approaches such as RAG, CoT, and Structured Outputs. A formal evaluation framework uses the synthetic datasets SynPrompt and SynAsst, plus the Exponential Consistency Index (ECI), defined as $\bar{S} = \frac{1}{d} \sum_{i=1}^{d} \left(\frac{\hat{k}_i}{k}\right)^{\alpha}$, to assess consistency under varying decomposition depths and filtering strategies. Experimental results reveal trade-offs between decomposition depth, context exposure, and filtering strategy, with vector-based filtering often outperforming LLM-based methods and smaller models occasionally delivering more stable performance. Overall, the work provides a practical pathway to improving LLM-to-system interfaces for domains that require dynamic, context-sensitive responses and efficient workflow automation.
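As a rough illustration of the ECI formula quoted above, the following Python sketch evaluates $\bar{S}$ directly from its definition. The summary does not define the symbols, so the reading used here is an assumption: $d$ is the number of evaluated items, $k$ is the number of trials per item, $\hat{k}_i$ is the number of consistent trials for item $i$, and $\alpha$ is an exponent that penalizes partial consistency. The function name and argument names are hypothetical.

```python
def exponential_consistency_index(k_hat, k, alpha):
    """Compute S = (1/d) * sum((k_hat_i / k) ** alpha) over d items.

    Assumed meanings (not confirmed by the source):
      k_hat -- per-item counts of consistent trials (length d)
      k     -- number of trials run per item
      alpha -- exponent applied to each per-item ratio
    """
    if k <= 0:
        raise ValueError("k must be positive")
    if not k_hat:
        raise ValueError("k_hat must contain at least one item")
    d = len(k_hat)
    return sum((k_i / k) ** alpha for k_i in k_hat) / d


# Two items, four trials each; the second item is consistent in only 2 of 4.
score = exponential_consistency_index([4, 2], k=4, alpha=2)
print(score)
```

Under this reading, an exponent greater than 1 makes a partially consistent item contribute less than its raw ratio, so the index rewards items that are consistent across nearly all trials.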
Abstract
In this paper, we present two techniques for use in context-aware systems: Semantic Decomposition, which sequentially decomposes input prompts into a structured, hierarchical information schema that systems can easily parse and process, and Selective Context Filtering, which enables systems to systematically filter out irrelevant sections of the contextual information fed through a system's NLP-based pipeline. We explore how context-aware systems and applications can use these two techniques to implement dynamic LLM-to-system interfaces, improve an LLM's ability to generate more contextually cohesive user-facing responses, and optimize complex automated workflows and pipelines.
