Agentic Multi-Source Grounding for Enhanced Query Intent Understanding: A DoorDash Case Study

Emmanuel Aboah Boateng, Kyle MacDonald, Akshad Viswanathan, Sudeep Das

TL;DR

An Agentic Multi-Source Grounded system maps ambiguous, context-sparse queries to business categories by grounding LLM inference in a staged catalog entity retrieval pipeline and an agentic web-search tool invoked autonomously for cold-start queries. It addresses two failure modes, the winner-takes-all assignment of traditional classifiers and the hallucination of unavailable inventory by general-purpose LLMs, and establishes a generalizable paradigm for applications that need foundation models grounded in proprietary context and real-time web knowledge to resolve such decision problems at scale.

Abstract

Accurately mapping user queries to business categories is a fundamental Information Retrieval challenge for multi-category marketplaces, where context-sparse queries such as "Wildflower" exhibit intent ambiguity, simultaneously denoting a restaurant chain, a retail product, and a floral item. Traditional classifiers force a winner-takes-all assignment, while general-purpose LLMs hallucinate unavailable inventory. We introduce an Agentic Multi-Source Grounded system that addresses both failure modes by grounding LLM inference in (i) a staged catalog entity retrieval pipeline and (ii) an agentic web-search tool invoked autonomously for cold-start queries. Rather than predicting a single label, the model emits an ordered multi-intent set, resolved by a configurable disambiguation layer that applies deterministic business policies and is designed for extensibility to personalization signals. This decoupled design generalizes across domains, allowing any marketplace to supply its own grounding sources and resolution rules without modifying the core architecture. Evaluated on DoorDash's multi-vertical search platform, the system achieves +10.9pp over the ungrounded LLM baseline and +4.6pp over the legacy production system. On long-tail queries, incremental ablations attribute +8.3pp to catalog grounding, +3.2pp to agentic web search grounding, and +1.5pp to dual intent disambiguation, yielding 90.7% accuracy (+13.0pp over baseline). The system is deployed in production, serving over 95% of daily search impressions, and establishes a generalizable paradigm for applications requiring foundation models grounded in proprietary context and real-time web knowledge to resolve ambiguous, context-sparse decision problems at scale.
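
The decoupling described above, in which the model emits an ordered multi-intent set and a separate layer applies deterministic policies, can be illustrated with a minimal sketch. Everything named here (`Intent`, `Policy`, `prefer_catalog_grounded`, `resolve`, and the `"catalog"`/`"web"` evidence labels) is a hypothetical stand-in chosen for illustration; the abstract does not specify the production interfaces or the actual business policies.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Sequence


@dataclass(frozen=True)
class Intent:
    """One candidate interpretation of a query (hypothetical schema)."""
    category: str  # e.g. "restaurant" or "retail"
    evidence: str  # assumed label for the grounding source: "catalog" or "web"


# A policy inspects the ordered intents and either picks one or abstains (None).
Policy = Callable[[Sequence[Intent]], Optional[Intent]]


def prefer_catalog_grounded(intents: Sequence[Intent]) -> Optional[Intent]:
    """Illustrative policy: pick the highest-ranked catalog-backed intent."""
    for intent in intents:
        if intent.evidence == "catalog":
            return intent
    return None


def resolve(intents: Sequence[Intent], policies: Sequence[Policy]) -> Intent:
    """Apply policies in order; fall back to the model's top-ranked intent."""
    if not intents:
        raise ValueError("expected at least one intent")
    for policy in policies:
        choice = policy(intents)
        if choice is not None:
            return choice
    return intents[0]


# Ordered multi-intent set for an ambiguous query such as "Wildflower".
wildflower = [Intent("retail", "web"), Intent("restaurant", "catalog")]
primary = resolve(wildflower, [prefer_catalog_grounded])
```

Because the policies are passed in as a plain list, a different marketplace could swap in its own resolution rules without touching `resolve`, which is the kind of extensibility the abstract attributes to the disambiguation layer.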

Paper Structure (11 sections, 7 equations, 4 figures, 2 tables)

Figures (4)

  • Figure 1: Intent Ambiguity for the query "Wildflower." (Left) The Restaurant category surfaces the fictional Wildflower Bites chain. (Right) The Retail category surfaces literal flower products. Our system captures both interpretations to enable context-aware routing.
  • Figure 2: System Architecture Overview. The pipeline illustrates the multi-source evidence retrieval process (steps 2–4, 6), the dual-intent reasoning engine (step 5), and the pluggable disambiguation layer (step 7) that populates the production cache.
  • Figure 3: SOT Accuracy by Segment.
  • Figure 4: Component Ablation on SOT Tail ($N{=}4{,}993$). Cumulative +13.0pp lift over the ungrounded baseline.
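
As a consistency check on the long-tail ablation, the three incremental lifts reported in the abstract sum to the cumulative figure quoted for Figure 4. The baseline accuracy below is not stated in the text; it is only implied by subtracting the cumulative lift from the reported final accuracy.

$$
\underbrace{8.3}_{\text{catalog}} + \underbrace{3.2}_{\text{web search}} + \underbrace{1.5}_{\text{disambiguation}} = 13.0\ \text{pp},
\qquad
90.7\% - 13.0\ \text{pp} = 77.7\%\ \text{(implied ungrounded baseline)}
$$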