Making Databases Searchable with Deep Context
Alekh Jindal, Shi Qiao, Shivani Tripathi, Niloy Debnath, Kunal Singh, Pushpanjali Nema, Sharath Prakash, Aditya Halder, Ronith PR, Sadiq Mohammed, Abdul Hameed, Karan Hanswadkar, Ayush Kshitij, Sarthak Bhatt, Rony Chatterjee, Jyoti Pandey, Christina Pavlopoulou, Ravi Shetye
TL;DR
The paper presents Tursio, a system that makes enterprise databases searchable in natural language by constructing a semantic knowledge graph over diverse data sources and orchestrating NL query processing with LLMs across modeling, planning, and reasoning. It combines automatic data-model grounding, hash-based intent localization, multi-step NL query planning, and an interactive interface with explainability and governance features, delivering results directly in natural language. Extensive evaluation across production workloads and benchmarks (including BIRD-DEV, BEAVER, and TPC-H) shows high table-retrieval precision, robust join inference in well-structured schemas, and strong SQL-structure fidelity, while highlighting variability tied to schema regularity. The approach demonstrates tangible enterprise impact by enabling non-expert users to discover, reason about, and act on data through an accessible, secure, and auditable search experience, and outlines a roadmap for handling newer data modalities and agent-based workflows.
Abstract
Databases are the most critical assets for enterprises, yet they remain largely inaccessible to the people who make the most important decisions. In this paper, we describe the Tursio search platform, which builds an abstraction layer, a semantic knowledge graph, over the underlying databases to make them searchable in natural language. Tursio infuses large language models (LLMs) into every part of the query processing stack, including data modeling, query compilation, query planning, and result reasoning. This allows Tursio to process natural language queries systematically using techniques from traditional query planning and rewriting, rather than black-box memorization. We describe the architecture of Tursio in detail and present a comprehensive evaluation on production workloads as well as synthetic and realistic benchmarks. Our results show that Tursio achieves high accuracy while being efficient and scalable, making databases truly searchable for non-expert users.
