Genicious: Contextual Few-shot Prompting for Insights Discovery

Vineet Kumar, Ronald Tony, Darshita Rathore, Vipasha Rana, Bhuvanesh Mandora, Kanishka, Chetna Bansal, Anindya Moitra

TL;DR

Genicious tackles dynamic insights discovery over relational data without exposing data to external LLMs. It uses a Text-to-SQL approach that maps a natural-language query $Q$ and schema $S$ to an executable SQL query $y$ via the conditional model $P(y|\mathcal{T}(Q,S))=\prod_{i=1}^{|y|}P(y_i|\mathcal{T}(Q,S),y_{<i})$. The key contributions are (a) benchmarking open-source and proprietary LLMs for SQL generation, (b) Contextual Few-shot Prompting with Retrieval-Augmented Generation to supply dynamic demonstrations, and (c) an end-to-end system with offline onboarding and online query processing using a Milvus vector store and FAISS-derived similarity search. Empirical results show improved accuracy and latency on Spider and domain-specific data, with GPT-3.5 Turbo delivering strong domain performance while maintaining data confidentiality.
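The retrieval-then-prompt idea in contribution (b) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration, not the authors' implementation: the demonstration pool `DEMOS`, the helper names `embed`, `cosine`, `retrieve`, and `build_prompt`, and the prompt layout are all hypothetical. A toy bag-of-words cosine similarity stands in for the embedding model and vector-store search that the system description mentions. Reading $\mathcal{T}(Q,S)$ as the prompt-construction step is likewise an interpretation.

```python
import math
import re
from collections import Counter

# Hypothetical demonstration pool of (question, SQL) pairs; in a real system
# these would be stored as embeddings in a vector store during onboarding.
DEMOS = [
    ("How many customers are there?", "SELECT COUNT(*) FROM customers;"),
    ("List the names of all products", "SELECT name FROM products;"),
    ("What is the total revenue per region?",
     "SELECT region, SUM(revenue) FROM sales GROUP BY region;"),
    ("How many orders were placed in 2023?",
     "SELECT COUNT(*) FROM orders WHERE year = 2023;"),
]


def embed(text):
    """Toy bag-of-words vector; an assumed stand-in for a learned embedding."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def retrieve(question, demos, k=2):
    """Return the k demonstrations whose questions are most similar to `question`."""
    q = embed(question)
    ranked = sorted(demos, key=lambda d: cosine(q, embed(d[0])), reverse=True)
    return ranked[:k]


def build_prompt(question, schema, demos, k=2):
    """Assemble a few-shot prompt: schema, k retrieved demonstrations, then the query."""
    parts = ["-- Schema", schema, ""]
    for demo_question, demo_sql in retrieve(question, demos, k):
        parts += [f"-- Question: {demo_question}", demo_sql, ""]
    parts += [f"-- Question: {question}", "SELECT"]
    return "\n".join(parts)


if __name__ == "__main__":
    print(build_prompt("How many orders are there?", "orders(id, year)", DEMOS))
```

Because the demonstrations are chosen per query rather than fixed in advance, the prompt stays short while still showing the model SQL patterns close to the one it has to produce. Only the schema and the example pairs enter the prompt in this sketch, not the table contents.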

Abstract

Data and insights discovery is critical for decision-making in modern organizations. We present Genicious, an LLM-aided interface that enables users to interact with tabular datasets and ask complex queries in natural language. By benchmarking various prompting strategies and language models, we have developed an end-to-end tool that leverages contextual few-shot prompting, achieving superior performance in terms of latency, accuracy, and scalability. Genicious empowers stakeholders to explore, analyze and visualize their datasets efficiently while ensuring data security through role-based access control and a Text-to-SQL approach.

Paper Structure

This paper contains 11 sections, 1 equation, 3 figures, 2 tables.

Figures (3)

  • Figure 1: Number of examples ($k$) in context vs. accuracy
  • Figure 2: Performance of LLMs w.r.t. difficulty level on the Spider (Yu et al., 2018) test set
  • Figure 3: Tool Architecture