Automatic Generation of Conversational Interfaces for Tabular Data Analysis

Marcos Gomez-Vazquez, Jordi Cabot, Robert Clarisó

TL;DR

The paper tackles the barrier of accessing tabular data by presenting a no-code pipeline that auto-generates conversational chatbots from data schemas. It combines an intent- and entity-based architecture with a fallback LLM-assisted English-to-SQL path, all orchestrated by the DataBot platform to deliver table and chart outputs. The approach automatically infers and optionally enriches data schemas, generates intents and entities, and deploys fully functional bots, addressing both accuracy and scalability concerns in open data contexts. This hybrid solution aims to democratize data exploration for non-technical users while maintaining safe, explainable analytics suitable for public administrations. The work also outlines future directions, including ontology-driven semantic enrichment and expanded data-source support, and provides an open-source implementation.
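
The hybrid routing described above can be pictured with a small sketch. This is an illustration only, not the DataBot implementation: the `Answer` type, the `match_intent` and `answer` functions, the table name `data`, and the `text_to_sql` callable (standing in for an LLM-backed English-to-SQL service) are all hypothetical names introduced here. It shows the general idea of trying a deterministic intent match first and only then delegating to a fallback, while recording whether the SQL was AI-generated so the interface can say so.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Answer:
    source: str         # "intent" or "llm_fallback"
    sql: str            # query that would be run against the table
    ai_generated: bool  # lets a UI flag AI-generated SQL to the user


def match_intent(question: str, columns: list[str]) -> Optional[str]:
    """Toy matcher (assumption): knows only two question shapes."""
    q = question.lower()
    if "how many rows" in q:
        return "SELECT COUNT(*) FROM data"
    for col in columns:
        if f"average {col.lower()}" in q:
            return f'SELECT AVG("{col}") FROM data'
    return None  # no intent recognised


def answer(question: str, columns: list[str],
           text_to_sql: Callable[[str], str]) -> Answer:
    """Prefer the deterministic intent path; fall back otherwise."""
    sql = match_intent(question, columns)
    if sql is not None:
        return Answer("intent", sql, ai_generated=False)
    # Hypothetical fallback: text_to_sql would wrap an LLM call.
    return Answer("llm_fallback", text_to_sql(question), ai_generated=True)
```

In this sketch the fallback is passed in as a plain function, so it can be replaced by a stub when no language model is available. The `ai_generated` flag mirrors the idea of telling the user when an answer came from generated SQL.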

Abstract

Tabular data is the most common format to publish and exchange structured data online. A clear example is the growing number of open data portals published by public administrations. However, exploitation of these data sources is currently limited to technical users who can manipulate and process such data programmatically. As an alternative, we propose the use of chatbots to offer a conversational interface that facilitates the exploration of tabular data sources, including support for data analytics questions that are answered with charts rendered by the chatbot. Moreover, our chatbots are automatically generated from the data source itself by instantiating a configurable collection of conversation patterns matched to the chatbot intents and entities.
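
To make the idea of instantiating conversation patterns from a data source more concrete, here is a minimal sketch under stated assumptions. The `PATTERNS` table, the two-way numeric/text typing, and the functions `infer_schema` and `generate_intents` are hypothetical and invented for this illustration; the paper's actual patterns and schema inference are not reproduced here. The sketch assumes a CSV input with a header row and at least one data row.

```python
import csv
import io

# Hypothetical conversation patterns, keyed by inferred column type.
# "{value}" is left in place as a slot that an entity would fill.
PATTERNS = {
    "numeric": ["what is the average {col}", "what is the maximum {col}"],
    "text": ["show rows where {col} is {value}", "how many rows per {col}"],
}


def _is_number(text: str) -> bool:
    try:
        float(text)
        return True
    except ValueError:
        return False


def infer_schema(csv_text: str) -> dict[str, str]:
    """Label each column 'numeric' if every value parses as a float."""
    reader = csv.DictReader(io.StringIO(csv_text))
    rows = list(reader)
    schema = {}
    for col in reader.fieldnames or []:
        values = [row[col] for row in rows]
        is_numeric = all(_is_number(v) for v in values)
        schema[col] = "numeric" if is_numeric else "text"
    return schema


def generate_intents(schema: dict[str, str]) -> dict[str, list[str]]:
    """Instantiate every pattern of the matching type for each column."""
    return {
        col: [p.replace("{col}", col) for p in PATTERNS[kind]]
        for col, kind in schema.items()
    }
```

For a table with a text column `city` and a numeric column `salary`, this produces training phrases such as "what is the average salary", with the distinct values of `city` being natural candidates for entity values.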

Paper Structure (14 sections, 5 figures)

Figures (5)

  • Figure 1: Diagram of the architecture of the generated chatbots.
  • Figure 2: The chatbot generation process.
  • Figure 3: Screenshot of the admin User Interface, where bots can be executed and the data schemas can be enhanced.
  • Figure 4: Screenshot of the interactive dashboard showing a graphic answer (a histogram) generated by the bot.
  • Figure 5: Screenshot of the interactive dashboard showing a tabular answer generated by the bot through the fallback mechanism powered by GPT-4. On the top-right side, there is an information box indicating that the displayed answer has been obtained after running an AI-generated SQL statement, and the actual SQL is also shown.