BanglAssist: A Bengali-English Generative AI Chatbot for Code-Switching and Dialect-Handling in Customer Service

Francesco Kruk, Savindu Herath, Prithwiraj Choudhury

TL;DR

BanglAssist addresses the pressing challenge of deploying GenAI chatbots in multilingual, code-switched customer service contexts by combining retrieval-augmented generation with targeted prompt engineering. Using a GPT-4o backbone and a two-stage retrieval pipeline grounded in a curated FAQ set, the system translates multilingual queries to English for robust retrieval and then generates contextually accurate, language- and script-consistent responses with minimal hallucination. Evaluation on 20 queries demonstrates strong language matching and respectable retrieval and generation performance ($\text{acc} = 0.81$ overall), while revealing trade-offs between reranking benefits and precision. The work contributes to inclusive HCI by enabling more accurate, culturally aware, and scalable multilingual customer service experiences, and points to future extensions across additional languages and dialects.
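
The flow summarized above (translate the query to English, retrieve FAQ candidates, rerank them, then prompt the generator to reply in the user's language and script) can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' implementation: the FAQ entries, the glossary-based `translate_to_english` stand-in, the token-overlap scoring functions, and the prompt wording are all assumptions introduced here, and the final call to GPT-4o is deliberately left out.

```python
# Hypothetical FAQ entries; the paper's curated FAQ set is not reproduced here.
FAQS = [
    {"q": "How do I reset my password?",
     "a": "Use the 'Forgot password' link on the login page."},
    {"q": "How can I track my order?",
     "a": "Open 'My Orders' and select the order to see its status."},
    {"q": "What is the refund policy?",
     "a": "Refunds are issued within 7 days of a return."},
]

# Toy glossary of romanized Bengali words, for illustration only.
GLOSSARY = {"amar": "my", "kothay": "where", "kivabe": "how"}


def translate_to_english(query: str) -> str:
    """Stand-in for the translation step (an LLM call in a real system)."""
    return " ".join(GLOSSARY.get(w, w) for w in query.lower().split())


def tokens(text: str) -> set[str]:
    """Lowercase, split on whitespace, and strip surrounding punctuation."""
    return {w.strip("?.,!'\"") for w in text.lower().split()} - {""}


def retrieve(query_en: str, k: int = 2) -> list[dict]:
    """Stage 1 (assumed scorer): rank FAQ questions by Jaccard token overlap."""
    q = tokens(query_en)

    def jaccard(faq: dict) -> float:
        t = tokens(faq["q"])
        return len(q & t) / len(q | t)

    return sorted(FAQS, key=jaccard, reverse=True)[:k]


def rerank(query_en: str, candidates: list[dict]) -> list[dict]:
    """Stage 2 (assumed scorer): reorder by the share of query tokens
    found in the FAQ question plus its answer."""
    q = tokens(query_en)

    def coverage(faq: dict) -> float:
        t = tokens(faq["q"] + " " + faq["a"])
        return len(q & t) / max(len(q), 1)

    return sorted(candidates, key=coverage, reverse=True)


def build_prompt(user_query: str, context: list[dict]) -> str:
    """Assemble a grounded prompt; the original query is kept so the
    generator can match the user's language and script."""
    faq_text = "\n".join(f"Q: {f['q']}\nA: {f['a']}" for f in context)
    return (
        "Answer using only the FAQ context below. "
        "Reply in the same language and script as the user.\n\n"
        f"Context:\n{faq_text}\n\nUser: {user_query}"
    )


if __name__ == "__main__":
    user_query = "Amar order kothay track"  # hypothetical code-switched query
    query_en = translate_to_english(user_query)
    context = rerank(query_en, retrieve(query_en, k=2))[:1]
    print(build_prompt(user_query, context))
```

Retrieval runs on the English translation while the prompt carries the original query, which mirrors the separation the summary describes between robust retrieval and language-consistent generation. A real system would likely replace both scorers with embedding similarity and a dedicated reranking model.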

Abstract

In recent years, large language models (LLMs) have improved rapidly, promising transformative opportunities across various industries. Their ability to generate human-like text and remain continuously available facilitates the creation of interactive service chatbots aimed at enhancing customer experience and streamlining enterprise operations. Despite their potential, LLMs face critical challenges, such as a susceptibility to hallucinations and difficulties handling complex linguistic scenarios, notably code-switching and dialectal variations. To address these challenges, this paper describes the design of a multilingual chatbot for Bengali-English customer service interactions utilizing retrieval-augmented generation (RAG) and targeted prompt engineering. This research provides valuable insights for the human-computer interaction (HCI) community, emphasizing the importance of designing systems that accommodate linguistic diversity to benefit both customers and businesses. By addressing the intersection of generative AI and cultural heterogeneity, this late-breaking work aims to inspire future innovations in multilingual and multicultural HCI.

Paper Structure

This paper contains 21 sections, 3 equations, 4 figures, 4 tables.

Figures (4)

  • Figure 1: Pipeline of the multilingual customer service chatbot BanglAssist
  • Figure 2: 3D principal component analysis representation of two sentences, embedded in 3 different linguistic variations
  • Figure 3: Prompt used to generate BanglAssist's replies
  • Figure 4: Screenshots of the chatbot implementation in Streamlit: (a) Home screen of the chatbot, showing three FAQs; (b) Answer printed from the FAQ database; (c) Answer generated through GPT-4o based on the user question and the context retrieved from the FAQ database