MutaBot: A Mutation Testing Approach for Chatbots

Michael Ferdinando Urrico, Diego Clerissi, Leonardo Mariani

TL;DR

The paper addresses the absence of mutation testing for conversational chatbots by introducing MutaBot, a mutation testing framework designed to be platform-agnostic that targets core chatbot elements such as intents, flows, entities, and contexts. It defines a meta-model and 24 mutation operators, implemented in a Dialogflow prototype and coordinated by a MutationManager with platform-specific adapters and XML configurations. An empirical evaluation on three Dialogflow chatbots, with test cases generated by Botium, shows that only 28–37% of mutants are killed; structure-related mutations are detected more often than flow- or parameter-related ones, which points to gaps in the generated test suites. The work argues for extending MutaBot to additional platforms (e.g., Rasa, Lex) and enriching the operator set, so that chatbot test suites can be assessed with fault-based techniques at scale and test-case generation tools can be improved.

Abstract

Mutation testing is a technique aimed at assessing the effectiveness of test suites by seeding artificial faults into programs. Although available for many platforms and languages, no mutation testing tool is currently available for conversational chatbots, which represent an increasingly popular solution to design systems that can interact with users through a natural language interface. Note that since conversations must be explicitly engineered by the developers of conversational chatbots, these systems are exposed to specific types of faults not supported by existing mutation testing tools. In this paper, we present MutaBot, a mutation testing tool for conversational chatbots. MutaBot addresses mutations at multiple levels, including conversational flows, intents, and contexts. We designed the tool to potentially target multiple platforms, while we implemented initial support for Google Dialogflow chatbots. We assessed the tool with three Dialogflow chatbots and test cases generated with Botium, revealing weaknesses in the test suites.
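
To make the idea of seeding faults into a chatbot and counting killed mutants concrete, here is a minimal sketch. It is not MutaBot's implementation: the dictionary-based chatbot model, the exact-match `match_intent` resolver, and the delete-training-phrase operator are all hypothetical simplifications chosen for illustration, and the operator is not necessarily one of the paper's 24.

```python
from copy import deepcopy

# Hypothetical, simplified chatbot model: intent name -> training phrases.
# This is NOT the Dialogflow export format or MutaBot's meta-model.
CHATBOT = {
    "greet": ["hello", "hi there"],
    "order_pizza": ["i want a pizza", "order pizza"],
}


def match_intent(bot, utterance):
    """Toy exact-match intent resolver; returns None as the fallback."""
    for intent, phrases in bot.items():
        if utterance in phrases:
            return intent
    return None


def delete_phrase_mutants(bot):
    """Yield one mutant per training phrase, each with that phrase removed."""
    for intent, phrases in bot.items():
        for i in range(len(phrases)):
            mutant = deepcopy(bot)  # leave the original chatbot untouched
            del mutant[intent][i]
            yield mutant


def mutation_score(bot, tests):
    """Fraction of mutants killed, i.e. failing at least one test case."""
    mutants = list(delete_phrase_mutants(bot))
    killed = sum(
        any(match_intent(m, utterance) != expected for utterance, expected in tests)
        for m in mutants
    )
    return killed / len(mutants)


# Each test case pairs a user utterance with the intent it should trigger.
TESTS = [("hello", "greet"), ("order pizza", "order_pizza")]
print(f"{mutation_score(CHATBOT, TESTS):.0%} of mutants killed")
```

In this toy setup, mutants that drop a phrase no test exercises survive, which is the kind of test-suite weakness a low mutation score exposes.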

Paper Structure

This paper contains 7 sections, 2 figures, 2 tables.

Figures (2)

  • Figure 1: Chatbot structure meta-model adapted from Cañizares et al. (2022).
  • Figure 2: MutaBot architecture.