Conversational Assistants in Knowledge-Intensive Contexts: An Evaluation of LLM- versus Intent-based Systems

Samuel Kernan Freire, Chaofan Wang, Evangelos Niforatos

TL;DR

This paper investigates whether LLM-based conversational assistants offer tangible advantages over traditional intent-based CAs for knowledge management tasks in workplace contexts. It conducts a between-group lab study in which participants perform eight KM-related tasks using either an intent-based CA or an LLM-based CA with Retrieval-Augmented Generation, measuring interaction efficiency, usability, workload, and user experience. The results show that LLM-based CAs achieve higher task completion rates, better SUS and UEQ scores, and a more favorable perceived performance, with similar task times; however, hallucination risk remains a concern. Overall, the work provides empirical support for adopting LLM-based NLP in KM while highlighting practical safety, reliability, and design considerations for real-world deployment.

Abstract

Conversational Assistants (CAs) are increasingly supporting human workers in knowledge management. Traditionally, CAs respond in specific ways to predefined user intents and conversation patterns. However, this rigidity does not handle the diversity of natural language well. Recent advances in natural language processing, namely Large Language Models (LLMs), enable CAs to converse in a more flexible, human-like manner, extracting relevant information from texts and capturing information from expert humans, but they also introduce new challenges such as "hallucinations". To assess the potential of using LLMs for knowledge management tasks, we conducted a user study comparing an LLM-based CA to an intent-based system regarding interaction efficiency, user experience, workload, and usability. The study revealed that LLM-based CAs exhibited better user experience, task completion rate, usability, and perceived performance than intent-based systems, suggesting that switching NLP techniques can be beneficial in the context of knowledge management.

Paper Structure (18 sections, 4 figures, 1 table)

Figures (4)

  • Figure 1: (Conversational) User Interfaces (UIs) of (a) the Intent-based and (b) the LLM-based cognitive assistants.
  • Figure 2: Task time (a), Task completion rate* (b), and System usability score* (c) between the Intent and LLM groups
  • Figure 3: User Experience (UEQ scores) between the LLM and Intent groups
  • Figure 4: Workload (NASA-TLX) between the LLM and Intent conditions