Exploring a Multimodal Chatbot as a Facilitator in Therapeutic Art Activity

Le Lin, Zihao Zhu, Rainbow Tin Hung Ho, Jing Liao, Yuhan Luo

TL;DR

This work-in-progress paper introduces an MLLM-powered chatbot that analyzes visual creation in real time while engaging the creator in reflective conversations. An evaluation with five experts highlighted several areas for future development, including entryways and risk management, bespoke alignment of user profile and therapeutic style, balancing conversational depth and width, and enriching visual interactivity.

Abstract

Therapeutic art activities, such as expressive drawing and painting, rely on the synergy between creative visual production and interactive dialogue. Recent advancements in Multimodal Large Language Models (MLLMs) have expanded the capacity of computing systems to interpret both textual and visual data, offering a new frontier for AI-mediated therapeutic support. This work-in-progress paper introduces an MLLM-powered chatbot that analyzes visual creation in real time while engaging the creator in reflective conversations. We conducted an evaluation with five experts in art therapy and related fields, which demonstrated the chatbot's potential to facilitate therapeutic engagement and highlighted several areas for future development, including entryways and risk management, bespoke alignment of user profile and therapeutic style, balancing conversational depth and width, and enriching visual interactivity. These themes provide a roadmap for designing future AI-mediated creative expression tools.

Paper Structure (12 sections, 2 figures, 1 table)

Figures (2)

  • Figure 1: An overview of the system architecture. The image understanding module takes the client's drawing and the designed prompts as input, producing an image summary that the conversation module uses to generate responses.
  • Figure 2: The artworks created by the expert participants using our system during the study.
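
The two-stage flow described in the Figure 1 caption can be illustrated with a minimal sketch. This is not the authors' implementation: the class name `ArtChatPipeline`, the callable types, the default prompt text, and the stand-in functions are all hypothetical, and in a real system the two callables would wrap calls to an MLLM.

```python
from dataclasses import dataclass, field
from typing import Callable, List

# Hypothetical interface: (drawing bytes, designed prompt) -> image summary text.
ImageSummarizer = Callable[[bytes, str], str]
# Hypothetical interface: (image summary, chat history) -> chatbot reply.
Responder = Callable[[str, List[str]], str]


@dataclass
class ArtChatPipeline:
    """Sketch of an image understanding module feeding a conversation module."""
    summarize_image: ImageSummarizer
    generate_reply: Responder
    # Assumed placeholder prompt, not taken from the paper.
    image_prompt: str = "Describe the drawing's content, colors, and composition."
    history: List[str] = field(default_factory=list)

    def turn(self, drawing: bytes, user_message: str) -> str:
        # Stage 1: the image understanding module produces an image summary.
        summary = self.summarize_image(drawing, self.image_prompt)
        # Stage 2: the conversation module uses the summary and the dialogue so far.
        self.history.append(f"user: {user_message}")
        reply = self.generate_reply(summary, self.history)
        self.history.append(f"bot: {reply}")
        return reply


# Stand-in functions so the sketch runs without any model access.
def fake_summarizer(image: bytes, prompt: str) -> str:
    return f"a drawing of {len(image)} bytes"


def fake_responder(summary: str, history: List[str]) -> str:
    return f"I see {summary}. Tell me more about it."


pipeline = ArtChatPipeline(fake_summarizer, fake_responder)
print(pipeline.turn(b"\x00" * 4, "I drew my garden."))
```

Passing the two modules in as plain callables keeps the sketch independent of any particular model or API, so either stage could be swapped out without touching the turn logic.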