Exploring a Multimodal Chatbot as a Facilitator in Therapeutic Art Activity

Le Lin; Zihao Zhu; Rainbow Tin Hung Ho; Jing Liao; Yuhan Luo

TL;DR

This work-in-progress paper presents an MLLM-powered chatbot that analyzes visual creations in real time while engaging the creator in reflective conversation. An evaluation with five experts highlighted several areas for future development, including entryways and risk management, bespoke alignment of user profile and therapeutic style, balancing conversational depth and width, and enriching visual interactivity.

Abstract

Therapeutic art activities, such as expressive drawing and painting, rely on the synergy between creative visual production and interactive dialogue. Recent advances in Multimodal Large Language Models (MLLMs) have expanded the capacity of computing systems to interpret both textual and visual data, offering a new frontier for AI-mediated therapeutic support. This work-in-progress paper introduces an MLLM-powered chatbot that analyzes visual creations in real time while engaging the creator in reflective conversations. We conducted an evaluation with five experts in art therapy and related fields, which demonstrated the chatbot's potential to facilitate therapeutic engagement and highlighted several areas for future development, including entryways and risk management, bespoke alignment of user profile and therapeutic style, balancing conversational depth and width, and enriching visual interactivity. These themes provide a roadmap for designing future AI-mediated creative expression tools.

Paper Structure (12 sections, 2 figures, 1 table)

This paper contains 12 sections, 2 figures, 1 table.

Introduction and Backgrounds
System Design and Implementation
Design Rationale
Real-time drawing analysis without making assumptions.
Promoting active expression while allowing silence.
Encouraging visual externalization.
System Overview
Front-end
Back-end
Method
Findings
Future Work

Figures (2)

Figure 1: An overview of the system architecture. The image understanding module takes the client's drawing and the designed prompts as input, producing an image summary that the conversation module uses to generate responses.
Figure 2: The artworks created by the expert participants using our system during the study.
