An Embodied Companion for Visual Storytelling

Patrick Tresset, Markus Wulfmeier

Abstract

As artificial intelligence shifts from a pure tool for delegation toward agentic collaboration, its use in the arts can move beyond the exploration of machine autonomy toward synergistic co-creation. While our earlier robotic works used automation to distance the artist's intent from the final mark, we present Companion: an artistic apparatus that integrates a drawing robot with Large Language Models (LLMs) to re-center human-machine presence. By leveraging in-context learning and real-time tool use, the system engages in bidirectional interaction via speech and sketching. This approach transforms the robot from a passive executor into a playful co-creative partner capable of driving shared visual storytelling into unexpected aesthetic territories. To validate this collaborative shift, we employed the Consensual Assessment Technique (CAT) with a panel of seven art-world experts. Results confirm that the system produces works with a distinct aesthetic identity and professional exhibition merit, demonstrating the potential of AI as a highly capable artistic collaborator.

Paper Structure (40 sections, 18 figures)

Figures (18)

  • Figure 1: Companion overview. The system drives an RNP robot composed of a planar 4-jointed arm, a pan-and-tilt camera, a microphone, and a speaker. Companion is composed of three modules: Yarpcom, an interface with RNP; the Drawing system, which implements drawing tools such as draw line, draw segments, and draw circles; and the Drawing agent, which manages communication with Gemini via the API, including initialization with the system instructions and context.
  • Figure 2: The feather illustrates the 'human artistic director' scenario. Note the elongated arm drawn by the robot to 'catch' the feather, deviating from the user's suggestion to reposition the character. (Experiment \ref{artistic-director})
  • Figure 3: The Hare and the Turtle (Experiment \ref{hare-turtle-exp}). A sequential representation of the fable. The turtle (multiple figures) progresses steadily across the page, while the hare stops early. The vertical line on the right represents the finish line.
  • Figure 4: The Crow and the Fox (Experiment \ref{crow-fox}). Left: the first attempt. Middle: the two drawings shown by the user. Right: the agent's second attempt, with improved depictions of the crow and the fox.
  • Figure 5: Try again. Fail again. Fail better (Experiment \ref{try-fail}). The four images show the different ways Companion illustrated the sentence.
  • ...and 13 more figures
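
The Figure 1 caption describes a Drawing system that implements tools such as draw line, draw segments, and draw circles, and a Drawing agent that relays requests from the language model. The sketch below shows one way such a tool layer could be organized. It is an illustration only, not the authors' implementation: the function names, argument names, the `{"name": ..., "args": ...}` call format, and the stroke representation are all assumptions, and no robot or Gemini API call is involved.

```python
import math
from typing import Callable, Dict, List, Tuple

Point = Tuple[float, float]
Stroke = List[Point]  # one pen-down polyline (hypothetical representation)


def draw_line(x0: float, y0: float, x1: float, y1: float) -> List[Stroke]:
    """A single straight stroke between two points."""
    return [[(x0, y0), (x1, y1)]]


def draw_segments(points: List[List[float]]) -> List[Stroke]:
    """One connected polyline through the given points."""
    return [[(float(x), float(y)) for x, y in points]]


def draw_circle(cx: float, cy: float, r: float, steps: int = 36) -> List[Stroke]:
    """Approximate a circle with a closed polyline of `steps` segments."""
    pts = [
        (cx + r * math.cos(2 * math.pi * i / steps),
         cy + r * math.sin(2 * math.pi * i / steps))
        for i in range(steps + 1)
    ]
    return [pts]


# Hypothetical registry: tool name as the model would emit it -> implementation.
TOOLS: Dict[str, Callable[..., List[Stroke]]] = {
    "draw_line": draw_line,
    "draw_segments": draw_segments,
    "draw_circle": draw_circle,
}


def dispatch(call: dict) -> List[Stroke]:
    """Route one tool call of the assumed form {"name": ..., "args": {...}}."""
    name = call.get("name")
    if name not in TOOLS:
        raise ValueError(f"unknown tool: {name!r}")
    return TOOLS[name](**call.get("args", {}))
```

In this arrangement the agent side only needs to parse the model's tool call into a dictionary and hand it to `dispatch`; the returned strokes would then be passed to whatever layer drives the arm. Keeping the tools as plain functions that return geometry makes them testable without hardware.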