Exploring Mobile Touch Interaction with Large Language Models
Tim Zindulka, Jannek Sekowski, Florian Lehmann, Daniel Buschek
TL;DR
This work charts a four-dimensional design space for mobile touch interaction with large language models and demonstrates two continuous gestures, spread-to-generate and pinch-to-shorten, that control text generation directly within the editor. A novel visual feedback loop using word bubbles supports latency-prone streaming and enables a closed control loop, improving speed and perceived usability while reducing cognitive load. In two within-subject experiments, the gestural interface with Bubble feedback outperformed both line-based feedback and a no-feedback baseline, and significantly surpassed a ChatGPT-like chatbot UI in efficiency and user experience. The results show the feasibility and desirability of gesture-based, continuous interaction with LLMs on mobile devices and establish a foundation for future gesture-based AI writing tools.
Abstract
Interacting with Large Language Models (LLMs) for text editing on mobile devices currently requires users to break out of their writing environment and switch to a conversational AI interface. In this paper, we propose to control the LLM via touch gestures performed directly on the text. We first chart a design space that covers fundamental touch input and text transformations. In this space, we then concretely explore two control mappings: spread-to-generate and pinch-to-shorten, with visual feedback loops. We evaluate this concept in a user study (N=14) that compares three feedback designs: no visualisation, text length indicator, and length + word indicator. The results demonstrate that touch-based control of LLMs is both feasible and user-friendly, with the length + word indicator proving most effective for managing text generation. This work lays the foundation for further research into gesture-based interaction with LLMs on touch devices.
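To make the idea of a control mapping concrete, the following is a minimal sketch of how a spread or pinch gesture could be translated into a target text length. It assumes a simple linear mapping from the change in two-finger distance to a number of words. The function name `target_word_count` and the `px_per_word` constant are illustrative assumptions, not taken from the paper, whose actual mapping may differ.

```python
def target_word_count(start_words: int,
                      start_distance_px: float,
                      current_distance_px: float,
                      px_per_word: float = 20.0) -> int:
    """Map the change in two-finger distance to a target length in words.

    Hypothetical linear mapping: spreading the fingers (distance grows)
    raises the target, pinching (distance shrinks) lowers it.
    """
    if px_per_word <= 0:
        raise ValueError("px_per_word must be positive")
    delta_px = current_distance_px - start_distance_px
    # int() truncates toward zero, so small jitters change nothing.
    delta_words = int(delta_px / px_per_word)
    # The target can never drop below an empty text.
    return max(0, start_words + delta_words)


# Spread: fingers move 100 px apart, so the target grows by 5 words.
longer = target_word_count(50, 100.0, 200.0)
# Pinch: fingers move 100 px together, so the target shrinks by 5 words.
shorter = target_word_count(50, 200.0, 100.0)
```

In such a design, the target length would be recomputed on every touch-move event and passed to the LLM as a length constraint, while the visual indicator shows the user the currently requested length before the generated text arrives.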
