An HCI-Centric Survey and Taxonomy of Human-Generative-AI Interactions

Jingyu Shi, Rahul Jain, Hyungjun Doh, Ryo Suzuki, Karthik Ramani

TL;DR

This paper surveys 291 studies to build an HCI-centric taxonomy of Human-GenAI Interactions. It defines six design dimensions (Purposes of Using GenAI, Feedback from Models to Users, Control from Users to Models, Levels of Engagement, Application Domains, and Evaluation Strategies) and describes a two-wave, PRISMA-like methodology for assembling and coding the corpus. The dual-perspective taxonomy (human and GenAI model viewpoints) is complemented by discussions of ethical, social, and fairness considerations and by a set of future opportunities, including foundation-model-driven interactions and novel input modalities. The work identifies key design gaps, notably the dominance of direct control interfaces, the scarcity of process-interpretability tools, and the under-exploration of ethics, and lays a foundation for more transparent, multimodal, and user-understandable GenAI systems across domains.

Abstract

Generative AI (GenAI) has shown remarkable capabilities in generating diverse and realistic content across formats such as images, videos, and text. Because human involvement is essential in Generative AI, the HCI literature has investigated how to create effective collaborations between humans and GenAI systems. However, the current literature lacks a comprehensive framework for understanding Human-GenAI Interactions, as the holistic aspects of human-centered GenAI systems are rarely analyzed systematically. In this paper, we present a survey of 291 papers, providing a novel taxonomy and analysis of Human-GenAI Interactions from both human and GenAI perspectives. The dimensions of the design space include 1) Purposes of Using Generative AI, 2) Feedback from Models to Users, 3) Control from Users to Models, 4) Levels of Engagement, 5) Application Domains, and 6) Evaluation Strategies. Our work is timely at the current development stage of GenAI, where Human-GenAI interaction design is of paramount importance. We also highlight challenges and opportunities to guide the design of GenAI systems and interactions toward future human-centered Generative AI applications.

Paper Structure (24 sections, 10 figures, 15 tables)

Figures (10)

  • Figure 1: Visual abstract of our survey and taxonomy of Human-GenAI Interaction. Our taxonomy summarizes five key dimensions, namely, Purposes of Using GenAI, Feedback from Models to Humans, Control from Humans to Models, Levels of Engagement, and Application Domains.
  • Figure 2: Examples of GenAI applications identified in our survey, covering the following research topics: A) Embodied interactions with GenAI noyman2020deepscope, B) Direct control from human to AI spape2021brain, C) Human interpretability kahng2018gan, D) GenAI enhancing skill wang2021soloist, E) Automated processes capps2021using, F) Human controllability dang2022ganslider, G) Natural language generation goodman2022lampost, H) Human-AI collaboration twomey2022three, I) Personalization and adaptation han2021designing, J) Conversational GenAI janssens2022cool
  • Figure 3: Flowchart for our paper selection process
  • Figure 4: Purposes of Using GenAI depict the users' intention of the interactions and the high-level capabilities of the applications, consisting of Refine Outcomes bau2020semantic, Explore Alternatives wan2023gancollage, Get Answers to Inquiries kim2020answering, Automate Processes truong2021automatic, Enhance Experiences shirazi2021supervised, Augment Sample Data chu2023wordgesture, and Understand park2023generative
  • Figure 5: Output Modalities consist of the following categories: textual (text chung2022talebrush, chat jo2023understanding, code kazemitabaar2023studying, font kadner2021adaptifont, and hand-writing aksan2018deepwriting), 2D visual (image bau2020semantic, sketch fan2019collabdraw, slide arakawa2023catalyst, video yoo2021virtual, spatial AR kimura2018extvision, and visualizations of data han2021designing), layout (game layout volz2018evolving, web layout kim2022stylette, graphic layout guo2021vinci, and floor plan he2022iplan), numerical data (robot control sequence huang2022inner), audio (music suh2021ai, sound effect chang2018perceptual, and voice janssens2022cool), and 3D graphics (3D model liu20233dall, 3D motion xu2021gan, and XR scene nakano2019enchanting)
  • ...and 5 more figures