FlexMind: Supporting Deeper Creative Thinking with LLMs

Yaqing Yang, Vikram Mohanty, Yan-Ying Chen, Matthew K. Hong, Nikolas Martelaro, Aniket Kittur

TL;DR

FlexMind tackles the dual challenge of divergent ideation and convergent evaluation by coupling a schema-based breadth view with branchable idea trees that support a trade-off–mitigation loop. The system maintains human agency while leveraging AI to surface diverse ideas, critique risks, and generate targeted mitigations, organized on a visual canvas for easy navigation across multiple threads. In controlled and expert studies, FlexMind yields higher-quality ideas and deeper, more reflective exploration than a ChatGPT baseline, with longer idea chains positively correlating with quality. The findings suggest that structured, human-centered AI ideation tools can expand the design space explored, enhance evaluation-driven cognition, and improve practical creative outcomes in early-stage design.

Abstract

Effective ideation requires both broad exploration of diverse ideas and deep evaluation of their potential. Generative AI can support such processes, but current tools typically emphasize either generating many ideas or supporting in-depth consideration of a few, lacking support for both. Research also highlights risks of over-reliance on LLMs, including shallow exploration and negative creative outcomes. We present FlexMind, an AI-augmented system that scaffolds iterative exploration of ideas, tradeoffs, and mitigations. FlexMind exposes users to a broad set of ideas while enabling a lightweight transition into deeper engagement. In a study comparing ideation with FlexMind to ChatGPT, participants generated higher-quality ideas with FlexMind, due to both broader exposure and deeper engagement with tradeoffs. By scaffolding ideation across breadth, depth, and reflective evaluation, FlexMind empowers users to surface ideas that might otherwise go unnoticed or be prematurely discarded.

Paper Structure

This paper contains 66 sections, 12 figures, 11 tables.

Figures (12)

  • Figure 1: FlexMind: Preview and Canvas Pages. The preview page (left) presents the design brief (a), allows users to (b) add their own ideas, and (c) browse system-generated ideas organized by schemas (categories). Users can (d) select specific ideas to transfer into the workspace. The canvas page (right) displays selected ideas as linked cards in branchable idea trees on the canvas, while the sidebar (e) organizes categories, saved ideas, and search to support breadth-oriented exploration. Green cards denote solutions and red cards denote trade-offs.
  • Figure 2: Trade-off $\rightarrow$ Mitigation Exploration in FlexMind. Users begin a trade-off analysis for a given solution card (e.g., “Lemon Spray”) by clicking the (a) Trade-off button to reveal potential limitations (red cards). To address a trade-off, they can click the (c) Solution button, which generates targeted mitigation ideas (green cards). Users may also (b) contribute their own solutions or (d) add new trade-offs. The system maintains context of the design task while supporting iterative exploration of trade-offs and mitigations as branching idea trees.
  • Figure 3: Schema-based Exploration and Q&A in FlexMind. Users can surface high-level schemas to find similar ideas: selecting the (a) Similar button displays high-level related categories (b), which expand into alternative solutions. To probe further, users can (e) input free-form queries through the Q&A feature; with context from the originating card, the system returns targeted answers (f). Together, schema-level abstraction and card-level Q&A help users broaden exploration while obtaining instant, context-sensitive information.
  • Figure 4: Annotation example in the baseline condition. In the baseline (ChatGPT-only) condition, user prompts are manually reconstructed into trees. Green cards represent solutions, red cards represent trade-offs, and yellow cards represent other knowledge. Nodes (a) and (b) show two trees initiated from user prompts for solutions. Node (c) shows a user asking ChatGPT to analyze trade-offs, while node (d) shows a request for similar solutions. Node (e) illustrates a user asking follow-up questions on prior information. Nodes (f) and (g) show abstract visualizations of the same trees, indicating how nodes and branches are counted.
  • Figure 5: Expert ratings of idea quality. Average ratings across novelty, feasibility, and value for ideas in the baseline condition ($n=140$) and FlexMind condition ($n=130$). FlexMind ideas received significantly higher scores overall, with improvements in novelty and value ($p<.001$) and feasibility ($p<.05$). Bars show means; error bars show standard errors. Ideas judged by experts as too vague to evaluate ($n=9$ baseline, $n=2$ FlexMind) and the ideas used for initial rubric discussion ($n=22$ total; 11 from each condition) are excluded.
  • ...and 7 more figures
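
The branchable idea trees described in Figures 2 and 4 can be pictured with a small data structure. The following is a minimal sketch, not the authors' implementation: the `Card` class, the `count_nodes` and `longest_chain` helpers, and every card text other than "Lemon Spray" are hypothetical names introduced for illustration. The paper's exact rules for counting nodes and branches are not given in this excerpt, so the sketch assumes that a node is any card and that chain length is the number of cards on the longest root-to-leaf path.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import List


@dataclass
class Card:
    """One card on the canvas (hypothetical model, not FlexMind's own code).

    kind is "solution" (green card) or "tradeoff" (red card).
    """
    text: str
    kind: str
    children: List["Card"] = field(default_factory=list)

    def add(self, text: str, kind: str) -> "Card":
        """Attach a child card and return it so chains can be extended."""
        child = Card(text, kind)
        self.children.append(child)
        return child


def count_nodes(card: Card) -> int:
    """Total number of cards in the tree rooted at `card`."""
    return 1 + sum(count_nodes(c) for c in card.children)


def longest_chain(card: Card) -> int:
    """Number of cards on the longest root-to-leaf path (assumed depth metric)."""
    if not card.children:
        return 1
    return 1 + max(longest_chain(c) for c in card.children)


# Solution -> trade-off -> mitigation, plus a second trade-off branch.
# Only "Lemon Spray" comes from the figure caption; the rest is made up.
root = Card("Lemon Spray", "solution")
fades = root.add("Scent fades quickly", "tradeoff")
fades.add("Schedule periodic re-application", "solution")
root.add("May damage some surfaces", "tradeoff")

total_cards = count_nodes(root)
chain_length = longest_chain(root)
print(total_cards, chain_length)
```

Under these assumptions, a tree grows deeper each time a trade-off is answered with a mitigation, which is the kind of chain length the TL;DR relates to idea quality.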