Using Text-to-Image Generation for Architectural Design Ideation

Ville Paananen, Jonas Oppenlaender, Aku Visuri

TL;DR

The paper investigates whether text-to-image generation can enhance creativity during the fuzzy front end of architectural design. It reports an empirical lab study with 17 architecture students using Midjourney, Stable Diffusion, and DALL-E, employing the Creativity Support Index and semi-structured interviews. Findings indicate that image generation can meaningfully support idea discovery and imaginative thinking when design constraints are explicit, though challenges remain in floorplan fidelity, material/facade rendering, and managing randomness; the study also yields practical recommendations for tool design and design education. These results offer actionable guidance for developers and educators aiming to harness generative AI to foster innovation, communication, and reflective practice in architectural design.

Abstract

The recent progress of text-to-image generation has been recognized in architectural design. Our study is the first to investigate the potential of text-to-image generators in supporting creativity during the early stages of the architectural design process. We conducted a laboratory study with 17 architecture students, who developed a concept for a culture center using three popular text-to-image generators: Midjourney, Stable Diffusion, and DALL-E. Through standardized questionnaires and group interviews, we found that image generation could be a meaningful part of the design process when design constraints are carefully considered. Generative tools support serendipitous discovery of ideas and an imaginative mindset, enriching the design process. We identified several challenges of image generators and provided considerations for software development and educators to support creativity and emphasize designers' imaginative mindset. By understanding the limitations and potential of text-to-image generators, architects and designers can leverage this technology in their design process and education, facilitating innovation and effective communication of concepts.

Paper Structure (28 sections, 4 figures, 2 tables)

Figures (4)

  • Figure 1: Floorplans, interior views, and facade materials by the three participants whose works were voted best in their respective sessions S1–S3.
  • Figure 2: Selection of images highlighting how the participants used image generators in unexpected ways. In (a), P1 focused on representing a floorplan in an abstract style with a prompt "watercolour plan view thick black walls." In (b), P7 used a more ornate style for the facade. In (c), P9 wanted to create a "honeycomb"-style material, which produced an actual honeycomb. In (d), P13 employed the design task's site by including factory chimneys, effectively suggesting location. In (e), P3 experimented with using strawberry as a facade material, and in (f), P13 could not generate usable floorplans so they went for a more experimental approach.
  • Figure 3: The length of participants' prompt sequences demonstrates the commitment of participants to a train of thought during the ideation session. Most ideas would spawn at least a few (<4) prompts before the participant moved to a new prompt sequence.
  • Figure 4: Most frequently used tokens in participant-written prompts. The plot on the left depicts the 25 most frequent tokens with stop words removed. The plot on the right depicts the 25 most frequent n-grams.
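
The token and n-gram counts described for Figure 4 could be computed along the following lines. This is a minimal sketch, not the authors' analysis code: the stop-word list, the tokenizer, the choice of bigrams (n = 2), and the decision to keep stop words inside n-grams are all assumptions, since the caption does not specify them. The first example prompt is quoted from Figure 2; the other two are made up for illustration.

```python
import re
from collections import Counter

# Hypothetical stop-word list; the paper does not say which list was used.
STOP_WORDS = {"a", "an", "the", "of", "in", "with", "and", "for", "to", "on"}


def tokenize(prompt):
    """Lowercase a prompt and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", prompt.lower())


def top_tokens(prompts, k=25):
    """Return the k most frequent tokens across prompts, stop words removed."""
    counts = Counter(
        tok for p in prompts for tok in tokenize(p) if tok not in STOP_WORDS
    )
    return counts.most_common(k)


def top_ngrams(prompts, n=2, k=25):
    """Return the k most frequent n-grams; n-grams never span two prompts."""
    counts = Counter()
    for p in prompts:
        toks = tokenize(p)
        counts.update(" ".join(toks[i:i + n]) for i in range(len(toks) - n + 1))
    return counts.most_common(k)


prompts = [
    "watercolour plan view thick black walls",        # quoted in Figure 2 (P1)
    "plan view of a culture center",                  # illustrative only
    "culture center facade with honeycomb material",  # illustrative only
]

for token, count in top_tokens(prompts, k=5):
    print(token, count)
for ngram, count in top_ngrams(prompts, n=2, k=5):
    print(ngram, count)
```

Counting n-grams per prompt, rather than over one concatenated string, avoids spurious n-grams that straddle the boundary between two unrelated prompts.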