Automatic Generation of Fashion Images using Prompting in Generative Machine Learning Models

Georgia Argyrou, Angeliki Dimitriou, Maria Lymperaiou, Giorgos Filandrianos, Giorgos Stamou

TL;DR

Emphasizing adaptability in AI-driven fashion creativity, this work departs from traditional approaches and focuses on prompting techniques, such as zero-shot and few-shot learning and Chain-of-Thought (CoT), which yield a variety of colors and textures and enhance the diversity of the outputs.

Abstract

The advent of artificial intelligence has contributed to a groundbreaking transformation of the fashion industry, redefining creativity and innovation in unprecedented ways. This work investigates methodologies for generating tailored fashion descriptions using two distinct Large Language Models, and for creating fashion images from those descriptions with a Stable Diffusion model. Emphasizing adaptability in AI-driven fashion creativity, we depart from traditional approaches and focus on prompting techniques, such as zero-shot and few-shot learning and Chain-of-Thought (CoT), which yield a variety of colors and textures and enhance the diversity of the outputs. Central to our methodology is Retrieval-Augmented Generation (RAG), which enriches the models with insights from fashion sources to ensure contemporary representations. Evaluation combines quantitative metrics such as CLIPScore with qualitative human judgment, highlighting strengths in creativity, coherence, and aesthetic appeal across diverse styles. Survey participants preferred RAG and few-shot learning for their ability to produce more relevant and appealing fashion descriptions. Our code is provided at https://github.com/georgiarg/AutoFashion.
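The prompting strategies named in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation (see their repository for that): the task wording, the function names, and the word-overlap retriever are all assumptions made for illustration, and a real RAG setup would typically use embedding-based retrieval over the fashion sources. Each function only builds a prompt string; the LLM call and the Stable Diffusion step are left out.

```python
from typing import List, Tuple

# Hypothetical task wording; the paper's actual prompts may differ.
TASK = "Describe one outfit for a {wearer} attending a {occasion} in {style} style."


def zero_shot(style: str, occasion: str, wearer: str) -> str:
    """Zero-shot: the bare task, with no examples or extra context."""
    return TASK.format(style=style, occasion=occasion, wearer=wearer)


def few_shot(style: str, occasion: str, wearer: str,
             examples: List[Tuple[str, str]]) -> str:
    """Few-shot: prepend (request, outfit) example pairs to the task."""
    shots = "\n\n".join(f"Request: {q}\nOutfit: {a}" for q, a in examples)
    task = zero_shot(style, occasion, wearer)
    return f"{shots}\n\nRequest: {task}\nOutfit:"


def chain_of_thought(style: str, occasion: str, wearer: str) -> str:
    """CoT: ask the model to reason through the constraints before answering."""
    task = zero_shot(style, occasion, wearer)
    return (f"{task}\nThink step by step: consider the style, then the occasion, "
            "then the wearer, and only then give the final outfit description.")


def retrieve(query: str, passages: List[str], k: int = 2) -> List[str]:
    """Toy retriever (assumption): rank passages by shared lowercase words."""
    q = set(query.lower().split())
    ranked = sorted(passages,
                    key=lambda p: len(q & set(p.lower().split())),
                    reverse=True)
    return ranked[:k]


def rag(style: str, occasion: str, wearer: str,
        passages: List[str], k: int = 2) -> str:
    """RAG: put the top-k retrieved passages in front of the task."""
    task = zero_shot(style, occasion, wearer)
    context = "\n".join(f"- {p}" for p in retrieve(task, passages, k))
    return f"Fashion context:\n{context}\n\n{task}"
```

In a pipeline like the one the abstract describes, each of these prompt strings would be sent to an LLM, and the returned outfit description would then serve as the text prompt for the image model.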

Paper Structure (14 sections, 8 figures, 2 tables)

This paper contains 14 sections, 8 figures, and 2 tables.

Figures (8)

  • Figure 1: Fashion image generation pipelines using different prompting techniques and RAG to produce fashion outfit descriptions fed to a Stable Diffusion module.
  • Figure 2: Demographic profiles of survey participants.
  • Figure 3: Evaluation comparison of fashion images generated by descriptions of different techniques including ZS, FS, CoT, RAG with PDFs, and RAG with BLOGs across various criteria. Criteria considered are Style Alignment, Occasion Suitability, Wearer's Type Suitability, Creativity, Aesthetic Appeal, and Cohesion, with average performance indicated by the blue line.
  • Figure 4: Analysis of abnormalities in images and their influence on fashion designers' inspiration.
  • Figure 5: Evaluation comparison of descriptions generated by different techniques including ZS, FS, CoT, RAG with PDFs, and RAG with BLOGs across various criteria.
  • ...and 3 more figures