An Empirical Analysis of GPT-4V's Performance on Fashion Aesthetic Evaluation

Yuki Hirakawa, Takashi Wada, Kazuya Morishita, Ryotaro Shimizu, Takuya Furusawa, Sai Htaung Kham, Yuki Saito

TL;DR

This work examines the zero-shot performance of GPT-4V on fashion aesthetic evaluation. Its predictions align fairly well with human judgments on the authors' datasets, but it struggles to rank outfits in similar colors.

Abstract

Fashion aesthetic evaluation is the task of estimating how well the outfits worn by individuals in images suit them. In this work, we examine the zero-shot performance of GPT-4V on this task for the first time. We show that its predictions align fairly well with human judgments on our datasets, and also find that it struggles with ranking outfits in similar colors. The code is available at https://github.com/st-tech/gpt4v-fashion-aesthetic-evaluation.

Paper Structure

This paper contains 12 sections, 4 figures, and 2 tables.

Figures (4)

  • Figure 1: The prompt used in our experiments.
  • Figure 2: sbj-1
  • Figure 3: sbj-2
  • Figure 4: sbj-3-b