ComfyGI: Automatic Improvement of Image Generation Workflows

Dominik Sobania, Martin Briesch, Franz Rothlauf

TL;DR

ComfyGI is a novel approach that uses techniques from genetic improvement to automatically improve workflows for image generation without the need for human intervention. It enables image generation with significantly higher quality in terms of alignment with the given description and perceived aesthetics.

Abstract

Automatic image generation is no longer of interest only to researchers; practitioners rely on it as well. However, current models are sensitive to the settings used, and automatic optimization methods often require human involvement. To bridge this gap, we introduce ComfyGI, a novel approach that uses techniques from genetic improvement to automatically improve workflows for image generation without the need for human intervention. This enables image generation with significantly higher quality in terms of alignment with the given description and perceived aesthetics. On the performance side, we find that, overall, the images generated with an optimized workflow are about 50% better than those from the initial workflow in terms of the median ImageReward score. The human evaluation surpasses even these results: participants preferred the images improved by ComfyGI in around 90% of cases.

Paper Structure

This paper contains 22 sections, 23 figures, 4 tables.

Figures (23)

  • Figure 1: An example ComfyUI text-to-image workflow. The shown workflow's settings were optimized with ComfyGI and the initial prompt was "storefront with 'diffusion' written on it".
  • Figure 2: An illustration of ComfyGI's hill climbing method for improving workflows for text-to-image generation.
  • Figure 3: An example of image improvement with ComfyGI over several generations for the prompt "storefront with 'diffusion' written on it". For every generation, we show the image and the score of the best patch found so far.
  • Figure 4: Three examples of image improvement with ComfyGI. In each pair, the left image is the initial image and the right image is its optimized counterpart.
  • Figure 5: Scores for the initial and optimized images.
  • ...and 18 more figures
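
The captions of Figures 2 and 3 describe a hill climbing search that keeps the best patch found so far across generations. The following Python sketch illustrates that general idea only; it is not the authors' implementation. The workflow is reduced to a small dictionary, and `toy_mutate`, `toy_score`, and the parameter names `steps` and `cfg` are hypothetical stand-ins chosen for illustration. In particular, `toy_score` replaces a real image-quality metric such as ImageReward with a simple numeric function, so no images are generated here.

```python
import copy
import random


def hill_climb(workflow, mutate, score, generations=10, neighbors=5, rng=None):
    """Generic hill climbing: keep the best-scoring workflow found so far.

    `mutate` and `score` are supplied by the caller; nothing here is
    specific to ComfyUI or ComfyGI.
    """
    if rng is None:
        rng = random.Random(0)
    best = copy.deepcopy(workflow)  # never modify the caller's workflow
    best_score = score(best)
    history = [best_score]  # best score after each generation
    for _ in range(generations):
        # Create several patched variants of the current best workflow.
        candidates = [mutate(best, rng) for _ in range(neighbors)]
        top = max(candidates, key=score)
        top_score = score(top)
        # Accept the variant only if it strictly improves the score.
        if top_score > best_score:
            best, best_score = top, top_score
        history.append(best_score)
    return best, best_score, history


def toy_mutate(workflow, rng):
    """Hypothetical mutation: nudge one numeric setting up or down."""
    patched = dict(workflow)
    if rng.choice(["steps", "cfg"]) == "steps":
        patched["steps"] = max(1, patched["steps"] + rng.choice([-5, 5]))
    else:
        patched["cfg"] = max(1.0, patched["cfg"] + rng.choice([-0.5, 0.5]))
    return patched


def toy_score(workflow):
    """Hypothetical stand-in for an image-quality score (higher is better)."""
    return -abs(workflow["steps"] - 30) - abs(workflow["cfg"] - 7.0)


if __name__ == "__main__":
    start = {"steps": 10, "cfg": 3.0}
    best, best_score, history = hill_climb(
        start, toy_mutate, toy_score, generations=20, neighbors=4
    )
    print("best workflow:", best)
    print("best score per generation:", history)
```

Because a candidate is accepted only when it strictly improves the score, the recorded history never decreases, which mirrors the "best found patch so far" progression shown in Figure 3. In a real setting, evaluating `score` would mean running the workflow and rating the resulting image, which is far more expensive than this toy function.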