Illustrious: an Open Advanced Illustration Model

Sang Hyun Park, Jun Young Koh, Junha Lee, Joy Song, Dongha Kim, Hoyeon Moon, Hyunju Lee, Min Song

TL;DR

This work examines the role of batch size and dropout control, which enable faster learning of controllable token-based concept activations, and proposes refined multi-level captions, covering all tags and varied natural-language captions, as a critical factor in model development.

Abstract

In this work, we share insights for achieving state-of-the-art quality in our text-to-image anime image generation model, Illustrious. To achieve high resolution, a dynamic color range, and high restoration ability, we focus on three approaches to model improvement. First, we examine the significance of batch size and dropout control, which enable faster learning of controllable token-based concept activations. Second, we increase the training image resolution, which affects how accurately character anatomy is depicted at much higher resolution and extends generation capability beyond 20MP with proper methods. Finally, we propose refined multi-level captions, covering all tags and varied natural-language captions, as a critical factor in model development. Through extensive analysis and experiments, Illustrious demonstrates state-of-the-art performance in animation style, outperforming widely used models in illustration domains and, owing to its open-source nature, enabling easier customization and personalization. We plan to release updated Illustrious models publicly in sequence, along with sustainable plans for improvement.

Paper Structure

This paper contains 42 sections, 4 equations, 25 figures, 5 tables.

Figures (25)

  • Figure 1: High-quality samples from Illustrious. Our model exhibits vibrant color and contrast on a range of image styles.
  • Figure 2: Model Comparison Images
  • Figure 3: Comparison of gender distribution and example generations from the model, showing bias and a weak understanding of gender-specific terms. The prompt used was "1boy, doctor, masterpiece, looking at viewer".
  • Figure 4: A minimal data pruning strategy allows generation of diverse concepts, including extremely rare MS Paint-like concepts, without harming overall generation quality.
  • Figure 5: Character similarity Elo ratings (time-weighted average applied) and free-for-all Elo ratings.
  • ...and 20 more figures