Illustrious: an Open Advanced Illustration Model

Sang Hyun Park; Jun Young Koh; Junha Lee; Joy Song; Dongha Kim; Hoyeon Moon; Hyunju Lee; Min Song

TL;DR

This work examines how batch size and dropout control speed up the learning of controllable, token-based concept activations, and proposes refined multi-level captions, which cover all tags together with varied natural-language captions, as a critical factor in model development.

Abstract

In this work, we share insights for achieving state-of-the-art quality in Illustrious, our text-to-image generative model for anime images. To achieve high resolution, a dynamic color range, and high restoration ability, we focus on three approaches to model improvement. First, we examine the significance of batch size and dropout control, which enable faster learning of controllable, token-based concept activations. Second, we increase the training image resolution, which improves the accurate depiction of character anatomy at much higher resolutions and, with proper methods, extends generation capability beyond 20MP. Finally, we propose refined multi-level captions, covering all tags and various natural-language captions, as a critical factor for model development. Through extensive analysis and experiments, Illustrious demonstrates state-of-the-art performance in animation style and outperforms widely used models in the illustration domain, while its open-source nature makes customization and personalization easier. We plan to release updated models in the Illustrious series sequentially, along with sustainable plans for improvement.

Paper Structure (42 sections, 4 equations, 25 figures, 5 tables)

Introduction
Preliminary
SDXL
Illustration / Animation Domain
Next-generation Text-to-Image Generative Models
Features of next-generation models
Text Encoder
Data Ethics
Methodology
Dataset
Dataset Bias
Data Preprocessing
Resolution
Limited Corpus
Training Method
...and 27 more sections

Figures (25)

Figure 1: High-quality samples from Illustrious. Our model exhibits vibrant color and contrast on a range of image styles.
Figure 2: Model Comparison Images
Figure 3: Comparison of gender distribution and example generations from the model showing bias and weak understanding of gender-specific terms. The used prompt was "1boy, doctor, masterpiece, looking at viewer".
Figure 4: A minimal data pruning strategy allows generation of varied concepts, including extremely rare MS Paint-like concepts, without harming overall generation quality.
Figure 5: Character similarity ELO rating results (time-weighted average applied) and free-for-all ELO.
...and 20 more figures
