Human Aesthetic Preference-Based Large Text-to-Image Model Personalization: Kandinsky Generation as an Example
Aven-Le Zhou, Yu-Ao Wang, Wei Wu, Kang Zhang
TL;DR
This work tackles the inefficiency and non-determinism of prompting large text-to-image models by introducing a prompting-free personalization pipeline that combines semantic injection with real-time, human-in-the-loop genetic prompt optimization. An artist model is built by semantically injecting Kandinsky/Bauhaus attributes via fast LoRA and DiffLoRA, then paired with a genetic algorithm that evolves prompts based on user feedback, so that users obtain personalized outputs without writing prompts themselves. The authors create a Kandinsky-focused dataset and report two experiments: the first establishes an artist-tuned diffusion setup, and the second lets users converge on a preferred prompting strategy within a handful of iterations. The approach aims to democratize access to personalized, stylistically consistent image generation, and the authors provide open-source data and code to support further research and reuse.
Abstract
With the advancement of neural generative capabilities, the art community has actively embraced generative artificial intelligence (GenAI) for creating painterly content. Large text-to-image models can quickly generate aesthetically pleasing outcomes. However, the process can be non-deterministic and often involves tedious trial and error, as users struggle to formulate effective prompts that achieve their desired results. This paper introduces a prompting-free generative approach that lets users automatically generate personalized painterly content that reflects their aesthetic preferences in a customized artistic style. The approach uses "semantic injection" to customize an artist model in a specific artistic style, and then applies a genetic algorithm to optimize prompt generation through real-time, iterative human feedback. By relying solely on the user's aesthetic evaluation of, and preference among, the images generated by the artist model, the approach gives the user a personalized model that combines their aesthetic preferences with the customized artistic style.
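The feedback loop described above can be illustrated with a minimal sketch of a human-in-the-loop genetic algorithm over prompts. This is not the authors' implementation: the token-list genome, the one-point crossover, the per-token mutation, the elitism, and all names and default values (`evolve`, `rate_fn`, `pop_size`, and so on) are assumptions made for illustration. The callable `rate_fn` is a hypothetical stand-in for the step in which the artist model renders an image from a prompt and the user scores it.

```python
import random
from typing import Callable, List, Sequence

Genome = List[str]  # assumption: a prompt is encoded as a list of descriptor tokens


def random_genome(vocab: Sequence[str], length: int, rng: random.Random) -> Genome:
    """Draw a random prompt of `length` tokens from the vocabulary."""
    return [rng.choice(vocab) for _ in range(length)]


def crossover(a: Genome, b: Genome, rng: random.Random) -> Genome:
    """One-point crossover; requires genomes of equal length >= 2."""
    cut = rng.randint(1, len(a) - 1)
    return a[:cut] + b[cut:]


def mutate(g: Genome, vocab: Sequence[str], rate: float, rng: random.Random) -> Genome:
    """Independently replace each token with a random one with probability `rate`."""
    return [rng.choice(vocab) if rng.random() < rate else tok for tok in g]


def evolve(
    vocab: Sequence[str],
    rate_fn: Callable[[Genome], float],
    pop_size: int = 8,
    length: int = 4,
    generations: int = 5,
    elite: int = 2,
    mutation_rate: float = 0.1,
    seed: int = 0,
) -> Genome:
    """Evolve prompts using only the scores returned by `rate_fn`.

    `rate_fn` is hypothetical: in an interactive setting it would generate an
    image from the prompt and return the user's rating for that image.
    """
    rng = random.Random(seed)
    population = [random_genome(vocab, length, rng) for _ in range(pop_size)]
    for _ in range(generations):
        # Rank the current population by the user's rating (highest first).
        ranked = sorted(population, key=rate_fn, reverse=True)
        parents = ranked[: max(2, pop_size // 2)]
        children = []
        for _ in range(pop_size - elite):
            p1, p2 = rng.sample(parents, 2)
            children.append(mutate(crossover(p1, p2, rng), vocab, mutation_rate, rng))
        # Elitism: the top-rated prompts survive unchanged into the next round.
        population = ranked[:elite] + children
    return max(population, key=rate_fn)
```

In a usage sketch, a scripted rating such as `lambda g: g.count("circle")` can replace the human rater to exercise the loop. Because the top-rated prompts are carried over unchanged, the best rating in the population cannot decrease from one generation to the next, provided the rater scores the same prompt consistently.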
