Minecraft-ify: Minecraft Style Image Generation with Text-guided Image Editing for In-Game Application

Bumsoo Kim, Sanghyun Byun, Yonghoon Jung, Wonseop Shin, Sareer UI Amin, Sanghyun Seo

TL;DR

This work addresses Minecraft-style texture generation for in-game assets with editable, text-guided control. It adapts StyleGAN and StyleCLIP to produce and manipulate $8\times 8$ textures for cube-mesh characters, supporting both inversion of real input images and sampling from a learned distribution through a fine-tuned generator $\tilde{G}$. The method combines a latent-space inversion objective with a statistics loss $L_{stat}$ and CLIP-guided latent editing, so that text prompts produce semantically meaningful edits; generation can start from the average latent $\bar{w}$ or a random latent $w_{random}$. The results show semantically plausible texture edits and flexible texture generation, which supports user-friendly asset creation. The work also acknowledges CLIP data bias and dataset provenance considerations.
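To make the roles of $\bar{w}$, $w_{random}$, and inversion concrete, the following is a minimal sketch under loud assumptions. The `mapping` and `generator` functions are hypothetical toy stand-ins (a tanh layer and a linear map), not StyleGAN or the fine-tuned $\tilde{G}$. The inversion objective here is a plain sum of squared pixel errors; the statistics loss $L_{stat}$ and the CLIP-guided editing term are omitted because this summary does not give their form.

```python
import numpy as np

rng = np.random.default_rng(0)
Z_DIM, W_DIM, TEX = 16, 16, 8  # toy sizes; TEX mirrors the 8x8 texture

# Hypothetical weights standing in for a mapping network and a generator.
A = rng.normal(size=(W_DIM, Z_DIM)) / np.sqrt(Z_DIM)
B = rng.normal(size=(TEX * TEX * 3, W_DIM)) / np.sqrt(W_DIM)


def mapping(z):
    """Toy mapping network: noise z -> latent w."""
    return np.tanh(z @ A.T)


def generator(w):
    """Toy linear generator: latent w -> 8x8 RGB texture."""
    return (w @ B.T).reshape(*w.shape[:-1], TEX, TEX, 3)


# Average latent: mean of mapped samples. Random latent: one mapped sample.
w_bar = mapping(rng.normal(size=(10_000, Z_DIM))).mean(axis=0)
w_random = mapping(rng.normal(size=Z_DIM))


def invert(target, w_init, steps=300, lr=0.02):
    """Gradient descent on w, minimizing the sum of squared pixel errors."""
    w = w_init.copy()
    t = target.reshape(-1)
    for _ in range(steps):
        residual = B @ w - t           # flattened G(w) - target
        w -= lr * 2.0 * B.T @ residual  # gradient of the squared error
    return w


# Usage: recover a latent for a known texture, starting from the average.
target = generator(w_random)
w_hat = invert(target, w_init=w_bar)
print("max pixel error:", np.abs(generator(w_hat) - target).max())
```

Starting the optimization from the average latent is a common choice in GAN inversion because it sits near the center of the learned distribution; whether the paper initializes this way is not stated here.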

Abstract

In this paper, we present \textit{Minecraft-ify}, a character texture generation system designed for the Minecraft video game and intended for in-game application. It generates face-focused images for texture mapping, tailored to 3D virtual characters with a cube manifold. Whereas existing projects and works only generate textures, the proposed system can invert a user-provided real image or generate an average or random appearance from the learned distribution. Moreover, the result can be manipulated with text guidance using StyleGAN and StyleCLIP. These features offer an extended user experience with greater freedom as a user-friendly AI tool. The project page can be found at https://gh-bumsookim.github.io/Minecraft-ify/

Paper Structure

This paper contains 10 sections, 3 equations, and 6 figures.

Figures (6)

  • Figure 1: Rendered 3D character in Minecraft-World using our generated frontal character texture.
  • Figure 2: Overview of our Minecraft-ify system.
  • Figure 3: Dataset refinement result.
  • Figure 4: Additional results with famous animation characters.
  • Figure 5: Random generated texture from learned distribution.
  • ...and 1 more figure