Illustrator's Depth: Monocular Layer Index Prediction for Image Decomposition
Nissim Maruani, Peiying Zhang, Siddhartha Chaudhuri, Matthew Fisher, Nanxuan Zhao, Vladimir G. Kim, Pierre Alliez, Mathieu Desbrun, Wang Yifan
TL;DR
Illustrator's Depth reframes depth as an editable, per-pixel layer index rather than a physical quantity, which makes it possible to decompose an image into layer-ordered vector graphics. A network is trained to predict a continuous per-pixel depth map $D(I)$ from a raster input $I$; ground-truth depth is obtained by rasterizing layered SVGs via a base-256 encoding, and training uses a scale-invariant MAE objective. Coupled with a dedicated vectorization pipeline, the method achieves state-of-the-art layer ordering and visual fidelity, and supports text-to-vector generation, depth-aware editing, 3D relief generation, and tactile graphics. The model generalizes across diverse inputs and datasets, and the work points toward future zero-shot inference and broader use in creative workflows.
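To make the two training ingredients above concrete, here is a minimal NumPy sketch. It is an illustration under stated assumptions, not the authors' implementation: the channel order of the base-256 encoding, the function names, and the median/mean-absolute-deviation normalization used for the scale-invariant MAE (a normalization commonly used in relative depth losses) are all hypothetical choices, and the paper's exact formulation may differ.

```python
import numpy as np

def encode_layer_index(index_map):
    """Pack integer layer indices into three 8-bit channels (base-256 digits).

    Assumption: the least significant digit goes into the first channel.
    """
    idx = np.asarray(index_map, dtype=np.uint32)
    r = idx % 256
    g = (idx // 256) % 256
    b = (idx // 65536) % 256
    return np.stack([r, g, b], axis=-1).astype(np.uint8)

def decode_layer_index(rgb):
    """Recover the integer layer index from its base-256 channel digits."""
    rgb = np.asarray(rgb, dtype=np.uint32)
    return rgb[..., 0] + 256 * rgb[..., 1] + 65536 * rgb[..., 2]

def scale_invariant_mae(pred, target, eps=1e-8):
    """Mean absolute error after removing each map's own shift and scale.

    Assumption: shift is the median, scale is the mean absolute deviation
    from the median. This makes the loss invariant to affine rescaling
    of either map.
    """
    def normalize(d):
        d = np.asarray(d, dtype=np.float64)
        shift = np.median(d)
        scale = np.mean(np.abs(d - shift)) + eps
        return (d - shift) / scale

    return float(np.mean(np.abs(normalize(pred) - normalize(target))))
```

With this normalization, a prediction that differs from the ground truth only by a positive scale and an offset incurs (numerically) zero loss, so the network is penalized for getting the relative ordering and spacing of layers wrong rather than for their absolute values.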
Abstract
We introduce Illustrator's Depth, a novel definition of depth that addresses a key challenge in digital content creation: decomposing flat images into editable, ordered layers. Inspired by an artist's compositional process, illustrator's depth assigns a layer index to each pixel, forming an interpretable image decomposition through a discrete, globally consistent ordering of elements optimized for editability. We also propose a neural network, trained on a curated dataset of layered vector graphics, that predicts layering directly from raster inputs. Our layer index inference unlocks a range of downstream applications. In particular, it significantly outperforms state-of-the-art baselines for image vectorization while also enabling high-fidelity text-to-vector-graphics generation, automatic 3D relief generation from 2D images, and intuitive depth-aware editing. By reframing depth from a physical quantity to a creative abstraction, illustrator's depth prediction offers a new foundation for editable image decomposition.
