What's Producible May Not Be Reachable: Measuring the Steerability of Generative Models
Keyon Vafa, Sarah Bentley, Jon Kleinberg, Sendhil Mullainathan
TL;DR
This work introduces a framework that separates steerability from producibility in generative models, along with a benchmark that measures steerability in isolation by giving users a goal sampled from the model itself. Large-scale human studies on text-to-image models and LLMs show that steerability is poor across the board: a model's ability to produce high-quality outputs does not guarantee that users can steer it toward a specific goal. The authors develop a Blinder-like decomposition to disentangle improvements due to steering mechanisms from those due to the set of producible outputs, and find that both contribute substantially. They also demonstrate that steerability can be meaningfully improved with a simple image-based steering approach and an RL-learned steering distribution, achieving over 2× gains in steerability relative to text prompting. The findings argue for prioritizing steerability in both model design and interaction interfaces, so that users can reliably reach the outputs they have in mind.
Abstract
How should we evaluate the quality of generative models? Many existing metrics focus on a model's producibility, i.e., the quality and breadth of outputs it can generate. However, the actual value from using a generative model stems not just from what it can produce but from whether a user with a specific goal can produce an output that satisfies that goal. We refer to this property as steerability. In this paper, we first introduce a mathematical decomposition for quantifying steerability independently from producibility. Steerability is more challenging to evaluate than producibility because it requires knowing a user's goals. We address this issue by creating a benchmark task that relies on one key idea: sample an output from a generative model and ask users to reproduce it. We implement this benchmark in user studies of text-to-image and large language models. Despite the ability of these models to produce high-quality outputs, they all perform poorly on steerability. These results suggest that we need to focus on improving the steerability of generative models. We show such improvements are indeed possible: simple image-based steering mechanisms achieve more than 2x improvement on this benchmark.
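
The benchmark idea (sample a goal from the model, then ask a user to reproduce it) can be illustrated with a toy simulation. The sketch below is not the authors' implementation: `toy_model`, `similarity`, `run_trial`, and the two simulated users are hypothetical stand-ins chosen for illustration, and the similarity score is an assumed metric. In the paper's studies the models are text-to-image models and LLMs, and the steering is done by human participants.

```python
import math
import random
from typing import Callable, List, Sequence, Tuple

Vector = List[float]
History = List[Tuple[Vector, Vector, float]]  # (prompt, output, score) per attempt
SteerFn = Callable[[Vector, History, random.Random], Vector]


def toy_model(prompt: Sequence[float]) -> Vector:
    """Hypothetical stand-in for a generative model: prompt in, output out."""
    return [p * p for p in prompt]


def similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Assumed similarity score in (0, 1]; 1.0 means the outputs are identical."""
    dist = math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
    return 1.0 / (1.0 + dist)


def run_trial(steer: SteerFn, rng: random.Random,
              dim: int = 3, attempts: int = 5) -> float:
    """Run one benchmark trial and return the best similarity reached."""
    # Step 1: sample the goal from the model itself, so it is producible
    # by construction. The prompt that generated it stays hidden.
    hidden_prompt = [rng.uniform(-1.0, 1.0) for _ in range(dim)]
    target = toy_model(hidden_prompt)

    # Step 2: the user sees only the target and tries to reproduce it.
    history: History = []
    best = 0.0
    for _ in range(attempts):
        prompt = steer(target, history, rng)
        output = toy_model(prompt)
        score = similarity(output, target)
        history.append((prompt, output, score))
        best = max(best, score)
    return best


def blind_user(target: Vector, history: History, rng: random.Random) -> Vector:
    """Simulated user who guesses prompts at random (placeholder for a human)."""
    return [rng.uniform(-1.0, 1.0) for _ in target]


def oracle_user(target: Vector, history: History, rng: random.Random) -> Vector:
    """Simulated user who knows how to invert the toy model (an upper bound)."""
    return [math.sqrt(t) for t in target]


if __name__ == "__main__":
    rng = random.Random(0)
    trials = 200
    for name, user in [("blind", blind_user), ("oracle", oracle_user)]:
        mean = sum(run_trial(user, rng) for _ in range(trials)) / trials
        print(f"{name}: mean best similarity = {mean:.3f}")
```

Because every target is sampled from the model, any gap between a user's score and the oracle's reflects difficulty in steering rather than a limit on what the model can produce, which is the separation the benchmark is designed to expose.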
