Behavior-Aware Anthropometric Scene Generation for Human-Usable 3D Layouts

Semin Jin, Donghyuk Kim, Jeongmin Ryu, Kyung Hoon Hyun

TL;DR

We present a Behavior-Aware Anthropometric Scene Generation framework that bridges VLM-based interaction reasoning with anthropometric constraints, validated through both technical metrics and real-scale human usability studies.

Abstract

Well-designed indoor scenes should prioritize how people can act within a space rather than merely what objects to place. However, existing 3D scene generation methods emphasize visual and semantic plausibility, while insufficiently addressing whether people can comfortably walk, sit, or manipulate objects. To bridge this gap, we present a Behavior-Aware Anthropometric Scene Generation framework. Our approach leverages vision-language models (VLMs) to analyze object-behavior relationships, translating spatial requirements into parametric layout constraints adapted to user-specific anthropometric data. We conducted comparative studies with state-of-the-art models using geometric metrics and a user perception study (N=16). We further conducted in-depth human-scale studies (individuals, N=20; groups, N=18). The results showed improvements in task completion time, trajectory efficiency, and human-object manipulation space. This study contributes a framework that bridges VLM-based interaction reasoning with anthropometric constraints, validated through both technical metrics and real-scale human usability studies.

Paper Structure (69 sections, 4 equations, 12 figures, 3 tables)

Figures (12)

  • Figure 1: Overview of the Behavior-Aware Anthropometric Scene Generation. The framework proceeds in two phases: Semantic and Behavioral Representation (Stages A–E) constructs spatial relations, and Anthropometric Constraint-based Layout Generation (Stages F–G) optimizes the final layout using anthropometric constraints.
  • Figure 2: Semantic and Behavioral Representation (Stages B–D). For every 3D asset used in the scene, [B] preprocesses metadata and multi-view renderings, [C] infers a functional description, and [D] extracts a human–object interaction pattern.
  • Figure 3: [E] Semantic Grouping and [F] Anthropometric-based Constraint Inference.
  • Figure 4: Interface for the user perception study. Red markers indicate floor-plan coordinates in meters, and arrows denote furniture orientations. Participants rated each criterion on a 7-point Likert scale.
  • Figure 5: Experimental Procedure Flowcharts for Individual and Group Sessions.
  • ...and 7 more figures