AffordMatcher: Affordance Learning in 3D Scenes from Visual Signifiers

Nghia Vu, Tuong Do, Khang Nguyen, Baoru Huang, Nhat Le, Binh Xuan Nguyen, Erman Tjiputra, Quang D. Tran, Ravi Prakash, Te-Chuan Chiu, Anh Nguyen

Abstract

Affordance learning is a complex challenge in many applications: existing approaches primarily focus on the geometric structures, visual knowledge, and affordance labels of objects to determine interactable regions. However, extending this learning capability to an entire scene is significantly more complicated, as incorporating object- and scene-level semantics is not straightforward. In this work, we introduce AffordBridge, a large-scale dataset with 291,637 functional interaction annotations across 685 high-resolution indoor scenes in the form of point clouds. Our affordance annotations are complemented by RGB images that are linked to the same instances within the scenes. Building upon our dataset, we propose AffordMatcher, an affordance learning method that establishes coherent semantic correspondences between image-based and point cloud-based instances for keypoint matching, enabling more precise identification of affordance regions based on visual cues, which we refer to as visual signifiers. Experimental results on our dataset demonstrate the effectiveness of our approach compared to other methods.

Paper Structure

This paper contains 19 sections, 11 equations, 12 figures, and 5 tables.

Figures (12)

  • Figure 1: Overview of AffordMatcher: Detecting and localizing affordances in 3D voxelized scenes through visual signifiers relies on semantic context drawn from RGB images. Given a scene representation and visual signifiers, AffordMatcher can understand actionable commands, such as "watch the television", "push the tip", "rotate pull", or "open the chimney", and identify the corresponding spatial affordances.
  • Figure 2: Construction of the AffordBridge dataset: Our AffordBridge dataset is built through a semi-supervised pipeline linking visual signifiers with 3D affordances. The building process includes (i) 3D scene processing via voxelized point clouds with object-view filtering through visual scanning, (ii) visual signifier processing with human-object interaction extraction and fine-grained captioning, and (iii) affordance annotation by matching key views to 3D instances for spatial action labeling.
  • Figure 3: Dataset statistics: Statistics of objects involved in human-object interactions that yield affordances in our AffordBridge dataset.
  • Figure 4: Design architecture of AffordMatcher: Given a high-resolution voxelized scene point cloud and a visual signifier, AffordMatcher reasons over these inputs for zero-shot affordance segmentation. The affordance extractor identifies 3D interactable regions, while the reasoning extractor encodes 2D human-object cues. Cross-modal alignment is achieved via instance matching through a dissimilarity matrix (a minimal sketch of this matching step follows the figure list). The features from the dissimilarity matrix are then refined through match-to-match attention, followed by zero-shot affordance optimization to localize actionable spatial regions that align with the given signifier.
  • Figure 5: Attention visualization: From the visual signifier in the RGB image and the text "Rest on Pillow", AffordMatcher focuses on the pillow area in the RGB image and correctly localizes the corresponding affordance regions in the high-resolution voxelized indoor scene.
  • ...and 7 more figures
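
Cross-modal instance matching (illustrative sketch)

The Figure 4 caption describes aligning 2D instances from the visual signifier with 3D instances from the scene point cloud through a dissimilarity matrix. The snippet below is a minimal sketch of that matching step only, not the authors' implementation: it builds a cosine dissimilarity matrix between pre-computed instance embeddings and pairs instances with a Hungarian assignment. The embedding sources, feature dimension, and all function and variable names (cosine_dissimilarity, match_instances, image_feats, point_feats) are assumptions for illustration, and the paper's match-to-match attention and zero-shot affordance optimization stages are not reproduced here.

```python
# Hypothetical sketch of dissimilarity-matrix instance matching between
# 2D (image) and 3D (point cloud) instance embeddings.
import numpy as np
from scipy.optimize import linear_sum_assignment


def cosine_dissimilarity(a: np.ndarray, b: np.ndarray) -> np.ndarray:
    """Pairwise cosine dissimilarity between rows of a (N, D) and b (M, D)."""
    a_norm = a / (np.linalg.norm(a, axis=1, keepdims=True) + 1e-8)
    b_norm = b / (np.linalg.norm(b, axis=1, keepdims=True) + 1e-8)
    return 1.0 - a_norm @ b_norm.T  # (N, M); 0 means identical direction


def match_instances(image_feats: np.ndarray, point_feats: np.ndarray):
    """Pair 2D (image) and 3D (point cloud) instance embeddings.

    Returns index pairs (i, j) minimizing total dissimilarity, plus the full
    dissimilarity matrix that a downstream refinement module could consume.
    """
    dissim = cosine_dissimilarity(image_feats, point_feats)
    rows, cols = linear_sum_assignment(dissim)  # Hungarian assignment
    return list(zip(rows.tolist(), cols.tolist())), dissim


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    image_feats = rng.normal(size=(4, 256))   # e.g. 4 detected 2D instances
    point_feats = rng.normal(size=(6, 256))   # e.g. 6 candidate 3D instances
    pairs, dissim = match_instances(image_feats, point_feats)
    print("matched (2D idx, 3D idx):", pairs)
    print("dissimilarity matrix shape:", dissim.shape)
```

In this sketch the assignment is one-shot; in the paper's pipeline the dissimilarity features are further refined by match-to-match attention before affordance regions are localized.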