CSF-Net: Context-Semantic Fusion Network for Large Mask Inpainting

Chae-Yeon Heo; Yeong-Jun Cho

TL;DR

CSF-Net tackles large-mask inpainting by supplying semantic priors through an amodal completion model and fusing them with contextual features via a dual-encoder Swin Transformer to produce a semantic guidance image. This guidance reduces object hallucination and improves structural and semantic fidelity across diverse masks and datasets, while requiring no changes to existing inpainting architectures. The approach combines structure-aware candidate generation, transformer-based fusion, and hierarchical pixel selection with carefully designed losses to ensure cross-scale consistency. Empirical results on Places365 and COCOA show robust improvements over state-of-the-art baselines, highlighting the method's practicality and scalability for real-world inpainting tasks.

Abstract

In this paper, we propose a semantic-guided framework to address the challenging problem of large-mask image inpainting, where essential visual content is missing and contextual cues are limited. To compensate for the limited context, we leverage a pretrained Amodal Completion (AC) model to generate structure-aware candidates that serve as semantic priors for the missing regions. We introduce Context-Semantic Fusion Network (CSF-Net), a transformer-based fusion framework that fuses these candidates with contextual features to produce a semantic guidance image for image inpainting. This guidance improves inpainting quality by promoting structural accuracy and semantic consistency. CSF-Net can be seamlessly integrated into existing inpainting models without architectural changes and consistently enhances performance across diverse masking conditions. Extensive experiments on the Places365 and COCOA datasets demonstrate that CSF-Net effectively reduces object hallucination while enhancing visual realism and semantic alignment. The code for CSF-Net is available at https://github.com/chaeyeonheo/CSF-Net.
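
As a rough illustration of what "integrated into existing inpainting models without architectural changes" could look like in practice, the sketch below pastes a guidance image into the masked region and hands the result to an unmodified inpainting function. This is only an assumption about the interface, not the authors' implementation: the names `compose_guided_input`, `inpaint_with_guidance`, and `inpaint_fn` are hypothetical, and the mask convention (1 = missing pixel) is chosen for the example.

```python
import numpy as np


def compose_guided_input(image, mask, guidance):
    """Fill the masked region of `image` with pixels from `guidance`.

    image, guidance: float arrays of shape (H, W, 3).
    mask: array of shape (H, W); 1 marks a missing pixel, 0 a known pixel
          (this convention is an assumption made for the sketch).
    """
    if image.shape != guidance.shape:
        raise ValueError("image and guidance must have the same shape")
    if mask.shape != image.shape[:2]:
        raise ValueError("mask must match the spatial size of the image")
    m = mask.astype(image.dtype)[..., None]  # (H, W, 1), broadcasts over channels
    return image * (1.0 - m) + guidance * m


def inpaint_with_guidance(inpaint_fn, image, mask, guidance):
    """Call an existing inpainting function on the guided input.

    `inpaint_fn(image, mask)` stands in for any off-the-shelf inpainting
    model (hypothetical placeholder); it is called unchanged.
    """
    guided = compose_guided_input(image, mask, guidance)
    return inpaint_fn(guided, mask)
```

Because the guidance only changes the input pixels, the downstream model keeps its original signature, which is one plausible reading of the plug-in claim above.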
