GraphTEN: Graph Enhanced Texture Encoding Network

Bo Peng, Jintao Chen, Mufeng Yao, Chenhao Zhang, Jianghui Zhang, Mingmin Chi, Jiang Tao

TL;DR

This work tackles robust texture recognition under irregular primitive distributions by introducing GraphTEN, which combines context-aware graphs with a cross-scale bipartite graph and a patch-encoding module. The approach transforms CNN features into a graph representation to capture local and global texture relations and uses a codebook-based patch encoding to achieve orderless multi-scale texture representations. Empirical results on six datasets demonstrate state-of-the-art performance on five benchmarks, highlighting the value of graph-based correlation and patch-based encoding for texture analysis. The work advances texture representation with scalable, non-local modeling suitable for real-world material and texture understanding.
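As a rough illustration of turning CNN features into graphs, the sketch below treats every spatial position of a feature map as a node and aggregates information over dense edges. It is a minimal NumPy sketch under stated assumptions, not the authors' implementation: the softmax dot-product edge weights, the residual update, and the helper names `feature_map_to_nodes` and `graph_aggregate` are all hypothetical choices made for illustration. Using the same node set as source and target gives a fully connected graph; using node sets from two scales gives a bipartite graph.

```python
import numpy as np

def feature_map_to_nodes(feature_map):
    """Flatten a (C, H, W) feature map into (H*W, C) graph nodes."""
    channels, height, width = feature_map.shape
    return feature_map.reshape(channels, height * width).T

def graph_aggregate(target_nodes, source_nodes):
    """Update each target node with a weighted sum of all source nodes.

    Edge weights are a softmax over scaled dot products (an assumption;
    the source does not specify how edges are weighted).
    """
    dim = target_nodes.shape[1]
    scores = target_nodes @ source_nodes.T / np.sqrt(dim)
    scores = scores - scores.max(axis=1, keepdims=True)  # numerical stability
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=1, keepdims=True)  # rows sum to 1
    # Residual update: keep the original node and add the aggregated context.
    return target_nodes + weights @ source_nodes, weights

rng = np.random.default_rng(0)
fine = feature_map_to_nodes(rng.standard_normal((8, 6, 6)))    # 36 nodes
coarse = feature_map_to_nodes(rng.standard_normal((8, 3, 3)))  # 9 nodes

# Fully connected graph: every fine node attends to every fine node.
global_nodes, global_weights = graph_aggregate(fine, fine)
# Bipartite graph: fine nodes attend only to nodes of the coarser scale.
cross_nodes, cross_weights = graph_aggregate(fine, coarse)
```

The dense weight matrix makes the non-local character explicit: no node pair is excluded by spatial distance, at a cost that grows quadratically with the number of nodes.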

Abstract

Texture recognition is a fundamental problem in computer vision and pattern recognition. Recent progress leverages feature aggregation into discriminative descriptions based on convolutional neural networks (CNNs). However, modeling non-local context relations through visual primitives remains challenging due to the variability and randomness of texture primitives in spatial distributions. In this paper, we propose a graph-enhanced texture encoding network (GraphTEN) designed to capture both local and global features of texture primitives. GraphTEN models global associations through fully connected graphs and captures cross-scale dependencies of texture primitives via bipartite graphs. Additionally, we introduce a patch encoding module that utilizes a codebook to achieve an orderless representation of texture by encoding multi-scale patch features into a unified feature space. The proposed GraphTEN achieves superior performance compared to state-of-the-art methods across five publicly available datasets.
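To make the idea of an orderless, codebook-based representation concrete, the following NumPy sketch pools patch features from two scales against a shared codebook. It is only a sketch under stated assumptions: the soft-assignment residual encoding used here is a common choice for codebook pooling and is swapped in for illustration, since the abstract does not give the exact formulation; the function name `encode_patches`, the `smoothing` parameter, and all shapes are hypothetical.

```python
import numpy as np

def encode_patches(patch_features, codebook, smoothing=1.0):
    """Aggregate (N, D) patch features into a fixed (K, D) encoding.

    Each patch is softly assigned to the K codewords by distance, and the
    weighted residuals are summed over patches, so patch order is discarded.
    """
    residuals = patch_features[:, None, :] - codebook[None, :, :]  # (N, K, D)
    squared_dist = (residuals ** 2).sum(axis=2)                    # (N, K)
    logits = -smoothing * squared_dist
    logits = logits - logits.max(axis=1, keepdims=True)            # stability
    assignment = np.exp(logits)
    assignment = assignment / assignment.sum(axis=1, keepdims=True)
    return (assignment[:, :, None] * residuals).sum(axis=0)        # (K, D)

rng = np.random.default_rng(0)
codebook = rng.standard_normal((4, 8))          # K=4 codewords, D=8 (assumed)
patches_scale_a = rng.standard_normal((36, 8))  # patches from a finer scale
patches_scale_b = rng.standard_normal((9, 8))   # patches from a coarser scale

# Patches from both scales share one codebook, hence one feature space.
patches = np.concatenate([patches_scale_a, patches_scale_b], axis=0)
encoding = encode_patches(patches, codebook)

# Shuffling the patches leaves the encoding unchanged (orderless).
shuffled = encode_patches(rng.permutation(patches, axis=0), codebook)
```

Because the sum runs over patches, the output size depends only on the codebook, not on how many patches each scale contributes.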

Paper Structure

This paper contains 10 sections, 10 equations, 4 figures, 3 tables.

Figures (4)

  • Figure 1: Texture primitives demonstrate significant local and global associations across multiple scales, with solid lines representing local dependencies and dashed lines capturing non-local relationships.
  • Figure 2: Overall architecture of the proposed graph-enhanced texture encoding network (GraphTEN).
  • Figure 3: Comparing our method with benchmarks via confusion matrix.
  • Figure 4: Confusion analysis. Samples in the top row are misclassified as the class of the corresponding samples in the bottom row.