MIXER: Mixed Hyperspherical Random Embedding Neural Network for Texture Recognition

Ricardo T. Fares, Lucas C. Ribas

TL;DR

Mixer tackles texture recognition with hyperspherical random embeddings inside a four-module pipeline that jointly captures intra- and inter-channel texture information. Its dual-branch Learning module (Direct and Mixed) performs both channel-wise reconstruction and cross-channel fusion, and a compression stage turns the learned weights into compact color-texture descriptors. Results on multiple texture benchmarks show strong performance across datasets, with high average accuracy and results competitive with the state of the art among handcrafted and other randomized methods. The work highlights the practical value of hyperspherical embeddings and cross-channel learning for robust texture representation and efficient descriptor construction.

Abstract

Randomized neural networks for representation learning have consistently achieved prominent results in texture recognition tasks, effectively combining the advantages of both traditional techniques and learning-based approaches. However, existing approaches have so far focused mainly on improving cross-information prediction, without introducing significant advancements to the overall randomized network architecture. In this paper, we propose Mixer, a novel randomized neural network for texture representation learning. At its core, the method leverages hyperspherical random embeddings coupled with a dual-branch learning module to capture both intra- and inter-channel relationships, further enhanced by a newly formulated optimization problem for building rich texture representations. Experiments show promising results for the proposed approach across several pure texture benchmarks, each with distinct characteristics and challenges. The source code will be available upon publication.

Paper Structure

This paper contains 11 sections, 19 equations, 10 figures, 2 tables, 1 algorithm.

Figures (10)

  • Figure 1: Overview of the Mixer pipeline. The input image $\mathbf{I} \in \mathbb{R}^{C \times H \times W}$ is fed to the Local Pattern Extractor (LPE) module, which pads the image and then extracts tiny patches to record the raw texture information. These patches are fed to the Hyperspherical Random Projector (HRP) module, which maps them into hyperspherical random embeddings that compose the random projected matrices $\mathbf{Z}''_i \in \mathbb{R}^{(\omega + 1) \times HW}$. The projected matrices are fed to both the Direct and Mixed branches, which learn the intra- and inter-channel local intensity relationships, respectively. The learned weights of the linear decoders from both branches are fed to the compression module, which vertically concatenates them into the aggregated learned weight matrix and then applies selected compression functions to compress this matrix into a useful color-texture representation.
  • Figure 2: Accuracy (%) behavior of the color-texture representation $\mathbf{\Omega}_{59}(\mathbf{I})$ for all the benchmark datasets as the regularization of the Direct and Mixed branches varies.
  • Figure 3: Accuracy (%) behavior of the color-texture representation $\mathbf{\Omega}_{59}(\mathbf{I})$ for the Outex & CUReT and USPtex & MBT dataset pairs as the regularization of the Direct and Mixed branches varies. The behavior is the average of the accuracies obtained by the datasets in each configuration. The inset plot in each figure shows the region near the highest average accuracy. Note: The color bar of the inset plot was adjusted to make the highest average accuracy easier to see.
  • Figure 4: Average accuracy (%) behavior of the proposed texture representation $\mathbf{\Omega}_{\omega}(\mathbf{I})$ when only one of the branches is used and when both are used.
  • Figure 5: Accuracy (%) behavior of the proposed texture representation $\mathbf{\Omega}_{\omega}(\mathbf{I})$ as the random embedding size $\omega$ varies in the defined parameter space. This behavior analysis is presented for all benchmark datasets.
  • ...and 5 more figures
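
The pipeline described in the Figure 1 caption can be illustrated with a rough NumPy sketch. This is not the authors' implementation (their code is not yet released), and several details are assumptions made only for illustration: the 3×3 patch size, the reflect padding, the Gaussian random weights, L2-normalizing each embedding and appending a bias row to obtain the $(\omega + 1) \times HW$ shape, the closed-form ridge decoder, averaging the other channels' embeddings in the Mixed branch, and using the row-wise mean and standard deviation as compression functions. All function names are hypothetical.

```python
import numpy as np

def extract_patches(channel, k=3):
    """LPE-like step: pad one channel and collect a k x k patch around every pixel."""
    pad = k // 2
    padded = np.pad(channel, pad, mode="reflect")  # assumed padding mode
    windows = np.lib.stride_tricks.sliding_window_view(padded, (k, k))  # (H, W, k, k)
    return windows.reshape(-1, k * k).T  # (k*k, H*W)

def hyperspherical_embed(patches, weights):
    """HRP-like step (assumed form): random projection, unit-norm columns, bias row."""
    z = weights @ patches  # (omega, H*W)
    norms = np.linalg.norm(z, axis=0, keepdims=True)
    z = z / np.maximum(norms, 1e-12)  # place each embedding on the unit hypersphere
    return np.vstack([z, np.ones((1, z.shape[1]))])  # (omega + 1, H*W)

def ridge_decoder(z, target, lam):
    """Closed-form ridge regression: B minimizing ||target - B z||^2 + lam ||B||^2."""
    gram = z @ z.T + lam * np.eye(z.shape[0])
    return np.linalg.solve(gram, z @ target.T).T  # (k*k, omega + 1)

def mixer_sketch(image, omega=59, k=3, lam_direct=1e-2, lam_mixed=1e-2, seed=0):
    """Toy descriptor for an image of shape (C, H, W); returns 2 * (omega + 1) values."""
    rng = np.random.default_rng(seed)
    weights = rng.standard_normal((omega, k * k))  # fixed random weights, never trained
    patches = [extract_patches(ch, k) for ch in image]
    embeds = [hyperspherical_embed(p, weights) for p in patches]

    learned = []
    for c, (p, z) in enumerate(zip(patches, embeds)):
        # Direct branch: reconstruct a channel's patches from its own embedding.
        learned.append(ridge_decoder(z, p, lam_direct))
        # Mixed branch (assumed form): reconstruct them from the other channels.
        others = [e for j, e in enumerate(embeds) if j != c]
        if others:
            learned.append(ridge_decoder(np.mean(others, axis=0), p, lam_mixed))

    stacked = np.vstack(learned)  # vertically concatenated decoder weights
    # Assumed compression functions: column-wise mean and standard deviation.
    return np.concatenate([stacked.mean(axis=0), stacked.std(axis=0)])
```

Calling `mixer_sketch(image)` on an array of shape `(C, H, W)` returns a fixed-length vector regardless of image size, which is what lets the decoder weights serve as a texture descriptor for a downstream classifier. The two regularization arguments correspond loosely to the per-branch regularization varied in Figures 2 and 3.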

Theorems & Definitions (1)

  • Definition 3.1: Compression Function