KARMA: Efficient Structural Defect Segmentation via Kolmogorov-Arnold Representation Learning

Md Meftahul Ferdaus, Mahdi Abdelguerfi, Elias Ioup, Steven Sloan, Kendall N. Niles, Ken Pathak

TL;DR

KARMA addresses the challenge of accurate, pixel-level structural defect segmentation under variable appearance and severe class imbalance, by integrating Kolmogorov–Arnold representation learning into an adaptive feature pyramid framework. The core method, TiKAN, employs a low-rank base transformation and learnable spline nonlinearities to realize parameter-efficient KA representations within an AFPN backbone, augmented by a static–dynamic prototype mechanism for robust minority-class handling. Empirical results on S2DS and CSDD show KARMA achieving competitive or superior mean IoU with up to ~97% fewer parameters and 0.264 GFLOPS, while delivering real-time inference (e.g., 78.1 FPS on GPU) and favorable edge-device memory usage. Real-world hardware validation on NVIDIA Jetson platforms confirms deployment readiness for automated inspection systems, with ablations confirming the importance of low-rank adaptation, separable convolutions, and the prototype mechanism for efficiency and accuracy. The work suggests a practical and scalable pathway for on-device structural defect analysis using KA-based representations, offering a compelling alternative to heavier CNN/Transformer-based models in resource-constrained settings.
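To make the phrase "low-rank base transformation and learnable spline nonlinearities" more concrete, the following is a minimal NumPy sketch of a KAN-style layer. It is not the authors' TiKAN module. The class name `LowRankKANLayer`, the rank and basis sizes, the SiLU base activation, and the use of Gaussian bumps in place of B-splines are all assumptions made for illustration.

```python
import numpy as np

class LowRankKANLayer:
    """Toy KAN-style layer (hypothetical): low-rank base path plus
    learnable one-dimensional nonlinearities per input feature."""

    def __init__(self, d_in, d_out, rank=4, n_basis=8, seed=0):
        rng = np.random.default_rng(seed)
        # Low-rank factors standing in for a dense (d_in x d_out) base weight.
        self.U = rng.normal(0.0, 0.1, (d_in, rank))
        self.V = rng.normal(0.0, 0.1, (rank, d_out))
        # Fixed grid of basis centres on [-1, 1]. Gaussian bumps are used
        # here as a simple stand-in for B-spline basis functions.
        self.centres = np.linspace(-1.0, 1.0, n_basis)
        self.width = 2.0 / (n_basis - 1)
        # One coefficient vector per input feature (trainable in a real model).
        self.coef = rng.normal(0.0, 0.1, (d_in, n_basis))
        # Low-rank mixing of the per-feature nonlinear outputs into d_out channels.
        self.Us = rng.normal(0.0, 0.1, (d_in, rank))
        self.Vs = rng.normal(0.0, 0.1, (rank, d_out))

    def forward(self, x):
        # x has shape (batch, d_in).
        silu = x / (1.0 + np.exp(-x))
        base = silu @ self.U @ self.V
        # basis has shape (batch, d_in, n_basis).
        basis = np.exp(-(((x[..., None] - self.centres) / self.width) ** 2))
        # phi[b, i] is the learned 1-D function of feature i evaluated at x[b, i].
        phi = np.einsum("bik,ik->bi", basis, self.coef)
        return base + phi @ self.Us @ self.Vs

    def num_params(self):
        arrays = (self.U, self.V, self.coef, self.Us, self.Vs)
        return sum(a.size for a in arrays)


layer = LowRankKANLayer(d_in=64, d_out=32)
out = layer.forward(np.random.default_rng(1).uniform(-1.0, 1.0, (5, 64)))
print(out.shape, layer.num_params())
```

In this sketch the parameter count grows with rank × (d_in + d_out) plus d_in × n_basis, rather than with d_in × d_out × n_basis as it would if every input-output edge carried its own spline. That is the general kind of saving a low-rank KAN formulation aims for, though the paper's exact parameterization may differ.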

Abstract

Semantic segmentation of structural defects in civil infrastructure remains challenging due to variable defect appearances, harsh imaging conditions, and significant class imbalance. Current deep learning methods, despite their effectiveness, typically require millions of parameters, rendering them impractical for real-time inspection systems. We introduce KARMA (Kolmogorov-Arnold Representation Mapping Architecture), a highly efficient semantic segmentation framework that models complex defect patterns through compositions of one-dimensional functions rather than conventional convolutions. KARMA features three technical innovations: (1) a parameter-efficient Tiny Kolmogorov-Arnold Network (TiKAN) module leveraging low-rank factorization for KAN-based feature transformation; (2) an optimized feature pyramid structure with separable convolutions for multi-scale defect analysis; and (3) a static-dynamic prototype mechanism that enhances feature representation for imbalanced classes. Extensive experiments on benchmark infrastructure inspection datasets demonstrate that KARMA achieves competitive or superior mean IoU performance compared to state-of-the-art approaches, while using significantly fewer parameters (0.959M vs. 31.04M, a 97% reduction). Operating at 0.264 GFLOPS, KARMA maintains inference speeds suitable for real-time deployment, enabling practical automated infrastructure inspection systems without compromising accuracy. The source code can be accessed at the following URL: https://github.com/faeyelab/karma.
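The efficiency claims in the abstract rest on simple parameter arithmetic, which the short sketch below works through. Only the 0.959M and 31.04M figures come from the abstract. The layer sizes (256, 64, 128), the rank of 8, and the 3×3 kernel are hypothetical values chosen for illustration, and bias terms are ignored.

```python
def dense_params(d_in, d_out):
    # Weights of a full linear map.
    return d_in * d_out

def low_rank_params(d_in, d_out, rank):
    # Same map factorized as (d_in x rank) @ (rank x d_out).
    return rank * (d_in + d_out)

def conv_params(k, c_in, c_out):
    # Standard k x k convolution.
    return k * k * c_in * c_out

def separable_conv_params(k, c_in, c_out):
    # Depthwise k x k convolution followed by a 1 x 1 pointwise convolution.
    return k * k * c_in + c_in * c_out

def reduction_percent(small, large):
    return 100.0 * (1.0 - small / large)

# Reported model sizes, in millions of parameters.
print(round(reduction_percent(0.959, 31.04), 1))   # 96.9

# Illustrative layer-level comparisons with assumed sizes.
print(dense_params(256, 256), low_rank_params(256, 256, rank=8))   # 65536 4096
print(conv_params(3, 64, 128), separable_conv_params(3, 64, 128))  # 73728 8768
```

The first line confirms that 0.959M against 31.04M is a reduction of about 96.9%, consistent with the rounded 97% quoted in the abstract.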

Paper Structure

This paper contains 46 sections, 16 equations, 5 figures, 6 tables.

Figures (5)

  • Figure 1: KARMA architecture overview showing the three main components: bottom-up pathway with InceptionSepConv blocks (c1-c5), TiKAN enhancement module at the deepest level (c5), and top-down pathway with feature fusion (p2-p5)
  • Figure 2: Performance-efficiency trade-offs for (a) S2DS and (b) CSDD datasets: parameter count vs. GFLOPS, colored by mIoU w/o bg.
  • Figure 3: Training performance comparison of different models: (a) Training IoU evolution over epochs, and (b) Training Loss convergence.
  • Figure 4: Runtime performance vs. accuracy trade-off comparison showing inference time (ms) vs. mIoU (excluding background) for semantic segmentation models on 512×512 images. Bubble size indicates GPU memory usage. Colors represent architecture types: CNNs (blue), efficient models (orange), transformers (green), KAN-based methods (red), and KARMA variants (pink).
  • Figure 5: Experimental validation of KARMA deployment on a real-world hardware platform.