Parameter Aware Mamba Model for Multi-task Dense Prediction

Xinzhuo Yu; Yunzhi Zhuge; Sitong Gong; Lu Zhang; Pingping Zhang; Huchuan Lu

TL;DR

PAMM addresses global task interaction in multi-task dense prediction by combining a parameter-aware Mamba block with Mixture-of-Experts (MoE) in the decoder, supported by the structured state space sequence model (S4) and a Multi-Directional Hilbert Scanning (MDHS) scheme. Task priors are introduced as per-task parameters, so that the intrinsic properties of each task guide decoding. Built on a Vision Transformer backbone, the architecture reports superior $\Delta_g$ on NYUD-v2 and PASCAL-Context, and ablations confirm the contributions of MoE, the task priors, and MDHS. Together, these components yield globally coherent, prior-guided, task-conditioned dense predictions.

Abstract

Understanding the inter-relations and interactions between tasks is crucial for multi-task dense prediction. Existing methods predominantly utilize convolutional layers and attention mechanisms to explore task-level interactions. In this work, we introduce a novel decoder-based framework, Parameter Aware Mamba Model (PAMM), specifically designed for dense prediction in the multi-task learning setting. Distinct from approaches that employ Transformers to model holistic task relationships, PAMM leverages the rich, scalable parameters of state space models to enhance task interconnectivity. It features dual state space parameter experts that integrate and set task-specific parameter priors, capturing the intrinsic properties of each task. This approach not only facilitates precise multi-task interactions but also allows for the global integration of task priors through the structured state space sequence model (S4). Furthermore, we employ the Multi-Directional Hilbert Scanning method to construct multi-angle feature sequences, thereby enhancing the sequence model's perceptual capabilities for 2D data. Extensive experiments on the NYUD-v2 and PASCAL-Context benchmarks demonstrate the effectiveness of our proposed method. Our code is available at https://github.com/CQC-gogopro/PAMM.
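
To make the idea of scanning a 2D feature map along a Hilbert curve more concrete, the following is a minimal sketch and not the authors' implementation. It uses the standard index-to-coordinate Hilbert construction and assumes a square feature map whose side is a power of two. The function names `hilbert_coords` and `hilbert_scan_variants` are hypothetical, and the choice of four directions (base, transposed, and their reversals) is an assumption made for illustration; the abstract does not specify how the multiple directions are formed.

```python
import numpy as np


def hilbert_coords(n):
    """Return the (x, y) cells of an n x n grid in Hilbert-curve order.

    n must be a power of two. This is the standard iterative
    distance-to-coordinate construction of the Hilbert curve.
    """
    if n < 1 or n & (n - 1):
        raise ValueError("n must be a positive power of two")
    coords = []
    for d in range(n * n):
        x = y = 0
        t = d
        s = 1
        while s < n:
            rx = 1 & (t // 2)
            ry = 1 & (t ^ rx)
            if ry == 0:
                # Rotate/flip the current quadrant so sub-curves connect.
                if rx == 1:
                    x = s - 1 - x
                    y = s - 1 - y
                x, y = y, x
            x += s * rx
            y += s * ry
            t //= 4
            s *= 2
        coords.append((x, y))
    return coords


def hilbert_scan_variants(feat):
    """Flatten an (H, W, C) feature map into several 1D token sequences.

    Assumption (illustrative only): the 'directions' are the base Hilbert
    order, its reversal, and the same two orders on the transposed grid.
    Each returned array has shape (H * W, C).
    """
    h, w = feat.shape[:2]
    if h != w:
        raise ValueError("this sketch assumes a square feature map")
    xy = np.array(hilbert_coords(h))
    rows, cols = xy[:, 1], xy[:, 0]
    base = feat[rows, cols]          # follow the curve on the original grid
    transposed = feat[cols, rows]    # follow the curve on the transposed grid
    return [base, base[::-1], transposed, transposed[::-1]]
```

Because consecutive positions on the curve are always neighbouring cells, each sequence keeps spatially close features close together in 1D, which is the usual motivation for Hilbert ordering over row-major flattening. Each variant would then be fed to the sequence model as a separate token sequence.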
