MIDAS: Mosaic Input-Specific Differentiable Architecture Search
Konstanty Subbotko
TL;DR
MIDAS addresses instability and lack of input specificity in differentiable NAS by replacing static architecture parameters with input-conditioned ones computed via patchwise self-attention. It introduces a mosaic architecture that localizes per-input choices to spatial patches of the activation map, and a topology-aware, parameter-free search space for selecting each node's two incoming edges, while maintaining DARTS-like efficiency. Across the NAS-Bench-201, DARTS, and RDARTS search spaces, MIDAS achieves state-of-the-art or near-optimal results on CIFAR-10 and CIFAR-100 and transfers to ImageNet. Analyses show that the resulting architecture distributions are stable, class-aware, and predominantly unimodal. The approach offers a scalable, robust path to automated architecture search that exploits local context and node topology, with a topology search space that adds no parameters, and it may extend to hardware-aware NAS and to broader search spaces and tasks.
Abstract
Differentiable Neural Architecture Search (NAS) provides efficient, gradient-based methods for automatically designing neural networks, yet its adoption remains limited in practice. We present MIDAS, a novel approach that modernizes DARTS by replacing static architecture parameters with dynamic, input-specific parameters computed via self-attention. To improve robustness, MIDAS (i) localizes the architecture selection by computing it separately for each spatial patch of the activation map, and (ii) introduces a parameter-free, topology-aware search space that models node connectivity and simplifies selecting the two incoming edges per node. We evaluate MIDAS on the DARTS, NAS-Bench-201, and RDARTS search spaces. In DARTS, it reaches 97.42% top-1 on CIFAR-10 and 83.38% on CIFAR-100. In NAS-Bench-201, it consistently finds globally optimal architectures. In RDARTS, it sets the state of the art on two of four search spaces on CIFAR-10. We further analyze why MIDAS works, showing that patchwise attention improves discrimination among candidate operations, and the resulting input-specific parameter distributions are class-aware and predominantly unimodal, providing reliable guidance for decoding.
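To make the idea of input-specific, patchwise architecture parameters concrete, here is a minimal sketch in plain Python. It is an illustration under stated assumptions, not the authors' implementation: the function names (patchwise_mix, decode), the candidate set OPS, the use of each operation's output as both key and value, and the averaging rule used for decoding are all hypothetical choices. The abstract does not specify how queries and keys are produced, so the sketch takes them as given vectors.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def patchwise_mix(patch_queries, op_outputs):
    """Return per-patch operation weights and attention-weighted outputs.

    patch_queries[p] : query vector for patch p (assumed to be given)
    op_outputs[p][o] : output vector of candidate operation o on patch p,
                       used here as both key and value (an assumption)
    """
    weights, mixed = [], []
    for query, outputs in zip(patch_queries, op_outputs):
        scale = math.sqrt(len(query))
        # Scaled dot-product scores of this patch against every candidate op.
        w = softmax([dot(query, out) / scale for out in outputs])
        weights.append(w)
        dim = len(outputs[0])
        # Weighted sum of candidate outputs replaces DARTS' static mixture.
        mixed.append([sum(w[o] * outputs[o][d] for o in range(len(outputs)))
                      for d in range(dim)])
    return weights, mixed

def decode(weights, op_names):
    """Pick the op with the highest weight averaged over patches (assumed rule)."""
    mean = [sum(w[o] for w in weights) / len(weights)
            for o in range(len(op_names))]
    return op_names[mean.index(max(mean))]

OPS = ["skip", "conv3x3", "avg_pool"]  # hypothetical candidate operations
patch_queries = [[1.0, 0.0], [0.0, 1.0]]  # two patches, toy 2-d descriptors
op_outputs = [
    [[0.0, 0.0], [2.0, 0.0], [0.0, 2.0]],  # op outputs on patch 0
    [[0.0, 0.0], [2.0, 1.0], [1.0, 0.0]],  # op outputs on patch 1
]
weights, mixed = patchwise_mix(patch_queries, op_outputs)
choice = decode(weights, OPS)
```

In this toy setup each patch receives its own distribution over operations, which is the "mosaic" aspect: the weights for patch 0 differ from those for patch 1 because they depend on the patch content rather than on a single shared parameter vector. Decoding by averaging the per-patch distributions is only one plausible rule; it is reliable mainly when those distributions are unimodal, which is the property the analysis above reports.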
