Frequency Composition for Compressed and Domain-Adaptive Neural Networks
Yoojin Kwon, Hongjun Suh, Wooseok Lee, Taesik Gong, Songyi Han, Hyung-Sin Kim
TL;DR
The paper tackles the dual challenge of resource-constrained on-device inference and dynamic domain shifts by introducing CoDA, a frequency-aware framework that unifies compression and domain adaptation. It trains quantized models on low-frequency components (LFC) using quantization-aware training (LFC QAT), then refines them at test time with Frequency-Aware BN (FABN), which uses full-frequency features to adapt to target domains. The approach yields substantial gains on CIFAR10-C and ImageNet-C across architectures and bitwidths while achieving 4–16x model compression, and it remains compatible with standard QAT and TTA methods. The results indicate that separating learning and normalization by frequency component improves robustness to domain shifts and enables effective continual adaptation for on-device systems. Overall, CoDA demonstrates a practical pathway to deploying highly compressed, domain-resilient neural networks in real-world, dynamic environments.
Abstract
Modern on-device neural network applications must operate under resource constraints while adapting to unpredictable domain shifts. However, this combined challenge of model compression and domain adaptation remains largely unaddressed, as prior work has tackled each issue in isolation: compressed networks prioritize efficiency within a fixed domain, whereas large, capable models focus on handling domain shifts. In this work, we propose CoDA, a frequency composition-based framework that unifies compression and domain adaptation. During training, CoDA employs quantization-aware training (QAT) with low-frequency components (LFC), enabling a compressed model to selectively learn robust, generalizable features. At test time, it refines the compact model in a source-free manner (i.e., test-time adaptation, TTA), leveraging the full-frequency information from incoming data to adapt to target domains while treating high-frequency components (HFC) as domain-specific cues. LFC are aligned with the training distribution, while HFC unique to the target distribution are used solely for batch normalization. CoDA can be integrated synergistically into existing QAT and TTA methods. CoDA is evaluated on widely used domain-shift benchmarks, including CIFAR10-C and ImageNet-C, across various model architectures. With significant compression, it achieves accuracy improvements of 7.96%p on CIFAR10-C and 5.37%p on ImageNet-C over the full-precision TTA baseline.
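
As a rough illustration of the frequency split that the abstract relies on, the sketch below separates a single-channel image into low-frequency and high-frequency components using an FFT low-pass mask. The circular mask, the `radius` cutoff, and the function name `split_frequency` are assumptions made for illustration only; the abstract does not specify how CoDA performs the decomposition.

```python
import numpy as np


def split_frequency(image: np.ndarray, radius: float) -> tuple[np.ndarray, np.ndarray]:
    """Split a 2-D image into low- and high-frequency parts (illustrative only).

    `radius` is a hypothetical cutoff, in frequency bins, measured from the
    zero-frequency (DC) term. It is not a parameter taken from the paper.
    """
    h, w = image.shape
    # Move the zero-frequency term to the center of the spectrum.
    spectrum = np.fft.fftshift(np.fft.fft2(image))

    # Circular low-pass mask centered on the DC bin at (h // 2, w // 2).
    yy, xx = np.ogrid[:h, :w]
    dist = np.sqrt((yy - h // 2) ** 2 + (xx - w // 2) ** 2)
    mask = dist <= radius

    # Low-frequency component: inverse transform of the masked spectrum.
    lfc = np.fft.ifft2(np.fft.ifftshift(spectrum * mask)).real
    # High-frequency component: whatever the low-pass filter removed.
    hfc = image - lfc
    return lfc, hfc


# Example on a random 32x32 "image".
rng = np.random.default_rng(0)
image = rng.random((32, 32))
lfc, hfc = split_frequency(image, radius=8)
```

Under this sketch, `lfc` would play the role of the training-time input, while the full image `lfc + hfc` would be what is available at test time. Defining `hfc` as the residual keeps the two components an exact decomposition of the input.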
