Dual-Domain Fusion for Semi-Supervised Learning
Tuomas Jalonen, Mohammad Al-Sa'd, Serkan Kiranyaz, Moncef Gabbouj
TL;DR
This work tackles the challenge of limited labeled data for time-series classification by introducing Dual-Domain Fusion (DDF), a model-agnostic semi-supervised framework that leverages both time-domain and time-frequency representations through a tri-model architecture. During training, separate time-domain and time-frequency classifiers collaborate with a fusion module to produce high-quality pseudo-labels, improving learning from unlabeled data, while inference remains efficient by using only the time-domain path. The approach is demonstrated on two bearing fault datasets (KAIST and SQV), where DDF achieves substantial accuracy gains (8–46%) over strong SSL baselines across varying amounts of unlabeled data and noise levels. The deployment strategy further enables cloud-based training with edge-friendly inference, making DDF particularly suitable for real-time fault diagnosis in resource-constrained environments. Overall, DDF provides a general, deployment-aware strategy to harness cross-domain information for SSL in time-series applications.
Abstract
Labeled time-series data is often expensive and difficult to obtain, making it challenging to train accurate machine learning models for real-world applications such as anomaly detection or fault diagnosis. The scarcity of labeled samples limits model generalization and leaves valuable unlabeled data underutilized. We propose Dual-Domain Fusion (DDF), a new model-agnostic semi-supervised learning (SSL) framework applicable to any time-series signal. DDF performs dual-domain training by combining one-dimensional time-domain signals with their two-dimensional time-frequency representations and fusing the two to maximize learning performance. Its tri-model architecture consists of time-domain, time-frequency, and fusion components, enabling the model to exploit complementary information across domains during training. To support practical deployment, DDF maintains the same inference cost as standard time-domain models by discarding the time-frequency and fusion branches at test time. Experimental results on two public fault diagnosis datasets demonstrate substantial accuracy improvements of 8–46% over widely used SSL methods, namely FixMatch, MixMatch, Mean Teacher, Adversarial Training, and Self-training. These results show that DDF provides an effective and generalizable strategy for semi-supervised time-series classification.
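To make the dual-domain idea concrete, the following is a minimal illustrative sketch and not the authors' implementation. It shows (i) how a one-dimensional signal can be mapped to a two-dimensional time-frequency representation and (ii) how class probabilities from a time-domain model and a time-frequency model could be combined into pseudo-labels for unlabeled samples. The function names `spectrogram` and `fuse_and_pseudo_label`, the Hann-windowed magnitude STFT, the simple probability averaging used in place of the fusion component, and the confidence threshold of 0.9 are all assumptions made for illustration; the abstract does not specify these details.

```python
import numpy as np


def spectrogram(signal, frame_len=64, hop=32):
    """Magnitude STFT with a Hann window (one possible time-frequency map)."""
    window = np.hanning(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop
    # Slice the signal into overlapping, windowed frames.
    frames = np.stack(
        [signal[i * hop : i * hop + frame_len] * window for i in range(n_frames)]
    )
    # Real FFT per frame; transpose to (frequency_bins, time_frames).
    return np.abs(np.fft.rfft(frames, axis=1)).T


def fuse_and_pseudo_label(p_time, p_tf, threshold=0.9):
    """Average two probability matrices (assumed fusion rule) and keep
    only the predictions whose fused confidence reaches the threshold."""
    p_fused = 0.5 * (p_time + p_tf)
    labels = p_fused.argmax(axis=1)
    mask = p_fused.max(axis=1) >= threshold
    return labels, mask


# Hypothetical unlabeled signal: a noisy sinusoid of 256 samples.
rng = np.random.default_rng(0)
x = np.sin(2 * np.pi * 0.05 * np.arange(256)) + 0.1 * rng.standard_normal(256)
tf_image = spectrogram(x)  # 2-D input for the time-frequency model

# Hypothetical softmax outputs of the two models for two unlabeled samples.
p_time = np.array([[0.95, 0.05], [0.60, 0.40]])
p_tf = np.array([[0.90, 0.10], [0.30, 0.70]])
labels, mask = fuse_and_pseudo_label(p_time, p_tf)
```

In this sketch only the first sample, on which the two domains agree with high confidence, would be retained as a pseudo-label. At inference, consistent with the abstract, only the time-domain model would be evaluated, so neither `spectrogram` nor the fusion step adds test-time cost.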
