Active ML for 6G: Towards Efficient Data Generation, Acquisition, and Annotation
Omar Alhussein, Ning Zhang, Sami Muhaidat, Weihua Zhuang
TL;DR
The paper addresses the data bottleneck in ML for 6G networks by advocating a network-centric active learning framework that jointly considers data acquisition and labeling. It proposes integrating active learning with generative AI and digital twins to generate informative synthetic data and diverse scenarios, improving generalization and reducing labeling costs. A mmWave throughput prediction case study demonstrates gains in data efficiency, and the work discusses a broad set of 6G use cases and future research directions, including distributed learning and human-in-the-loop considerations. The approach promises more adaptable, efficient, and intelligent 6G networks, with practical impact on data efficiency, maintenance, and URLLC performance.
Abstract
This paper explores the integration of active machine learning (ML) into 6G networks, an area that remains under-explored despite its potential. Unlike passive ML systems, active ML can interact with the network environment. It actively selects informative and representative data points for training, thereby reducing the volume of data needed while accelerating the learning process. While active learning research mainly focuses on data annotation, we call for a network-centric active learning framework that considers both annotation (i.e., what the label is) and data acquisition (i.e., which and how many samples to collect). Moreover, we explore the synergy between generative artificial intelligence (AI) and active learning to overcome existing limitations in both. This paper also features a case study on a mmWave throughput prediction problem to demonstrate the practical benefits and improved performance of active learning for 6G networks. Furthermore, we discuss how the implications of active learning extend to numerous 6G network use cases. We highlight the potential of active-learning-based 6G networks to enhance computational efficiency, data annotation and acquisition efficiency, adaptability, and overall network intelligence. We conclude with a discussion of challenges and future research directions for active learning in 6G networks, including the development of novel query strategies, integration with distributed learning, and the inclusion of human- and machine-in-the-loop learning.
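To make the idea of selecting informative data points concrete, the following is a minimal sketch of one common query strategy, entropy-based uncertainty sampling over an unlabeled pool. It is a generic illustration and not necessarily the strategy used in the paper's case study. The function names (`entropy`, `select_queries`), the three throughput classes, and the probability values are all hypothetical and chosen only for illustration.

```python
import math


def entropy(probs):
    """Shannon entropy (in nats) of a discrete probability vector."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def select_queries(pool_probs, budget):
    """Return indices of the `budget` pool samples with the highest
    predictive entropy, i.e. those the model is least certain about."""
    ranked = sorted(
        range(len(pool_probs)),
        key=lambda i: entropy(pool_probs[i]),
        reverse=True,
    )
    return ranked[:budget]


# Hypothetical model outputs for four unlabeled samples over three
# illustrative throughput classes (low, medium, high).
pool_probs = [
    [0.98, 0.01, 0.01],  # confident prediction
    [0.34, 0.33, 0.33],  # nearly uniform, most uncertain
    [0.70, 0.20, 0.10],
    [0.50, 0.49, 0.01],
]

# With a labeling budget of 2, only the two most uncertain samples
# would be sent for annotation.
queries = select_queries(pool_probs, budget=2)
print(queries)
```

In a full active learning loop, the selected samples would be labeled, added to the training set, and the model retrained before the next round of selection. A network-centric variant in the sense of the abstract would also decide which and how many samples to collect in the first place, which this sketch does not cover.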
