FAST: Federated Active Learning with Foundation Models for Communication-efficient Sampling and Training

Haoyuan Li, Mathias Funk, Jindong Wang, Aaqib Saeed

TL;DR

The paper tackles the high annotation costs and communication overhead of Federated Active Learning (FAL) in cross-silo settings. It introduces FAST, a two-pass framework: a first pass uses a frozen foundation-model encoder to produce prototype-based weak labels, and a second pass sends only the most uncertain samples to human annotators for refinement, which removes the need for repeated sampling rounds. The approach is formalized with an FL objective and a detailed two-pass labeling pipeline. Across diverse natural and medical image datasets, FAST improves accuracy over existing FAL methods by an average of $4.36\%$ while using $8\times$ fewer communication rounds. The findings highlight the value of foundation-model representations for communication-efficient FAL and point to future work on query strategies and post-label filtering for further efficiency gains.

Abstract

Federated Active Learning (FAL) has emerged as a promising framework to leverage large quantities of unlabeled data across distributed clients while preserving data privacy. However, real-world deployments remain limited by high annotation costs and communication-intensive sampling processes, particularly in a cross-silo setting, where clients possess substantial local datasets. This paper addresses the crucial question: What is the best practice to reduce communication costs in human-in-the-loop learning with minimal annotator effort? Existing FAL methods typically rely on iterative annotation processes that separate active sampling from federated updates, leading to multiple rounds of expensive communication and annotation. In response, we introduce FAST, a two-pass FAL framework that harnesses foundation models for weak labeling in a preliminary pass, followed by a refinement pass focused exclusively on the most uncertain samples. By leveraging representation knowledge from foundation models and integrating refinement steps into a streamlined workflow, FAST substantially reduces the overhead incurred by iterative active sampling. Extensive experiments on diverse medical and natural image benchmarks demonstrate that FAST outperforms existing FAL methods by an average of 4.36% while reducing communication rounds eightfold under a limited 5% labeling budget.
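
The two-pass idea can be illustrated with a minimal single-client sketch. This is not the authors' implementation, and several details are assumptions made only for illustration: the embeddings stand in for the output of a frozen foundation-model encoder, class prototypes are taken as the mean embedding of a small labeled seed set, weak labels come from cosine similarity to those prototypes, and uncertainty is measured as the margin between the two most similar prototypes. All function names (`class_prototypes`, `weak_label`, `refine`) and the `oracle` callback are hypothetical.

```python
import numpy as np


def l2_normalize(x, eps=1e-12):
    """Scale each row to unit length so dot products become cosine similarities."""
    return x / (np.linalg.norm(x, axis=1, keepdims=True) + eps)


def class_prototypes(seed_emb, seed_labels, num_classes):
    """Assumed prototype rule: mean embedding of the labeled seed samples per class."""
    protos = np.stack(
        [seed_emb[seed_labels == c].mean(axis=0) for c in range(num_classes)]
    )
    return l2_normalize(protos)


def weak_label(emb, protos):
    """Pass 1: assign each sample the label of its most similar prototype.

    Also returns the margin between the best and second-best similarity;
    a small margin is treated as high uncertainty.
    """
    sims = l2_normalize(emb) @ protos.T
    labels = sims.argmax(axis=1)
    top2 = np.sort(sims, axis=1)[:, -2:]
    margin = top2[:, 1] - top2[:, 0]
    return labels, margin


def refine(weak_labels, margin, oracle, budget_frac=0.05):
    """Pass 2: ask the oracle (human annotator) about the least certain samples."""
    n_query = int(round(budget_frac * len(weak_labels)))
    query_idx = np.argsort(margin)[:n_query]  # smallest margins first
    refined = weak_labels.copy()
    refined[query_idx] = oracle(query_idx)
    return refined, query_idx


# Synthetic stand-in for frozen-encoder embeddings: three noisy clusters.
rng = np.random.default_rng(0)
num_classes, dim, n = 3, 16, 200
centers = rng.normal(size=(num_classes, dim))
true = rng.integers(0, num_classes, size=n)
emb = centers[true] + 0.8 * rng.normal(size=(n, dim))

# Five labeled seed samples per class (an assumption of this sketch).
seed_idx = np.concatenate(
    [np.flatnonzero(true == c)[:5] for c in range(num_classes)]
)
protos = class_prototypes(emb[seed_idx], true[seed_idx], num_classes)

weak, margin = weak_label(emb, protos)
refined, queried = refine(weak, margin, oracle=lambda idx: true[idx])

print("weak-label accuracy:   ", (weak == true).mean())
print("refined-label accuracy:", (refined == true).mean())
print("samples sent to oracle:", len(queried))
```

Because both passes run locally on fixed embeddings, no model exchange is needed while labels are being selected; federated training would start only once the refined label set is ready. In this sketch, that is where the saving in communication rounds would come from.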

Paper Structure

This paper contains 14 sections, 8 equations, 4 figures, 7 tables, 1 algorithm.

Figures (4)

  • Figure 1: Overview of FAST. FAST is a communication-efficient FAL framework that employs a two-pass labeling strategy. In the first pass, foundation models perform weak sampling to identify informative data points with minimal communication overhead. In the subsequent pass, human annotators refine the labeled ground truth dataset by validating and correcting the sampled labels, ensuring high-quality annotations.
  • Figure 2: Experimental comparison of our method with existing approaches on CIFAR-10 and CIFAR-100 under a Non-IID data distribution. Other AL methods begin by randomly selecting 10% of the initial data, followed by 50 communication rounds of training after each AL sampling step until reaching a 35% labeling budget. In contrast, our method completes training at 100 rounds and achieves its highest performance (indicated by the grey line). Refer to Appendix Figure \ref{exp:acc_vs_com_full} for results over the entire 300-round training process.
  • Figure 3: Experimental comparison of our method with existing approaches on the CIFAR-10 and CIFAR-100 datasets under a Non-IID data distribution. For other active learning (AL) methods, the process begins by randomly selecting 10% of the initial data, followed by training with 50 communication rounds after each AL sampling step until a labeling budget of 35% is reached.
  • Figure 4: Performance of FAST across 10, 20, and 30 clients on CIFAR-10/100 under FedAvg with 150 FL rounds.