FL-APU: A Software Architecture to Ease Practical Implementation of Cross-Silo Federated Learning
F. Stricker, J. A. Peregrina, D. Bermbach, C. Zirpins
TL;DR
This work addresses the lack of practical guidance for cross-silo federated learning by proposing FL-APU, a scenario-based software architecture tailored for inter-organizational collaboration. It defines governance, authentication, and provenance-centric components across two main systems, the FL Server and the FL Client, along with role models, metadata management, and secure communication to support production-ready, privacy-preserving training. Key contributions include a detailed architecture with a governance cockpit and data validation, and a scenario-based evaluation using SAAM (Software Architecture Analysis Method) indicating that the design can support real-world cross-silo FL workflows in critical infrastructure contexts. The work aims to enable trustworthy, auditable, and negotiable FL processes among competing organizations, paving the way for practical adoption and future field studies.
Abstract
Federated Learning (FL) is an emerging technology that is increasingly applied in real-world applications. Early applications focused on cross-device scenarios, where many participants with limited resources train machine learning (ML) models together, e.g., in the case of Google's Gboard. In contrast, cross-silo scenarios involve only a few participants, each with substantial resources, e.g., in the healthcare domain. Despite such early efforts, FL is still rarely used in practice, and best practices are therefore missing. For new applications, in our case inter-organizational cross-silo applications, overcoming this lack of role models is a significant challenge. To ease the use of FL in real-world cross-silo applications, we propose a scenario-based architecture for the practical use of FL in the context of multiple companies collaborating to improve the quality of their ML models. The architecture emphasizes the collaboration between the participants and the FL server and extends basic interactions with domain-specific features. First, it combines governance with authentication, creating an environment where only trusted participants can join. Second, it offers traceability of governance decisions and tracking of training processes, both of which are crucial in a production environment. Beyond presenting the architectural design, we analyze requirements for the real-world use of FL and evaluate the architecture with a scenario-based analysis method.
