Verifiability and Privacy in Federated Learning through Context-Hiding Multi-Key Homomorphic Authenticators
Simone Bottoni, Giulio Zizzo, Stefano Braghin, Alberto Trombetta
TL;DR
The paper tackles the risk of a dishonest aggregator in federated learning by proposing a verifiable FL protocol that preserves client privacy. It combines a masking-based secure aggregation scheme with a Context-Hiding Identity-based Multi-Key Linearly Homomorphic Authenticator, enabling clients to verify the aggregated result without exposing their updates. The approach scales to models with millions of parameters and reports low authenticator overhead: around $2.8$ seconds per client to generate authenticators for a 1M-parameter model, roughly $10$ ms for per-client verification, and a few milliseconds for server-side aggregation even with thousands of clients, alongside modest communication costs. The work offers a practical, open-source solution for ensuring aggregator integrity in FL, with potential extensions to handle client dropout and broader deployments.
Abstract
Federated Learning has rapidly expanded from its original inception and now has a large body of research, several frameworks, and a variety of commercial offerings. Thus, its security and robustness are of significant importance. There are many algorithms that provide robustness in the case of malicious clients. However, the aggregator itself may behave maliciously, for example, by biasing the model or tampering with the weights to weaken the model's privacy. In this work, we introduce a verifiable federated learning protocol that enables clients to verify the correctness of the aggregator's computation without compromising the confidentiality of their updates. Our protocol combines a standard secure aggregation technique, which protects individual model updates, with a linearly homomorphic authenticator scheme that enables efficient, privacy-preserving verification of the aggregated result. Our construction ensures that clients can detect manipulation by the aggregator while maintaining low computational overhead. We demonstrate that our approach scales to large models, enabling verification over large neural networks with millions of parameters.
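To make the idea in the abstract concrete, the following is a minimal sketch of how pairwise masking and a linear tag can work together. It is not the paper's construction: the tag here is a toy symmetric linear MAC with a key and pads shared among clients, standing in for the identity-based multi-key authenticator, and it offers no real security. All names (`pairwise_mask`, `toy_tag`, `run_round`), the modulus `P`, and the seed handling are assumptions made for illustration.

```python
import random

P = 2**61 - 1  # prime modulus for all arithmetic (illustrative choice)

def pairwise_mask(client_id, n_clients, dim, round_seed):
    """Sum of this client's pairwise masks; masks cancel when all clients are summed."""
    mask = [0] * dim
    for other in range(n_clients):
        if other == client_id:
            continue
        lo, hi = min(client_id, other), max(client_id, other)
        # Stand-in for a seed agreed between the two clients (hypothetical).
        rng = random.Random(f"{round_seed}-{lo}-{hi}")
        sign = 1 if client_id == lo else -1  # lower id adds, higher id subtracts
        for k in range(dim):
            mask[k] = (mask[k] + sign * rng.randrange(P)) % P
    return mask

def mask_update(update, mask):
    return [(u + m) % P for u, m in zip(update, mask)]

def toy_tag(key, pad, update):
    """Toy linear tag <key, update> + pad (mod P). NOT the paper's authenticator."""
    return (sum(k * u for k, u in zip(key, update)) + pad) % P

def aggregate(vectors):
    """Server side: coordinate-wise sum of the masked updates."""
    return [sum(col) % P for col in zip(*vectors)]

def verify(key, pads, agg_update, agg_tag):
    """Client side: recompute the tag of the claimed aggregate and compare."""
    expected = (sum(k * u for k, u in zip(key, agg_update)) + sum(pads)) % P
    return expected == agg_tag

def run_round(updates, tamper=False, seed=0):
    n, dim = len(updates), len(updates[0])
    rng = random.Random(seed)
    key = [rng.randrange(1, P) for _ in range(dim)]  # secret shared by clients (toy)
    pads = [rng.randrange(P) for _ in range(n)]      # one-time pads hiding each tag
    masked = [mask_update(u, pairwise_mask(i, n, dim, seed))
              for i, u in enumerate(updates)]
    tags = [toy_tag(key, pads[i], u) for i, u in enumerate(updates)]
    agg = aggregate(masked)        # masks cancel, leaving the sum of the updates
    agg_tag = sum(tags) % P        # tags combine linearly, like the updates
    if tamper:
        agg[0] = (agg[0] + 1) % P  # simulate a dishonest aggregator
    return agg, verify(key, pads, agg, agg_tag)
```

Calling `run_round([[1, 2, 3], [4, 5, 6], [7, 8, 9]])` yields the true sum with verification passing, while `tamper=True` causes verification to fail. The key design point this mirrors is linearity: because both the masks and the tags are linear in the updates, the server can combine them without learning any individual update, and clients can check the result against the combined tag.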
