Applied Federated Learning: Improving Google Keyboard Query Suggestions

Timothy Yang, Galen Andrew, Hubert Eichner, Haicheng Sun, Wei Li, Nicholas Kong, Daniel Ramage, Françoise Beaufays

TL;DR

The paper demonstrates an end-to-end application of federated learning (FL): a triggering model is trained on-device to filter Gboard query suggestions, improving click-through rate (CTR) while preserving user privacy. It details a two-stage architecture that pairs a server-trained baseline model with an FL-trained triggering model, and it characterizes on-device training dynamics, including diurnal effects and deployment skew. Live results show CTR improvements consistent with the predictions from training, supporting FL's viability for privacy-preserving keyboard features in production. The work offers a practical blueprint for deploying FL in large-scale mobile systems and discusses debugging techniques that do not require access to raw training data.

Abstract

Federated learning is a distributed form of machine learning where both the training data and model training are decentralized. In this paper, we use federated learning in a commercial, global-scale setting to train, evaluate and deploy a model to improve virtual keyboard search suggestion quality without direct access to the underlying user data. We describe our observations in federated training, compare metrics to live deployments, and present resulting quality increases. In whole, we demonstrate how federated learning can be applied end-to-end to both improve user experiences and enhance user privacy.
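
To make the idea of decentralized training concrete, the following is a minimal sketch of one possible training loop. The abstract does not state the aggregation rule, so the sketch assumes federated averaging (client weights averaged in proportion to local example counts) over a small logistic regression model. The function names `local_update` and `federated_round`, the learning rate, and the data layout are illustrative assumptions, not the paper's implementation.

```python
import math


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def local_update(weights, examples, lr=0.1, epochs=1):
    """Run SGD on one client's private examples (hypothetical helper).

    `examples` is a list of (features, label) pairs with label in {0, 1}.
    Only the updated weights and the example count leave the client.
    """
    w = list(weights)
    for _ in range(epochs):
        for features, label in examples:
            p = sigmoid(sum(wi * xi for wi, xi in zip(w, features)))
            for i, xi in enumerate(features):
                # Gradient of the log loss for logistic regression.
                w[i] -= lr * (p - label) * xi
    return w, len(examples)


def federated_round(global_weights, client_datasets, lr=0.1):
    """One round: each client trains locally, the server averages the results.

    Assumption: averaging is weighted by each client's example count.
    Clients with no data are skipped.
    """
    results = [local_update(global_weights, data, lr)
               for data in client_datasets if data]
    total = sum(n for _, n in results)
    if total == 0:
        return list(global_weights)
    return [sum(w[i] * n for w, n in results) / total
            for i in range(len(global_weights))]
```

Calling `federated_round` repeatedly, each time with the weights returned by the previous call, plays the role of successive training rounds. In this sketch the server-side function only ever handles weight vectors and example counts, never the raw examples.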

Paper Structure

This paper contains 15 sections, 6 figures, 2 tables.

Figures (6)

  • Figure 1: Architecture overview. Inference and training are on-device; model updates are sent to the server during training rounds and trained models are deployed manually to clients.
  • Figure 2: Setup of the baseline model (traditionally server-trained) with the triggering model (trained with federated learning). The baseline model generates candidates and the triggering model decides whether to show the candidate.
  • Figure 3: Round completion over time and round completion rate over time (times in PST). Rounds progress faster at night when more devices are charging and on an unmetered network.
  • Figure 4: Eval loss and training example count over time (times in PST, hour ranges inclusive). Training example count is highest in the evening as more devices are available. In contrast, eval loss is highest during the day when few devices are available and those available represent a skewed population.
  • Figure 5: Train and eval loss of the logistic regression triggering model over rounds (bucketed to 100 rounds).
  • ...and 1 more figure
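
The two-stage setup described in the Figure 2 caption can be sketched as below: a baseline model proposes a candidate, and a triggering model decides whether to show it. The sketch assumes a logistic regression triggering model (the model type named in the Figure 5 caption). The feature names, the 0.5 threshold, and the `Candidate`, `trigger_probability`, and `suggest` names are hypothetical and are not taken from the paper.

```python
import math
from dataclasses import dataclass
from typing import Callable, Dict, Optional


@dataclass
class Candidate:
    query: str
    features: Dict[str, float]  # hypothetical feature names and values


def trigger_probability(weights: Dict[str, float], bias: float,
                        features: Dict[str, float]) -> float:
    """Logistic regression score for showing a candidate."""
    z = bias + sum(weights.get(name, 0.0) * value
                   for name, value in features.items())
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def suggest(context: str,
            baseline: Callable[[str], Optional[Candidate]],
            weights: Dict[str, float],
            bias: float,
            threshold: float = 0.5) -> Optional[str]:
    """Stage 1: the baseline generates a candidate.
    Stage 2: the triggering model decides whether to show it.
    The threshold value is an assumption for illustration.
    """
    candidate = baseline(context)
    if candidate is None:
        return None
    if trigger_probability(weights, bias, candidate.features) >= threshold:
        return candidate.query
    return None
```

Keeping the two stages separate means the triggering model only filters: it can suppress a candidate the baseline produced, but it never generates one itself.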