Now It Sounds Like You: Learning Personalized Vocabulary On Device

Sid Wang, Ashish Shenoy, Pierce Chuang, John Nguyen

TL;DR

The paper addresses on-device next-word prediction under strict memory and latency constraints by tackling OOV handling in a personalized FL setting with a closed vocabulary. It introduces OOV Expansion, a per-client OOV adapter integrated into a character-aware LSTM LM, which computes embeddings for top OOV words and fuses their scores with the base vocabulary while keeping all OOV data on-device. Empirical results on two FL benchmarks show $\text{EMR}_3$ gains of up to $2.1\%$ absolute and $5.6\%$ relative on Reddit and $1.0\%$ absolute and $2.5\%$ relative on StackOverflow, along with dramatic OOV-rate reductions ($>97.7\%$ and $>99.9\%$) and favorable parameter efficiency compared to baselines. The adapter-based approach proves essential for improving OOV understanding without increasing on-device memory, offering a privacy-preserving and computation-friendly path for personalized on-device language models. Limitations include the absence of subword modeling due to resource constraints and potential cold-start challenges when user history is sparse.
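
As a rough illustration of the two ideas in this summary (selecting a client's top OOV words, then scoring them alongside the base vocabulary), here is a minimal Python sketch. It is not the paper's implementation: the function names, the dot-product scoring, and the hand-set toy embeddings are all assumptions made for illustration. In particular, the paper's adapter derives OOV embeddings from a character-aware model, which this sketch does not attempt to reproduce.

```python
from collections import Counter


def top_oov_words(history, base_vocab, k):
    """Return the k most frequent words in `history` that are not in `base_vocab`."""
    counts = Counter(w for w in history if w not in base_vocab)
    return [w for w, _ in counts.most_common(k)]


def oov_rate(tokens, vocab):
    """Fraction of tokens that are not covered by `vocab`."""
    if not tokens:
        return 0.0
    return sum(1 for w in tokens if w not in vocab) / len(tokens)


def dot(u, v):
    """Plain dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def fused_top_n(hidden, base_words, base_emb, oov_words, oov_emb, n=3):
    """Score base and OOV words against one hidden state and rank them jointly.

    Assumption: every word is scored by a dot product between the hidden
    state and its embedding, so base and OOV scores are directly comparable.
    """
    scored = [(dot(hidden, e), w) for w, e in zip(base_words, base_emb)]
    scored += [(dot(hidden, e), w) for w, e in zip(oov_words, oov_emb)]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [w for _, w in scored[:n]]


# Hypothetical client data: a tiny closed vocabulary and one user's history.
base_words = ["the", "cat", "sat"]
history = "mochi sat with the cat and mochi likes mochi".split()

# Step 1: pick the user's most frequent OOV word (stays on the device).
personal_words = top_oov_words(history, set(base_words), k=1)
expanded_vocab = set(base_words) | set(personal_words)

# Step 2: the OOV rate drops once the vocabulary is expanded.
rate_before = oov_rate(history, set(base_words))
rate_after = oov_rate(history, expanded_vocab)

# Step 3: fuse base and OOV scores for one prediction step (toy 2-d vectors).
hidden = [1.0, 0.0]
base_emb = [[0.2, 0.1], [0.5, 0.0], [0.1, 0.9]]
oov_emb = [[0.9, 0.0]]
suggestions = fused_top_n(hidden, base_words, base_emb, personal_words, oov_emb)
```

With this toy data, the personal word is ranked together with the base vocabulary, so it can appear among the top-3 suggestions without enlarging the shared vocabulary table.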

Abstract

In recent years, Federated Learning (FL) has shown significant advancements in its ability to perform various natural language processing (NLP) tasks. This work focuses on applying personalized FL for on-device language modeling. Due to limitations of memory and latency, these models cannot support the complexity of sub-word tokenization or beam search decoding, resulting in the decision to deploy a closed-vocabulary language model. However, closed-vocabulary models are unable to handle out-of-vocabulary (OOV) words belonging to specific users. To address this issue, we propose a novel technique called "OOV expansion" that improves OOV coverage and increases model accuracy while minimizing the impact on memory and latency. This method introduces a personalized "OOV adapter" that effectively transfers knowledge from a central model and learns word embeddings for personalized vocabulary. OOV expansion significantly outperforms standard FL personalization methods on a set of common FL benchmarks.

Paper Structure (13 sections, 4 equations, 5 figures, 2 tables)


Figures (5)

  • Figure 1: Quantile plot of word frequency for the top 10k words from the 2 datasets, demonstrating the long-tail phenomenon of OOV words.
  • Figure 2: The mechanism flowchart of the OOV Expansion stage; here $\|$ denotes vector concatenation.
  • Figure 3: $\text{EMR}_3$ of OOV Expansion and two baselines on 2 datasets.
  • Figure 4: $\text{EMR}_3$ of two baselines before and after personalization on each dataset.
  • Figure 5: $\text{EMR}_3$ and $\text{KEMR}_3$ of OOV Expansion on 2 datasets, with and without the adapter.

Theorems & Definitions (1)

  • Remark 2.1