StAyaL | Multilingual Style Transfer

Karishma Thakrar, Katrina Lawrence, Kyle Howard

TL;DR

This work addresses cross-linguistic preservation of speaker style in text by learning language-agnostic style profiles from as few as 100 lines. It combines data augmentation with stylistically consistent sources, contrastive learning via Siamese networks, and a Random Forest classifier (RFC) to produce mean-pooled style embeddings that transfer across languages. The approach reports quantitative results (e.g., RFC accuracy ~0.75, recall ~0.75) alongside qualitative insights, identifies language biases in the learned embeddings, and outlines avenues for decoding embeddings into generated text. The framework has potential implications for personalized multilingual communication and group-level style profiling, though future work is needed to fully eliminate language-specific biases and enable robust cross-language generation from style embeddings.

Abstract

Stylistic text generation plays a vital role in enhancing communication by reflecting the nuances of individual expression. This paper presents a novel approach for generating text in a specific speaker's style across different languages. We show that by leveraging only 100 lines of text, an individual's unique style can be captured as a high-dimensional embedding, which can be used for both text generation and stylistic translation. This methodology breaks down the language barrier by transferring the style of a speaker between languages. The paper is structured into three main phases: augmenting the speaker's data with stylistically consistent external sources, separating style from content using machine learning and deep learning techniques, and generating an abstract style profile by mean pooling the learned embeddings. The proposed approach is shown to be topic-agnostic, with a test accuracy of 74.9% and an F1 score of 0.75. The results demonstrate the potential of the style profile for multilingual communication, paving the way for further applications in personalized content generation and cross-linguistic stylistic transfer.
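
The third phase, mean pooling the learned embeddings into a style profile, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `style_profile` and `cosine_similarity`, the two-dimensional toy vectors, and the use of cosine similarity to compare profiles are all assumptions made for illustration. In practice the per-line embeddings would come from the trained network rather than being written by hand.

```python
import math
from typing import List, Sequence


def style_profile(line_embeddings: Sequence[Sequence[float]]) -> List[float]:
    """Mean-pool per-line embeddings into one style vector (hypothetical helper)."""
    if not line_embeddings:
        raise ValueError("need at least one line embedding")
    dim = len(line_embeddings[0])
    if any(len(e) != dim for e in line_embeddings):
        raise ValueError("all embeddings must have the same dimension")
    n = len(line_embeddings)
    # Average each coordinate over all lines from the same speaker.
    return [sum(e[i] for e in line_embeddings) / n for i in range(dim)]


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two profiles (an assumed comparison metric)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


# Toy embeddings standing in for one speaker's lines in two languages.
english_lines = [[1.0, 3.0], [3.0, 5.0]]
french_lines = [[2.0, 4.0], [2.0, 4.0]]

english_profile = style_profile(english_lines)  # [2.0, 4.0]
french_profile = style_profile(french_lines)
similarity = cosine_similarity(english_profile, french_profile)
```

If the embeddings were truly language-agnostic, profiles built from the same speaker's lines in different languages would be expected to sit close together, which is what a high similarity value would indicate here.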

Paper Structure

This paper contains 27 sections, 2 equations, 5 figures, and 1 table.

Figures (5)

  • Figure 1: Flow chart of the sequential process of the paper.
  • Figure 2: Results of the trained single-branch Siamese Network on the contrastive learning pairs, with a test loss of 0.1684 and a recall of 0.90.
  • Figure 3: These plots show how the positive and negative pairs are clustered before and after training the Siamese Network on the contrastive pairs. In the learned embedding space, there is a separation between the positive and negative pairs.
  • Figure 4: The learned embeddings from the Siamese Network are passed through a Random Forest Classifier to further separate the positive and negative pairs. The plot on the left is a projection of the embeddings before being passed through the classifier. The plot on the right is a projection of the embeddings after they have been passed through the classifier. More distinct groupings (i.e. better separation) can be observed in the second plot.
  • Figure 5: Style profiles for each speaker in each language, shown in the learned embedding space.
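
The separation of positive and negative pairs described in Figures 2 and 3 is the typical effect of training with a contrastive objective. The sketch below uses a common margin-based contrastive loss as an example; the loss actually used in the paper is not shown in this section, so the formula, the `margin` default, and the function names are assumptions rather than the authors' method.

```python
import math
from typing import Sequence


def euclidean(u: Sequence[float], v: Sequence[float]) -> float:
    """Euclidean distance between two embeddings."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def contrastive_loss(
    u: Sequence[float],
    v: Sequence[float],
    is_positive: bool,
    margin: float = 1.0,  # assumed value, not taken from the paper
) -> float:
    """Margin-based contrastive loss for a single pair (illustrative only).

    Positive pairs (same style) are penalized by their squared distance,
    which pulls them together. Negative pairs (different style) are
    penalized only when they are closer than `margin`, which pushes
    them apart.
    """
    d = euclidean(u, v)
    if is_positive:
        return d ** 2
    return max(0.0, margin - d) ** 2


# A negative pair that is already far apart contributes no loss,
# while a negative pair inside the margin is penalized.
far_negative = contrastive_loss([0.0, 0.0], [3.0, 4.0], is_positive=False)
near_negative = contrastive_loss([0.0, 0.0], [0.5, 0.0], is_positive=False)
```

Under this kind of objective, minimizing the loss over many pairs tends to produce the clustering pattern the figure captions describe: small distances within positive pairs and distances of at least the margin within negative pairs.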