
ELMI: Interactive and Intelligent Sign Language Translation of Lyrics for Song Signing

Suhyeon Yoo, Khai N. Truong, Young-Ho Kim

TL;DR

ELMI addresses the challenge of translating song lyrics into sign language by providing a line-by-line glossing tool augmented with LLM-driven guidance and per-line conversational AI. Built on formative insights about semantic, syntactic, expressive, and rhythmic translation, ELMI combines synchronized lyric-video playback, mood/performance guidance, and an AI chat tailored to meaning, glossing, emoting, and timing. In a user study with 13 song-signers, ELMI supported nuanced gloss creation, improved translation confidence, and offered a sense of ownership, while revealing concerns about cultural sensitivity, accuracy, and the need for broader context support. The work highlights design implications for culturally aware, accessible sign-language translation tools and points to future enhancements such as multiline contexts, richer ASL resources, and participatory development with Deaf communities.

Abstract

d/Deaf and hearing song-signers have become prevalent across video-sharing platforms, but translating songs into sign language remains cumbersome and inaccessible. Our formative study revealed the challenges song-signers face, including semantic, syntactic, expressive, and rhythmic considerations in translations. We present ELMI, an accessible song-signing tool that assists in translating lyrics into sign language. ELMI enables users to edit glosses line-by-line, with real-time synced lyric and music video snippets. Users can also chat with a large language model-driven AI to discuss meaning, glossing, emoting, and timing. Through an exploratory study with 13 song-signers, we examined how ELMI facilitates their workflows and how song-signers leverage and receive an LLM-driven chat for translation. Participants successfully adopted ELMI for song-signing, with active discussions throughout. They also reported improved confidence and independence in their translations, finding ELMI encouraging, constructive, and informative. We discuss research and design implications for accessible and culturally sensitive song-signing translation tools.

Paper Structure (51 sections, 8 figures, 9 tables)

Figures (8)

  • Figure 1: Example Glossing for "BTS - Dynamite." Song-signers created glosses line-by-line, writing ASL glosses corresponding to ENG lyrics.
  • Figure 2: The lyric line translation pane. ELMI offers rich visual feedback to convey the song's timing; the user can check the relative music position at the verse level ⓐ as well as the line level ⓑ. While the music is playing, the corresponding lyric words are highlighted to enhance the user's sense of timing ⓒ. When the user is typing in the gloss, the system provides real-time suggestions of alternative translations in varied lengths ⓓ.
  • Figure 3: ELMI analyzes the lyrics in advance and marks noteworthy lines that may be challenging to translate ⓐ. In this case, 'LeBron', an American basketball player, may not be recognized by users unfamiliar with the US sports scene. When the user hovers over the annotation indicator ⓑ, a tooltip invites the user to discuss the line. If the user starts a chat thread by clicking the indicator, the AI opens the discussion directly ⓒ.
  • Figure 4: The lyric analysis pipeline, which runs as part of pre-processing a song when the user creates a new project. Given the reference lyrics, metadata about the song, and the user's preferences Ⓐ, the pipeline chains four LLM inference modules (Ⓑ, Ⓓ, Ⓕ, and Ⓗ) to generate notes on potential challenges when translating specific lines Ⓒ, a base gloss translation Ⓔ, performance guides for lines Ⓖ, and longer and shorter versions of each gloss line Ⓘ.
  • Figure 5: Likert-scale ratings of the 4 discussion topics (1: not useful; 5: extremely useful).
  • ...and 3 more figures
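
The chained lyric analysis described in the Figure 4 caption can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration and is not ELMI's actual implementation: the `LineAnalysis` fields, the prompt strings, the per-line loop, and the way each stage feeds the next are hypothetical, and `fake_llm` is a stand-in for a real model call.

```python
from dataclasses import dataclass
from typing import Callable, List

# Hypothetical stand-in type for an LLM call: prompt text in, response text out.
LLM = Callable[[str], str]


@dataclass
class LineAnalysis:
    """Per-line outputs of the four chained stages (field names are illustrative)."""
    lyric: str
    note: str = ""          # potential translation challenge for this line
    gloss: str = ""         # base gloss translation
    guide: str = ""         # performance guide for the line
    alternatives: str = ""  # longer and shorter versions of the gloss


def analyze_lyrics(lyrics: List[str], metadata: str, llm: LLM) -> List[LineAnalysis]:
    """Run four stages in sequence; each later stage reuses earlier outputs.

    The dependency order (note -> gloss -> guide -> alternatives) is an
    assumption based on the order the caption lists the outputs.
    """
    results = []
    for lyric in lyrics:
        a = LineAnalysis(lyric=lyric)
        a.note = llm(f"[note] song={metadata} line={lyric}")
        a.gloss = llm(f"[gloss] line={lyric} note={a.note}")
        a.guide = llm(f"[guide] line={lyric} gloss={a.gloss}")
        a.alternatives = llm(f"[alternatives] gloss={a.gloss}")
        results.append(a)
    return results


def fake_llm(prompt: str) -> str:
    """Placeholder model: echoes which stage was called, for offline testing."""
    tag = prompt.split("]")[0].lstrip("[")
    return f"<{tag} output>"


if __name__ == "__main__":
    for item in analyze_lyrics(["line one", "line two"], "demo song", fake_llm):
        print(item)
```

Passing the model as a plain callable keeps the chain testable without network access; a real deployment would swap `fake_llm` for an actual model client.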