Words as Bridges: Exploring Computational Support for Cross-Disciplinary Translation Work
Calvin Bao, Yow-Ting Shiue, Marine Carpuat, Joel Chan
TL;DR
This work reframes cross-domain scholarly information seeking as translation work between domain-specific communities and tests an approach that preserves jargon by aligning embeddings across domains. It develops a prototype cross-domain search engine that aligns domain-specific word embeddings using unsupervised methods (MUSE and VecMap) and tests the concept in two case studies with interdisciplinary researchers. Case Study 1 shows that MUSE can yield novel yet relevant cross-domain mappings. Case Study 2 finds VecMap mappings to be more reliable for qualitative exploration, using think-aloud assessments that compare the mappings against a GPT-4 baseline. The results suggest that treating domains as separate embedding spaces and then aligning them in a shared space can reveal novel conceptual bridges. These findings inform interface designs for cross-domain information seeking and point to future work applying multilingual NLP techniques to scholarly translation tasks.
Abstract
Scholars often explore literature outside of their home community of study. This exploration process is frequently hampered by field-specific jargon. Past computational work often focuses on supporting translation work by removing jargon through simplification and summarization; here, we explore a different approach that preserves jargon as useful bridges to new conceptual spaces. Specifically, we cast different scholarly domains as different language-using communities, and explore how to adapt techniques from unsupervised cross-lingual alignment of word embeddings to find conceptual alignments between domain-specific word embedding spaces. We developed a prototype cross-domain search engine that uses aligned domain-specific embeddings to support conceptual exploration, and tested this prototype in two case studies. We discuss qualitative insights into the promises and pitfalls of this approach to translation work, and suggest design insights for future interfaces that provide computational support for cross-domain information seeking.
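To make the search idea concrete, the following is a minimal sketch of how a query term from one domain might be matched to terms in another domain once both embedding spaces have already been aligned into a shared space. It is an illustration under stated assumptions, not the prototype's implementation: the alignment step itself (done with MUSE or VecMap in this work) is not shown, cosine similarity is assumed as the ranking measure, and the function names, term names, and vectors are all hypothetical placeholders.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # treat a zero vector as similar to nothing
    return dot / (norm_u * norm_v)


def cross_domain_neighbors(query, source_emb, target_emb, k=3):
    """Rank target-domain terms by similarity to a source-domain query.

    Assumes source_emb and target_emb already live in one shared,
    aligned space (the alignment itself is outside this sketch).
    """
    q = source_emb[query]
    scored = [(term, cosine(q, vec)) for term, vec in target_emb.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Made-up 3-dimensional vectors, purely for illustration.
domain_a = {"source_term": [0.9, 0.1, 0.0]}
domain_b = {
    "target_term_1": [0.8, 0.2, 0.1],
    "target_term_2": [0.0, 1.0, 0.0],
    "target_term_3": [-0.5, 0.1, 0.9],
}

results = cross_domain_neighbors("source_term", domain_a, domain_b, k=2)
for term, score in results:
    print(f"{term}\t{score:.3f}")
```

In this sketch the jargon terms themselves are returned as results rather than being simplified away, which mirrors the framing of jargon as a bridge into the other domain's vocabulary.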
