Unstable Grounds for Beautiful Trees? Testing the Robustness of Concept Translations in the Compilation of Multilingual Wordlists
Snee, David, Ciucci, Luca, Rubehn, Arne, van Dam, Kellen Parker, and List, Johann-Mattis (2025) Unstable Grounds for Beautiful Trees? Testing the Robustness of Concept Translations in the Compilation of Multilingual Wordlists. In: Proceedings of the 7th Workshop on Research in Computational Linguistic Typology and Multilingual NLP. pp. 16-28. From: SIGTYP 2025: 7th Workshop on Research in Computational Linguistic Typology and Multilingual NLP, 1 August 2025, Vienna, Austria.
PDF (Published Version). Available under License Creative Commons Attribution.
Abstract
Multilingual wordlists play a crucial role in comparative linguistics. While many studies have been carried out to test the power of computational methods for language subgrouping or divergence time estimation, few studies have put the data upon which these studies are based to a rigorous test. Here, we conduct a first experiment that tests the robustness of concept translation as an integral part of the compilation of multilingual wordlists. Investigating the variation in concept translations in independently compiled wordlists from 10 dataset pairs covering 9 different language families, we find that on average, only 83% of all translations yield the same word form, while identical forms in terms of phonetic transcriptions can only be found in 23% of all cases. Our findings can prove important when trying to assess the uncertainty of phylogenetic studies and the conclusions derived from them.
Item ID: | 86614 |
---|---|
Item Type: | Conference Item (Research - E1) |
ISBN: | 979-8-89176-281-7 |
Copyright Information: | © 2025 Association for Computational Linguistics. Materials published in or after 2016 are licensed on a Creative Commons Attribution 4.0 International License. |
Date Deposited: | 12 Aug 2025 02:23 |
FoR Codes: | 47 LANGUAGE, COMMUNICATION AND CULTURE > 4704 Linguistics > 470406 Historical, comparative and typological linguistics @ 60%; 46 INFORMATION AND COMPUTING SCIENCES > 4602 Artificial intelligence > 460208 Natural language processing @ 40% |
SEO Codes: | 28 EXPANDING KNOWLEDGE > 2801 Expanding knowledge > 280115 Expanding knowledge in the information and computing sciences @ 30%; 28 EXPANDING KNOWLEDGE > 2801 Expanding knowledge > 280116 Expanding knowledge in language, communication and culture @ 70% |