AfriMTE and AfriCOMET: Enhancing COMET to Embrace Under-resourced African Languages
University College London, UK.
University College London, UK; Masakhane NLP.
University of Maryland, USA.
University College London, UK.
2024 (English). In: Proceedings of the 2024 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, NAACL 2024 / [ed] Duh K.; Gomez H.; Bethard S., Association for Computational Linguistics (ACL), 2024, p. 5997-6023, article id 200463.
Conference paper, Published paper (Refereed)
Abstract [en]

Despite the recent progress on scaling multilingual machine translation (MT) to several under-resourced African languages, accurately measuring this progress remains challenging, since evaluation is often performed on n-gram matching metrics such as BLEU, which typically show a weaker correlation with human judgments. Learned metrics such as COMET have higher correlation; however, the lack of evaluation data with human ratings for under-resourced languages, complexity of annotation guidelines like Multidimensional Quality Metrics (MQM), and limited language coverage of multilingual encoders have hampered their applicability to African languages. In this paper, we address these challenges by creating high-quality human evaluation data with simplified MQM guidelines for error detection and direct assessment (DA) scoring for 13 typologically diverse African languages. Furthermore, we develop AFRICOMET: COMET evaluation metrics for African languages by leveraging DA data from well-resourced languages and an African-centric multilingual encoder (AfroXLM-R) to create the state-of-the-art MT evaluation metrics for African languages with respect to Spearman-rank correlation with human judgments (0.441).
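The abstract reports metric quality as a Spearman rank correlation with human judgments. As a minimal sketch of what that quantity measures (not the authors' evaluation code), the Python below ranks two score lists, averaging ranks over ties, and computes the Pearson correlation of the ranks. The function names and the sample scores are hypothetical and are not taken from the paper.

```python
from statistics import mean


def average_ranks(values):
    """Return 1-based ranks, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of items tied with the item at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the two rank lists."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length sequences with at least 2 items")
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when one input is constant")
    return cov / (var_x * var_y) ** 0.5


if __name__ == "__main__":
    # Hypothetical scores for five translations (illustrative values only).
    metric_scores = [0.62, 0.48, 0.91, 0.33, 0.75]
    human_da_scores = [70, 55, 88, 60, 80]
    print(spearman(metric_scores, human_da_scores))
```

A value near 1 means the metric orders translations the way the human raters do; a value near 0 means the two orderings are largely unrelated.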

Place, publisher, year, edition, pages
Association for Computational Linguistics (ACL), 2024. p. 5997-6023, article id 200463
National Category
Language Technology (Computational Linguistics)
Research subject
Machine Learning
Identifiers
URN: urn:nbn:se:ltu:diva-108639
DOI: 10.18653/v1/2024.naacl-long.334
Scopus ID: 2-s2.0-85199581086
OAI: oai:DiVA.org:ltu-108639
DiVA, id: diva2:1890571
Conference
2024 Annual Conference of the North American Chapter of the Association for Computational Linguistics (NAACL 2024), Mexico City, Mexico, June 16-21, 2024
Note

Funder: UTTER (101070631); Portuguese Recovery and Resilience Plan (C645008882-00000055); Landmark Development Initiative Africa; European Commission; Fundação para a Ciência e a Tecnologia

ISBN for host publication: 979-889176114-8

Fulltext license: CC BY. Materials published in or after 2016 are licensed under a Creative Commons Attribution 4.0 International License.

Available from: 2024-08-20 Created: 2024-08-20 Last updated: 2024-11-27 Bibliographically approved

Open Access in DiVA

fulltext (2724 kB), 6 downloads
File information
File name: FULLTEXT02.pdf
File size: 2724 kB
Checksum (SHA-512): 22115c3641fc4af95f993d2ed222e6457f4bf018565651d4f61e89a1d053548ead14541f02c31abfebe3dd6e8ea9a83cc22a9eb4127281e6ed68952a4352224e
Type: fulltext
Mimetype: application/pdf
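
The SHA-512 checksum in the file information can be used to confirm that a downloaded copy of the full text is intact. The Python below is a minimal sketch using the standard `hashlib` module; the helper names `sha512_of_file` and `matches_checksum` are hypothetical, and the local file path is whatever the download was saved as.

```python
import hashlib
from pathlib import Path


def sha512_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA-512 digest of a file, read in chunks to bound memory."""
    digest = hashlib.sha512()
    with Path(path).open("rb") as handle:
        # iter() with a sentinel stops when read() returns an empty bytes object.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_checksum(path, expected_hex):
    """Compare a file's digest with a published hex checksum, ignoring case and whitespace."""
    return sha512_of_file(path) == expected_hex.strip().lower()
```

Calling `matches_checksum("FULLTEXT02.pdf", checksum)` with the checksum string from this record should return `True` for an unmodified download, assuming the file was saved under that name in the current directory.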

Other links

Publisher's full text
Scopus

Authority records

Adewumi, Tosin; Mokayed, Hamam; Alkhaled, Lama; Al-Azzawi, Sana

By organisation
Embedded Internet Systems Lab; Signals and Systems
Language Technology (Computational Linguistics)

Total: 6 downloads
The number of downloads is the sum of all downloads of full texts. It may include, e.g., previous versions that are no longer available.
