1000 African Voices: Advancing inclusive multi-speaker multi-accent speech synthesis
Universite de Lorraine, CNRS, Inria, LORIA, F-54000 Nancy, France; Masakhane NLP.
Intron Health; Masakhane NLP.
Intron Health; Masakhane NLP.
Amazethu Research; Masakhane NLP.
2024 (English). In: Interspeech 2024 / [ed] Itshak Lapidot, Sharon Gannot, International Speech Communication Association, 2024, p. 1855-1859. Conference paper, Published paper (Refereed)
Abstract [en]

Recent advances in speech synthesis have enabled many useful applications like audio directions in Google Maps, screen readers, and automated content generation on platforms like TikTok. However, these systems are mostly dominated by voices sourced from data-rich geographies with personas representative of their source data. Although 3000 of the world's languages are domiciled in Africa, African voices and personas are under-represented in these systems. As speech synthesis becomes increasingly democratized, it is desirable to increase the representation of African English accents. We present Afro-TTS, the first pan-African accented English speech synthesis system able to generate speech in 86 African accents, with 1000 personas representing the rich phonological diversity across the continent for downstream application in Education, Public Health, and Automated Content Creation. Speaker interpolation retains naturalness and accentedness, enabling the creation of new voices.

Place, publisher, year, edition, pages
International Speech Communication Association, 2024. p. 1855-1859
Keywords [en]
text-to-speech, African-accented TTS, accented speech, multi-accent TTS, multi-speaker TTS
National Category
Natural Language Processing; Specific Languages; Human Computer Interaction
Research subject
Machine Learning
Identifiers
URN: urn:nbn:se:ltu:diva-110840
DOI: 10.21437/interspeech.2024-2281
ISI: 001331850101200
Scopus ID: 2-s2.0-85207856467
OAI: oai:DiVA.org:ltu-110840
DiVA, id: diva2:1916311
Conference
Interspeech 2024, 1-5 September 2024, Kos, Greece
Available from: 2024-11-27. Created: 2024-11-27. Last updated: 2025-10-21. Bibliographically approved.

Authority records

Adewumi, Tosin

Organisation
Embedded Internet Systems Lab
