LOWRECORP: the Low-Resource NLG Corpus Building Challenge
Allen Institute of AI.
Edinburgh Napier University, UK.
Edinburgh Napier University, UK.
The Alan Turing Institute, UK.
2023 (English). In: Proceedings of the 16th International Natural Language Generation Conference: Generation Challenges / [ed] Simon Mille, Association for Computational Linguistics, 2023, p. 1-9. Conference paper, Published paper (Refereed)
Abstract [en]

Most languages in the world do not have sufficient data available to develop neural-network-based natural language generation (NLG) systems. To alleviate this resource scarcity, we propose a novel challenge for the NLG community: low-resource language corpus development (LOWRECORP). We present an innovative framework to collect a single dataset with dual tasks to maximize the efficiency of data collection efforts and respect language consultant time. Specifically, we focus on a text-chat-based interface for two generation tasks: conversational response generation grounded in a source document and/or image, and summarization of the dialogues produced by the former task. The goal of this shared task is to collectively develop grounded datasets for local and low-resourced languages. To enable data collection, we make available web-based software that can be used to collect these grounded conversations and summaries. Submissions will be assessed on the size, complexity, and diversity of the corpora, to ensure quality control of the datasets, as well as on any enhancements to the interface or novel approaches to grounding conversations.

Place, publisher, year, edition, pages
Association for Computational Linguistics, 2023. p. 1-9
National Category
Natural Language Processing
Research subject
Machine Learning
Identifiers
URN: urn:nbn:se:ltu:diva-110853
DOI: 10.18653/v1/2023.inlg-genchal.1
Scopus ID: 2-s2.0-105016340598
OAI: oai:DiVA.org:ltu-110853
DiVA, id: diva2:1916422
Conference
16th International Natural Language Generation Conference: Generation Challenges, Prague, Czechia, September 11–15, 2023
Note

Funder: EPSRC (EP/T024917/1);

ISBN for host publication:  979-8-89176-003-5;

Available from: 2024-11-27 Created: 2024-11-27 Last updated: 2025-10-21. Bibliographically approved

Open Access in DiVA

No full text in DiVA

Authority records

Adewumi, Oluwatosin
