TaskComplexity: A Dataset for Task Complexity Classification with In-Context Learning, FLAN-T5 and GPT-4o Benchmarks
College of Information Engineering, Al-Nahrain University, Baghdad, Iraq.
Software Engineering, Baghdad, Iraq.
Department of Cybersecurity Engineering, College of Information Engineering, Al-Nahrain University, Jadriya Baghdad, Iraq.
Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering, Embedded Internet Systems Lab. ORCID iD: 0000-0001-7924-4953
2025 (English). In: International Conference on Machine Learning and Data Engineering, ICMLDE 2024 / [ed] Vijendra Singh; Kuan-Ching Li; Vijayan K. V K Asari; Rubén R.G González Crespo, Elsevier B.V., 2025, p. 3713-3722. Conference paper, Published paper (Refereed)
Abstract [en]

This paper addresses the challenge of classifying programming tasks and assigning them to experts, a process that typically requires significant effort, time, and cost. To tackle this issue, a novel dataset of 4,112 programming tasks was created by extracting tasks from various websites. Web scraping techniques were used to collect the programming problems systematically: specific HTML tags were tracked to extract the key elements of each task, including the title, problem description, input/output, examples, problem class, and complexity score. Examples from the dataset are provided in the appendix to illustrate the variety and complexity of the tasks included. The dataset's effectiveness was evaluated and benchmarked using two approaches: the first fine-tuned the FLAN-T5 small model on the dataset, while the second used in-context learning (ICL) with GPT-4o-mini. Performance was assessed using standard metrics: accuracy, recall, precision, and F1-score. The results indicated that in-context learning with GPT-4o-mini outperformed the fine-tuned FLAN-T5 model.
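
As an illustration of the four metrics named in the abstract, the following minimal sketch computes accuracy and macro-averaged precision, recall, and F1-score for a multi-class labelling. It is not the authors' evaluation code: the function name `macro_metrics`, the choice of macro-averaging, and the example labels `"easy"` and `"hard"` are assumptions made here for illustration only.

```python
from collections import Counter


def macro_metrics(y_true, y_pred):
    """Return accuracy and macro-averaged precision, recall and F1.

    Macro-averaging (an unweighted mean over classes) is an assumption;
    the abstract does not state which averaging scheme was used.
    """
    if not y_true or len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must be non-empty and equal in length")

    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1      # correct prediction for this class
        else:
            fp[pred] += 1       # predicted class gets a false positive
            fn[truth] += 1      # true class gets a false negative

    precisions, recalls, f1s = [], [], []
    for label in labels:
        p_den = tp[label] + fp[label]
        r_den = tp[label] + fn[label]
        precision = tp[label] / p_den if p_den else 0.0
        recall = tp[label] / r_den if r_den else 0.0
        pr_sum = precision + recall
        f1 = 2 * precision * recall / pr_sum if pr_sum else 0.0
        precisions.append(precision)
        recalls.append(recall)
        f1s.append(f1)

    n = len(labels)
    return {
        "accuracy": sum(tp.values()) / len(y_true),
        "precision": sum(precisions) / n,
        "recall": sum(recalls) / n,
        "f1": sum(f1s) / n,
    }


if __name__ == "__main__":
    # Hypothetical complexity labels, not taken from the dataset.
    truth = ["easy", "easy", "hard", "hard"]
    predicted = ["easy", "hard", "hard", "hard"]
    print(macro_metrics(truth, predicted))
```

The same function can score the predictions of either benchmarked approach, provided both emit labels from the same label set.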

Place, publisher, year, edition, pages
Elsevier B.V., 2025. p. 3713-3722
Series
Procedia Computer Science, ISSN 1877-0509 ; 258
Keywords [en]
GPT-4o-mini, Flan-T5, task classification, in-context learning, Natural Language Processing (NLP), dataset creation
National Category
Computer Sciences
Research subject
Machine Learning
Identifiers
URN: urn:nbn:se:ltu:diva-113963
DOI: 10.1016/j.procs.2025.04.626
Scopus ID: 2-s2.0-105007160276
OAI: oai:DiVA.org:ltu-113963
DiVA, id: diva2:1980059
Conference
3rd International conference on Machine Learning and Data Engineering (ICMLDE 2024), Dehradun, India, November 28-29, 2024
Note

Full text license: CC BY-NC-ND

Available from: 2025-07-01 Created: 2025-07-01 Last updated: 2025-07-01 Bibliographically approved

Open Access in DiVA

fulltext (1537 kB), 11 downloads
File information
File name: FULLTEXT01.pdf
File size: 1537 kB
Checksum (SHA-512): 4d64bd3913796ed89ebc67cb25caf47e36ecbf10aa421575fe0841f9a0176962e4b2c82b43c8e0c32f81394d065acd29eaedce7ec160df992373d6ef364f4755
Type: fulltext
Mimetype: application/pdf

Authority records

Al-Azzawi, Sana Sabah
