A Deep Learning based Arabic Script Recognition System: Benchmark on KHAT
2020 (English) In: The International Arab Journal of Information Technology, ISSN 1683-3198, Vol. 17, no 3, p. 299-305
Article in journal (Refereed) Published
Abstract [en]
This paper presents a deep learning benchmark on a complex dataset known as KFUPM Handwritten Arabic TexT (KHATT). The KHATT dataset consists of complex patterns of handwritten Arabic text-lines. The paper makes three main contributions: (1) pre-processing, (2) a deep learning based approach, and (3) data augmentation. The pre-processing step prunes extra white space and de-skews skewed text-lines. We deploy a deep learning approach based on Multi-Dimensional Long Short-Term Memory (MDLSTM) networks and Connectionist Temporal Classification (CTC). The MDLSTM has the advantage of scanning the Arabic text-lines in all directions (horizontal and vertical) to cover dots, diacritics, strokes, and fine inflections. Data augmentation combined with the deep learning approach improves the Character Recognition (CR) rate from a baseline of 75.08% to 80.02%.
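As an illustration of the first pre-processing step named in the abstract (pruning extra white space), the following is a minimal sketch and not the authors' implementation. It assumes a text-line image stored as rows of grayscale values with dark ink on a light background. The function name `prune_white_margins` and the parameter `ink_threshold` are hypothetical. De-skewing is not covered here.

```python
from typing import List

# Assumed representation: rows of grayscale pixels, 0 = black ink, 255 = white.
Image = List[List[int]]


def prune_white_margins(image: Image, ink_threshold: int = 128) -> Image:
    """Crop rows and columns without ink from the edges of a text-line image.

    Hypothetical helper: a pixel counts as ink when it is darker than
    ink_threshold. The threshold value is an assumption, not from the paper.
    """
    if not image or not image[0]:
        return []
    width = len(image[0])
    # Indices of rows and columns that contain at least one ink pixel.
    ink_rows = [r for r, row in enumerate(image)
                if any(p < ink_threshold for p in row)]
    ink_cols = [c for c in range(width)
                if any(row[c] < ink_threshold for row in image)]
    if not ink_rows:
        return []  # blank image: nothing to keep
    top, bottom = ink_rows[0], ink_rows[-1]
    left, right = ink_cols[0], ink_cols[-1]
    # Keep the tight bounding box around the ink, inclusive on both ends.
    return [row[left:right + 1] for row in image[top:bottom + 1]]


line = [
    [255, 255, 255, 255, 255],
    [255, 255,   0, 255, 255],
    [255,   0,   0,   0, 255],
    [255, 255, 255, 255, 255],
]
cropped = prune_white_margins(line)
```

Only the outer margins are removed; white space between strokes inside the bounding box is left untouched, so the relative layout of dots and diacritics is preserved.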
Place, publisher, year, edition, pages
Zarqa University, Jordan, 2020. Vol. 17, no 3, p. 299-305
Keywords [en]
Handwritten Arabic text recognition, deep learning, data augmentation
National Category
Computer Sciences
Research subject
Machine Learning
Identifiers
URN: urn:nbn:se:ltu:diva-78876
DOI: 10.34028/iajit/17/3/3
ISI: 000529820700003
Scopus ID: 2-s2.0-85086443300
OAI: oai:DiVA.org:ltu-78876
DiVA, id: diva2:1430307
Note
Validated; 2020; Level 2; 2020-05-14 (alebob)
2020-05-14; 2020-05-14; 2020-06-29; Bibliographically approved