Investigating Attention Mechanism for Page Object Detection in Document Images
Department of Computer Science, Technical University of Kaiserslautern, 67663 Kaiserslautern, Germany.
Department of Computer Science, Technical University of Kaiserslautern, 67663 Kaiserslautern, Germany; Mindgarage, Technical University of Kaiserslautern, 67663 Kaiserslautern, Germany; German Research Institute for Artificial Intelligence (DFKI), 67663 Kaiserslautern, Germany.
German Research Institute for Artificial Intelligence (DFKI), 67663 Kaiserslautern, Germany.
Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering, Embedded Internet Systems Lab. ORCID iD: 0000-0003-4029-6574
2022 (English). In: Applied Sciences, E-ISSN 2076-3417, Vol. 12, no 15, article id 7486. Article in journal (Refereed). Published
Abstract [en]

Page object detection in scanned document images is a complex task due to varying document layouts and diverse page objects. In the past, traditional methods such as Optical Character Recognition (OCR)-based techniques have been employed to extract textual information. However, these methods fail to comprehend complex page objects such as tables and figures. This paper addresses the localization and classification of graphical objects that visually summarize vital information in documents. Furthermore, this work examines the benefit of incorporating attention mechanisms in different object detection networks to perform page object detection on scanned document images. The model is built with Detectron2, a PyTorch-based framework. The proposed pipelines can be optimized end-to-end and are exhaustively evaluated on publicly available datasets such as DocBank, PubLayNet, and IIIT-AR-13K. The achieved results reflect the effectiveness of incorporating the attention mechanism for page object detection in documents.

Place, publisher, year, edition, pages
MDPI, 2022. Vol. 12, no 15, article id 7486
Keywords [en]
attention mechanism, page object detection, transfer learning, document image analysis
National Category
Software Engineering
Research subject
Machine Learning
Identifiers
URN: urn:nbn:se:ltu:diva-92804
DOI: 10.3390/app12157486
ISI: 000839264500001
Scopus ID: 2-s2.0-85136993776
OAI: oai:DiVA.org:ltu-92804
DiVA, id: diva2:1693014
Funder
EU, Horizon 2020, 883293 INFINITY
Note

Validated; 2022; Level 2; 2022-09-05 (hanlid)

Available from: 2022-09-05. Created: 2022-09-05. Last updated: 2022-09-12. Bibliographically approved

Open Access in DiVA

No full text in DiVA

Authority records

Liwicki, Marcus
