A Comprehensive Study of ImageNet Pre-Training for Historical Document Image Analysis
Document Image and Voice Analysis Group (DIVA), University of Fribourg, Switzerland.
2019 (English). In: The 15th IAPR International Conference on Document Analysis and Recognition: ICDAR 2019, IEEE, 2019, p. 720-725. Conference paper, Published paper (Other academic)
Abstract [en]

Automatic analysis of scanned historical documents comprises a wide range of image analysis tasks, which are often challenging for machine learning due to a lack of human-annotated learning samples. With the advent of deep neural networks, a promising way to cope with the lack of training data is to pre-train models on images from a different domain and then fine-tune them on historical documents. In the current research, a typical example of such cross-domain transfer learning is the use of neural networks that have been pre-trained on the ImageNet database for object recognition. It remains a mostly open question whether or not this pre-training helps to analyse historical documents, which have fundamentally different image properties when compared with ImageNet. In this paper, we present a comprehensive empirical survey on the effect of ImageNet pre-training for diverse historical document analysis tasks, including character recognition, style classification, manuscript dating, semantic segmentation, and content-based retrieval. While we obtain mixed results for semantic segmentation at pixel-level, we observe a clear trend across different network architectures that ImageNet pre-training has a positive effect on classification as well as content-based retrieval.

Place, publisher, year, edition, pages
IEEE, 2019. p. 720-725
Series
International Conference on Document Analysis and Recognition, ISSN 1520-5363, E-ISSN 2379-2140
Keywords [en]
deep learning, historical document analysis, transfer learning, pre-training
National Category
Computer Sciences
Research subject
Machine Learning
Identifiers
URN: urn:nbn:se:ltu:diva-78684
DOI: 10.1109/ICDAR.2019.00120
Scopus ID: 2-s2.0-85079909016
OAI: oai:DiVA.org:ltu-78684
DiVA id: diva2:1426627
Conference
The 15th IAPR International Conference on Document Analysis and Recognition (ICDAR 2019), 20-25 September, 2019, Sydney, Australia
Note

ISBN for host publication: 978-1-7281-3014-9, 978-1-7281-3015-6

Available from: 2020-04-27. Created: 2020-04-27. Last updated: 2020-04-27. Bibliographically approved.


Authority records

Liwicki, Marcus
