A Comprehensive Study of ImageNet Pre-Training for Historical Document Image Analysis
2019 (English)
In: The 15th IAPR International Conference on Document Analysis and Recognition: ICDAR 2019, IEEE, 2019, p. 720-725
Conference paper, Published paper (Other academic)
Abstract [en]
Automatic analysis of scanned historical documents comprises a wide range of image analysis tasks, which are often challenging for machine learning due to a lack of human-annotated learning samples. With the advent of deep neural networks, a promising way to cope with the lack of training data is to pre-train models on images from a different domain and then fine-tune them on historical documents. In the current research, a typical example of such cross-domain transfer learning is the use of neural networks that have been pre-trained on the ImageNet database for object recognition. It remains a mostly open question whether or not this pre-training helps to analyse historical documents, which have fundamentally different image properties when compared with ImageNet. In this paper, we present a comprehensive empirical survey on the effect of ImageNet pre-training for diverse historical document analysis tasks, including character recognition, style classification, manuscript dating, semantic segmentation, and content-based retrieval. While we obtain mixed results for semantic segmentation at pixel-level, we observe a clear trend across different network architectures that ImageNet pre-training has a positive effect on classification as well as content-based retrieval.
Place, publisher, year, edition, pages
IEEE, 2019. p. 720-725
Series
International Conference on Document Analysis and Recognition, ISSN 1520-5363, E-ISSN 2379-2140
Keywords [en]
deep learning, historical document analysis, transfer learning, pre-training
National Category
Computer Sciences
Research subject
Machine Learning
Identifiers
URN: urn:nbn:se:ltu:diva-78684
DOI: 10.1109/ICDAR.2019.00120
Scopus ID: 2-s2.0-85079909016
OAI: oai:DiVA.org:ltu-78684
DiVA, id: diva2:1426627
Conference
The 15th IAPR International Conference on Document Analysis and Recognition (ICDAR 2019), 20-25 September, 2019, Sydney, Australia
Note
ISBN for host publication: 978-1-7281-3014-9, 978-1-7281-3015-6
2020-04-27 2020-04-27 2020-04-27 Bibliographically approved