Toward Semi-Supervised Graphical Object Detection in Document Images
2022 (English) In: Future Internet, E-ISSN 1999-5903, Vol. 14, no 6, article id 176. Article in journal (Refereed) Published
Abstract [en]
Graphical page object detection classifies and localizes objects such as tables and figures in a document. As deep learning techniques for object detection become increasingly successful, many supervised deep neural network-based methods have been introduced to recognize graphical objects in documents. However, these models require a substantial amount of labeled data for training. This paper presents an end-to-end semi-supervised framework for graphical object detection in scanned document images to address this limitation. Our method is based on the recently proposed Soft Teacher mechanism and examines the effect of small labeled-data percentages on the classification and localization of graphical objects. On both the PubLayNet and IIIT-AR-13K datasets, the proposed approach outperforms the supervised models by a significant margin at all labeling ratios (1%, 5%, and 10%). Furthermore, the Soft Teacher model trained on 10% of PubLayNet improves the average precision of Table, Figure, and List by +5.4, +1.2, and +3.2 points, respectively, with a total mAP comparable to the fully supervised Faster R-CNN baseline. Moreover, our model trained on 10% of the IIIT-AR-13K labeled data beats the previous fully supervised method by +4.5 points.
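For orientation, the sketch below illustrates the teacher-student pseudo-labeling loop that the Soft Teacher mechanism builds on, written against the torchvision-style detection API (a loss dict in training mode, scored detections in eval mode). Everything here is an illustrative assumption rather than the paper's code: the names `train_step` and `ema_update`, the 0.9 score threshold, the EMA momentum, and the unsupervised loss weight are all hypothetical, and the hard confidence threshold is a simplification of the actual Soft Teacher, which additionally weights classification losses by teacher reliability scores and validates regression pseudo-boxes via box jittering.

```python
import torch

def ema_update(teacher, student, momentum=0.999):
    # Teacher weights trail the student via an exponential moving average
    # (momentum value is an illustrative assumption).
    with torch.no_grad():
        for t_p, s_p in zip(teacher.parameters(), student.parameters()):
            t_p.mul_(momentum).add_(s_p, alpha=1.0 - momentum)

def train_step(student, teacher, labeled_batch, unlabeled_images,
               score_thresh=0.9, unsup_weight=1.0):
    images_l, targets_l = labeled_batch

    # 1. Supervised loss on the small labeled fraction (1%, 5%, or 10%).
    student.train()
    sup_losses = student(images_l, targets_l)

    # 2. The teacher predicts pseudo-boxes on unlabeled pages;
    #    low-confidence detections are discarded.
    teacher.eval()
    with torch.no_grad():
        preds = teacher(unlabeled_images)
    pseudo_targets = [
        {"boxes": p["boxes"][p["scores"] > score_thresh],
         "labels": p["labels"][p["scores"] > score_thresh]}
        for p in preds
    ]

    # 3. The student is trained on the unlabeled pages against the
    #    teacher's pseudo-labels (strong augmentation omitted here).
    unsup_losses = student(unlabeled_images, pseudo_targets)

    return sum(sup_losses.values()) + unsup_weight * sum(unsup_losses.values())
```

In a full training loop, `ema_update` would be called after each optimizer step so the teacher stays a smoothed, more stable copy of the student that produces higher-quality pseudo-labels.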
Place, publisher, year, edition, pages
MDPI, 2022. Vol. 14, no 6, article id 176
Keywords [en]
graphical page objects, object detection, document image analysis, semi-supervised, soft teacher
National Category
Software Engineering; Computer Sciences
Research subject
Machine Learning
Identifiers
URN: urn:nbn:se:ltu:diva-92096
DOI: 10.3390/fi14060176
ISI: 000816337400001
Scopus ID: 2-s2.0-85135256899
OAI: oai:DiVA.org:ltu-92096
DiVA, id: diva2:1681685
Note
Validated; 2022; Level 2; 2022-07-07 (sofila)
Funder: The European project INFINITY (grant no. 883293)
Available from: 2022-07-07 Created: 2022-07-07 Last updated: 2023-08-03 Bibliographically approved