Robust Scene Text Detection for Partially Annotated Training Data
Keserwani, Prateek, Indian Institute of Technology, Roorkee, India. ORCID iD: 0000-0001-7611-6462
Saini, Rajkumar, Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering, Embedded Internet Systems Lab. ORCID iD: 0000-0001-8532-0895
Liwicki, Marcus, Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering, Embedded Internet Systems Lab. ORCID iD: 0000-0003-4029-6574
Roy, Partha Pratim, Indian Institute of Technology, Roorkee, India. ORCID iD: 0000-0002-5735-5254
2022 (English). In: IEEE Transactions on Circuits and Systems for Video Technology (Print), ISSN 1051-8215, E-ISSN 1558-2205, Vol. 32, no. 12, p. 8635-8645. Article in journal (Refereed). Published.
Abstract [en]

This article analyzes the impact of training data containing un-annotated text instances, i.e., partial annotation, on scene text detection and proposes a text region refinement approach to address it. Scene text detection has attracted the attention of the research community for decades, and recent deep learning approaches achieve impressive results in the fully supervised setting. These approaches, however, require large, completely labeled datasets, and creating such datasets is a challenging and time-consuming task. The research literature lacks an analysis of partially annotated training data for scene text detection. We find that the performance of a generic scene text detection method drops significantly when the training data is only partially annotated, and we propose a text region refinement method that is robust to such data. The proposed method works as a two-tier scheme: text-probable regions are obtained in the first tier by applying a hybrid loss, which generates pseudo-labels that refine the text regions in the second tier during training. Extensive experiments have been conducted on a dataset generated from ICDAR 2015 by dropping annotations at various drop rates, as well as on the publicly available SVT dataset. The proposed method shows a significant improvement over the baseline and existing approaches when trained on partially annotated data.
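
The record describes the two-tier scheme only at a high level, so the snippet below is a minimal PyTorch sketch of the general pseudo-labeling idea it outlines, not the authors' implementation. The function names (pseudo_label_targets, hybrid_loss), the confidence threshold tau_pos, and the weighting factor alpha are assumptions introduced purely for illustration.

```python
# Illustrative sketch only: the paper's exact architecture and hybrid loss are not
# given in this record; the names, threshold, and weighting below are assumptions.
import torch
import torch.nn.functional as F


def pseudo_label_targets(score_map, partial_mask, annotated, tau_pos=0.8):
    """Combine partial ground-truth text masks with first-tier pseudo-labels.

    score_map    -- first-tier text probability map, shape (B, 1, H, W), values in [0, 1]
    partial_mask -- partially annotated ground truth (1 = labeled text pixel)
    annotated    -- mask of pixels whose label is trusted (1 = keep ground truth)
    tau_pos      -- confidence threshold for promoting predictions to pseudo-labels
    """
    pseudo_pos = (score_map.detach() > tau_pos).float()
    # Keep the human annotation where it exists; elsewhere fall back to pseudo-labels.
    return torch.where(annotated.bool(), partial_mask, pseudo_pos)


def hybrid_loss(score_map, partial_mask, annotated, alpha=0.5):
    """Full-weight BCE on annotated pixels plus a down-weighted term on pseudo-labeled ones."""
    refined = pseudo_label_targets(score_map, partial_mask, annotated)
    bce = F.binary_cross_entropy(score_map, refined, reduction="none")
    weights = torch.where(annotated.bool(),
                          torch.ones_like(bce),
                          alpha * torch.ones_like(bce))
    return (weights * bce).mean()


if __name__ == "__main__":
    # Toy usage: a random "score map" stands in for a detector's first-tier output.
    torch.manual_seed(0)
    scores = torch.sigmoid(torch.randn(1, 1, 64, 64))   # first-tier text probabilities
    gt = (torch.rand(1, 1, 64, 64) > 0.9).float()       # partial text annotation
    known = (torch.rand(1, 1, 64, 64) > 0.3).float()    # ~70% of pixels trusted
    print("hybrid loss:", hybrid_loss(scores, gt, known).item())
```

In this sketch the first-tier score map supplies training targets only where human annotation is missing, which is one plausible reading of how pseudo-labels could compensate for dropped annotations during second-tier refinement.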

Place, publisher, year, edition, pages
Institute of Electrical and Electronics Engineers (IEEE), 2022. Vol. 32, no 12, p. 8635-8645
Keywords [en]
Partial annotation, Scene text detection, Text region refinement, Pseudo-labeling
National Category
Software Engineering; Computer Sciences
Research subject
Machine Learning
Identifiers
URN: urn:nbn:se:ltu:diva-92556
DOI: 10.1109/tcsvt.2022.3194835
ISI: 000936985600043
Scopus ID: 2-s2.0-85135736292
OAI: oai:DiVA.org:ltu-92556
DiVA, id: diva2:1688291
Note

Validated; 2023; Level 2; 2023-04-18 (joosat)

Available from: 2022-08-18. Created: 2022-08-18. Last updated: 2023-09-05. Bibliographically approved.

Open Access in DiVA

No full text in DiVA

Other links

Publisher's full text
Scopus

Authority records

Saini, Rajkumar; Liwicki, Marcus

Search in DiVA

By author/editor
Keserwani, Prateek; Saini, Rajkumar; Liwicki, Marcus; Roy, Partha Pratim
By organisation
Embedded Internet Systems Lab
In the same journal
IEEE Transactions on Circuits and Systems for Video Technology (Print)
Software Engineering; Computer Sciences
