CFOR: Character-First Open-Set Text Recognition via Context-Free Learning
Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering, Embedded Internet Systems Lab; School of Computer and Communication Engineering, University of Science and Technology Beijing, Beijing, China. ORCID iD: 0000-0002-7353-0251
School of Computer and Communication Engineering, University of Science and Technology Beijing, Beijing, China. ORCID iD: 0000-0002-6297-4500
School of Computer and Communication Engineering, University of Science and Technology Beijing, Beijing, China. ORCID iD: 0000-0001-5986-029X
School of Computer and Communication Engineering, University of Science and Technology Beijing, Beijing, China.
2024 (English). In: IEEE Transactions on Image Processing, ISSN 1057-7149, E-ISSN 1941-0042, Vol. 33, p. 6497-6507. Article in journal (Refereed). Published
Abstract [en]

The open-set text recognition task is a generalized form of the (close-set) text recognition task, where the model is further challenged to spot and incrementally recognize novel characters not covered by the training data. Novel characters also indicate that the language model of the training set is biased from the “real-world”. In this work, we alleviate the confounding effect of such biases by learning from individual character representations isolated from their context. Specifically, we propose a Character-First Open-Set Text Recognition framework that cotrains the feature extractor with two context-free learning tasks. First, a Context Isolation Learning task is proposed to wipe the context for each character from the input image, utilizing a character mask learned in a weak supervision manner. Second, the framework adopts an Individual Character Learning task, which is a single-character classification task with synthetic samples. After training on English and simplified Chinese data, our framework can adapt to recognize unseen characters in Japanese, Korean, Greek, and other scripts without retraining, and can reliably spot unseen characters in Japanese with an F1-score over 64%. The framework also shows 91.5% line accuracy on IIIT5k and a speed of over 69 FPS single-batched, making it a feasible universal lightweight OCR solution that works well for both open-set and close-set use cases.
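The context-wiping idea behind the Context Isolation Learning task can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `isolate_character`, the list-of-rows image format, the hand-written mask, and the zero background are all assumptions made for illustration. In the framework itself the character mask is learned under weak supervision rather than supplied by hand.

```python
from typing import List

Image = List[List[float]]  # grayscale image as rows of pixel intensities


def isolate_character(image: Image, mask: Image, background: float = 0.0) -> Image:
    """Blend the image with a background value using a soft mask in [0, 1].

    Pixels where the mask is 1 are kept; pixels where it is 0 are replaced
    by `background`, which wipes the neighbouring characters (the context).
    Hypothetical helper for illustration only.
    """
    if len(image) != len(mask) or any(len(r) != len(m) for r, m in zip(image, mask)):
        raise ValueError("image and mask must have the same shape")
    return [
        [m * p + (1.0 - m) * background for p, m in zip(img_row, mask_row)]
        for img_row, mask_row in zip(image, mask)
    ]


# Toy 2x6 "text line" holding three characters, each two pixels wide.
line = [[0.9, 0.8, 0.5, 0.4, 0.7, 0.6],
        [0.9, 0.8, 0.5, 0.4, 0.7, 0.6]]
# Assumed hard mask selecting the middle character only.
middle_mask = [[0.0, 0.0, 1.0, 1.0, 0.0, 0.0],
               [0.0, 0.0, 1.0, 1.0, 0.0, 0.0]]

isolated = isolate_character(line, middle_mask)
```

Here `isolated` keeps only the middle character's pixels, so a classifier fed this image would see that character without its neighbours.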

Place, publisher, year, edition, pages
IEEE, 2024. Vol. 33, p. 6497-6507
Keywords [en]
Zero-shot learning, anomaly detection, text recognition
National Category
Natural Language Processing; Computer Sciences
Research subject
Machine Learning
Identifiers
URN: urn:nbn:se:ltu:diva-110794
DOI: 10.1109/tip.2024.3480711
ISI: 001358281600003
PubMedID: 39527432
Scopus ID: 2-s2.0-85209902434
OAI: oai:DiVA.org:ltu-110794
DiVA, id: diva2:1915650
Note

Validated; 2024; Level 2; 2024-11-25 (sarsun);

Funder: National Science Fund for Distinguished Young Scholars (62125601); National Natural Science Foundation of China (62006018); National Natural Science Foundation of China (62076024)

Available from: 2024-11-25. Created: 2024-11-25. Last updated: 2025-10-21. Bibliographically approved

By author/editor
Liu, Chang; Yang, Chun; Fang, Zhiyu; Yin, Xu-Cheng