CFOR: Character-First Open-Set Text Recognition via Context-Free Learning
2024 (English). In: IEEE Transactions on Image Processing, ISSN 1057-7149, E-ISSN 1941-0042, Vol. 33, p. 6497-6507. Article in journal (Refereed). Published.
Abstract [en]
The open-set text recognition task is a generalized form of the (closed-set) text recognition task, in which the model is further challenged to spot and incrementally recognize novel characters not covered by the training data. The presence of novel characters also implies that the language model learned from the training set is biased relative to real-world text. In this work, we alleviate the confounding effect of such biases by learning from individual character representations isolated from their context. Specifically, we propose a Character-First Open-Set Text Recognition framework that co-trains the feature extractor with two context-free learning tasks. First, a Context Isolation Learning task wipes the context of each character from the input image, using a character mask learned with weak supervision. Second, the framework adopts an Individual Character Learning task, which is a single-character classification task on synthetic samples. After training on English and simplified Chinese data, our framework can adapt to recognize unseen characters in Japanese, Korean, Greek, and other scripts without retraining, and can reliably spot unseen characters in Japanese with an F1-score over 64%. The framework also achieves 91.5% line accuracy on IIIT5k and a speed of over 69 FPS (single-batched), making it a feasible universal lightweight OCR solution that works well for both open-set and closed-set use cases.
Place, publisher, year, edition, pages
IEEE, 2024. Vol. 33, p. 6497-6507
Keywords [en]
Zero-shot learning, anomaly detection, text recognition
National Category
Natural Language Processing; Computer Sciences
Research subject
Machine Learning
Identifiers
URN: urn:nbn:se:ltu:diva-110794; DOI: 10.1109/tip.2024.3480711; ISI: 001358281600003; PubMedID: 39527432; Scopus ID: 2-s2.0-85209902434; OAI: oai:DiVA.org:ltu-110794; DiVA, id: diva2:1915650
Note
Validated; 2024; Level 2; 2024-11-25 (sarsun)
Funder: National Science Fund for Distinguished Young Scholars (62125601); National Natural Science Foundation of China (62006018); National Natural Science Foundation of China (62076024)
2024-11-25; 2024-11-25; 2025-10-21. Bibliographically approved