ANALYZING THE INFLUENCE OF HYPERPARAMETERS ON THE EFFICIENCY OF OCR MODEL FOR PRE-REFORM HANDWRITTEN TEXTS

Sherstnev, P. A.; Kozhin, K. D.; Pyataeva, A. V.

PII

S3034584725030071-1

DOI

10.7868/S3034584725030071

Publication type

Article

Status

Published

Authors

P. A. Sherstnev

Affiliation: Artificial Intelligence Center of Siberian Federal University

K. D. Kozhin

Affiliation: Artificial Intelligence Center of Siberian Federal University

A. V. Pyataeva

Affiliation: Artificial Intelligence Center of Siberian Federal University

Volume / Issue

Issue 3

Pages

70-79

Abstract

This article examines how hyperparameters affect the efficiency of optical recognition models for handwritten text of the pre-reform period, using 19th-century handwritten reports of the governors of Yenisei Province as a case study. It presents a comparative analysis of model configurations that differ in their architectural components, including normalization modules, feature extraction blocks, and predictors. Particular attention is paid to how input image resolution and hidden layer size affect the balance between prediction accuracy and computational cost. The results identify key parameters for developing optical character recognition systems adapted to historical texts with non-standard orthography and complex structure. Directions for further research include evaluating synthetic methods for augmenting the training data and analyzing alternative architectures such as transformers.

Keywords

optical character recognition; hyperparameters; handwritten text recognition; pre-reform orthography; normalization modules; neural networks; historical documents; model architecture; accuracy; optimization

Date of publication

02.06.2025

Year of publication

2025

GOST	Kozhin K., Pyataeva A., Sherstnev P. ANALYZING THE INFLUENCE OF HYPERPARAMETERS ON THE EFFICIENCY OF OCR MODEL FOR PRE-REFORM HANDWRITTEN TEXTS // Programming and Computer Software. – 2025. – No. 3. – P. 70-79. URL: https://progrras.ru/s3034584725030071-1/?version_id=109051. DOI: 10.7868/S3034584725030071
MLA	Kozhin, K. D., Pyataeva, A. V., Sherstnev, P. A. "ANALYZING THE INFLUENCE OF HYPERPARAMETERS ON THE EFFICIENCY OF OCR MODEL FOR PRE-REFORM HANDWRITTEN TEXTS." Programming and Computer Software, no. 3 (2025): 70-79. DOI: 10.7868/S3034584725030071
APA	Kozhin K., Pyataeva A., Sherstnev P. (2025). ANALYZING THE INFLUENCE OF HYPERPARAMETERS ON THE EFFICIENCY OF OCR MODEL FOR PRE-REFORM HANDWRITTEN TEXTS. Programming and Computer Software, no. 3, pp. 70-79. DOI: 10.7868/S3034584725030071
