RAS Mathematics: Программирование (Programming and Computer Software)

  • ISSN (Print) 0132-3474
  • ISSN (Online) 3034-5847

ANALYZING THE INFLUENCE OF HYPERPARAMETERS ON THE EFFICIENCY OF OCR MODEL FOR PRE-REFORM HANDWRITTEN TEXTS

PII
S3034584725030071-1
DOI
10.7868/S3034584725030071
Publication type
Article
Status
Published
Authors
Volume/ Edition
Volume / Issue number 3
Pages
70-79
Abstract
The article examines how hyperparameters affect the performance of optical character recognition (OCR) models for pre-reform handwritten texts, using the 19th-century handwritten reports of the governors of the Yenisei province as a case study. It compares model configurations that differ in their architectural components, including normalization modules, feature extraction blocks, and predictors. Particular attention is paid to the role of input image resolution and hidden layer size in balancing prediction accuracy against computational cost. The results identify the key parameters for developing OCR systems adapted to historical texts with non-standard orthography and complex structure. Directions for further research include evaluating methods for synthetically extending the training data and analyzing alternative architectures such as transformers.
Keywords
optical character recognition; hyperparameters; handwritten text recognition; pre-reform orthography; normalization modules; neural networks; historical documents; model architecture; accuracy; optimization
Date of publication
02.06.2025
Year of publication
2025
