ANNOTATION RULES AND MATHEMATICAL MODELS FOR NAMED ENTITY RECOGNITION
This article describes annotation rules for identifying named entities in text, the BIO tagging
scheme, mathematical models (CRF, BiLSTM-CRF, Transformer), features specific to agglutinative
languages, and practical examples drawn from real Uzbek texts. It also covers the formal
formulation of the models, the probabilistic approach behind them, inter-annotator agreement
(Cohen's kappa), and methods for improving annotation quality. The article is intended as a
methodological guide for researchers who want to build a named entity recognition (NER) system
in the field of natural language processing (NLP).
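As an illustration of the agreement measure named in the abstract, the following is a minimal sketch of Cohen's kappa computed over two annotators' token-level BIO tags. The example sentence, both tag sequences, and the function name `cohen_kappa` are assumptions made up for this illustration; they are not taken from the article's data or method.

```python
from collections import Counter


def cohen_kappa(tags_a, tags_b):
    """Cohen's kappa for two equally long label sequences."""
    if len(tags_a) != len(tags_b) or not tags_a:
        raise ValueError("sequences must be non-empty and of equal length")
    n = len(tags_a)
    # Observed agreement: fraction of tokens that received identical labels.
    p_o = sum(a == b for a, b in zip(tags_a, tags_b)) / n
    # Chance agreement, estimated from each annotator's own label distribution.
    count_a, count_b = Counter(tags_a), Counter(tags_b)
    p_e = sum(count_a[t] * count_b[t] for t in count_a) / (n * n)
    if p_e == 1.0:
        return 1.0  # both annotators used one and the same label throughout
    return (p_o - p_e) / (1 - p_e)


# Hypothetical Uzbek sentence: "Alisher Navoiy Hirotda tug'ilgan."
# "Hirotda" carries the locative suffix -da, a typical agglutinative case
# in which an annotator may overlook the entity inside the inflected form.
tokens = ["Alisher", "Navoiy", "Hirotda", "tug'ilgan", "."]
annotator_a = ["B-PER", "I-PER", "B-LOC", "O", "O"]
annotator_b = ["B-PER", "I-PER", "O", "O", "O"]

kappa = cohen_kappa(annotator_a, annotator_b)
print(round(kappa, 3))  # roughly 0.71 for this toy pair
```

The two annotators agree on four of five tokens (observed agreement 0.8), but part of that agreement is expected by chance because the `O` tag dominates, which is why kappa comes out lower than the raw agreement rate.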
1. Lample, G., Ballesteros, M., Subramanian, S., Kawakami, K., & Dyer, C. (2016). Neural Architectures for
Named Entity Recognition. NAACL-HLT. (A well-known NER model based on BiLSTM-CRF)
2. Huang, Z., Xu, W., & Yu, K. (2015). Bidirectional LSTM-CRF Models for Sequence Tagging.
arXiv:1508.01991. (A classic LSTM-CRF variant for NER and other related tasks)
3. Devlin, J., Chang, M. W., Lee, K., & Toutanova, K. (2019). BERT: Pre-training of Deep Bidirectional
Transformers for Language Understanding. NAACL-HLT. (A Transformer-based contextual model; the
foundation of modern NER)
4. Jurafsky, D., & Martin, J. H. (2023). Speech and Language Processing. Prentice Hall. (The most widely
known NLP textbook; includes a section on NER)
5. Rajabov J.Sh., Formalizing the Uzbek Language: A Comprehensive Exploration Using Backus-Naur
Forms, Acta NUUz, vol. 1(1), 2023 (01.00.00. - 8).
Copyright (c) 2025 «ACTA NUUz»

This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.








