Exact sciences

ANNOTATION RULES AND MATHEMATICAL MODELS FOR NAMED ENTITY RECOGNITION

Keywords: token, indexing, agglutinative, annotation, object.

Authors

  • Bobur Allaberdiyev, National University of Uzbekistan named after Mirzo Ulugbek, Tashkent, Uzbekistan; International Islamic Academy of Uzbekistan, Uzbekistan
  • San’atbek Matlatipov, National University of Uzbekistan named after Mirzo Ulugbek, Tashkent, Uzbekistan
  • Mujgonabonu Mavlonova, National University of Uzbekistan named after Mirzo Ulugbek, Tashkent, Uzbekistan

This article describes annotation rules for identifying named entities in text, the BIO tagging
scheme, mathematical models (CRF, BiLSTM-CRF, Transformer), features specific to agglutinative
languages, and practical examples drawn from real Uzbek texts. It also covers the formal
description of model construction, a probability-based approach, inter-annotator agreement
(Cohen’s Kappa), and methods for improving annotation quality. The article serves as a
methodological guide for researchers who want to build a named entity recognition (NER) system
in the field of natural language processing (NLP).
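
As a small illustration of two notions named in the abstract, BIO tagging and Cohen’s Kappa, the sketch below computes agreement between two annotators over one tagged sentence. The sentence, the tag set (PER, LOC), both annotators' labels, and the function name `cohen_kappa` are hypothetical and chosen only for illustration; they are not taken from the article's data or code.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators labelling the same tokens."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("label sequences must be non-empty and of equal length")
    n = len(labels_a)
    # Observed agreement: share of tokens on which both annotators agree.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement: from each annotator's own label distribution.
    counts_a = Counter(labels_a)
    counts_b = Counter(labels_b)
    expected = sum(counts_a[tag] * counts_b[tag] for tag in counts_a) / (n * n)
    if expected == 1.0:
        # Both annotators used a single identical label everywhere.
        return 1.0
    return (observed - expected) / (1.0 - expected)


# Hypothetical example sentence and BIO labels (assumed, not from the article).
tokens = ["Bobur", "Toshkent", "shahrida", "yashaydi"]
annotator_a = ["B-PER", "B-LOC", "I-LOC", "O"]
annotator_b = ["B-PER", "B-LOC", "O", "O"]

kappa = cohen_kappa(annotator_a, annotator_b)
print(f"kappa = {kappa:.3f}")
```

Here the annotators disagree only on whether the suffixed word "shahrida" belongs to the location span, which is the kind of boundary decision that annotation rules for an agglutinative language need to settle.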