A NEURAL MODEL OF DEEP BIAFFINE DEPENDENCY PARSING BASED ON A UNIVERSAL DEPENDENCIES TREEBANK CORPUS FOR THE UZBEK LANGUAGE
This article introduces a new Universal Dependencies (UD) treebank for the Uzbek language and a
dependency parser based on a deep biaffine neural attention mechanism. The corpus contains 686
sentences (7,800 tokens) from literary and popular-science texts, manually annotated with lemmas,
POS tags, morphological features and dependency relations, achieving inter-annotator agreement
above 95% for lemmatization and UPOS. On top of this gold-standard resource, we train and evaluate
a BiLSTM-based deep biaffine dependency parser implemented in the Stanza pipeline, obtaining
86.10% UPOS accuracy, 70.06% UFeats accuracy and, under gold morphology, 69.21% UAS and
53.21% LAS on the test set. The treebank and model define the first strong neural baseline for
dependency parsing in Uzbek and provide a mathematically grounded platform for further NLP
research on the language.
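
The UAS and LAS figures quoted above are attachment scores: the share of tokens whose predicted head is correct (UAS), and the share whose head and dependency label are both correct (LAS). The following is a minimal sketch of that computation under gold tokenization. The function name `attachment_scores`, the `(head, label)` tuple layout and the toy sentence are illustrative assumptions and are not taken from the article; the article's actual evaluation script may differ, for example in how relation subtypes are handled.

```python
from typing import List, Tuple

# Hypothetical layout: each token is (head_index, dependency_label),
# with head_index 0 marking the root of the sentence.
Token = Tuple[int, str]


def attachment_scores(
    gold: List[List[Token]], pred: List[List[Token]]
) -> Tuple[float, float]:
    """Return (UAS, LAS) as percentages over all tokens."""
    total = uas_hits = las_hits = 0
    for gold_sent, pred_sent in zip(gold, pred):
        if len(gold_sent) != len(pred_sent):
            raise ValueError("gold and predicted sentences must share tokenization")
        for (g_head, g_rel), (p_head, p_rel) in zip(gold_sent, pred_sent):
            total += 1
            if g_head == p_head:
                uas_hits += 1  # head is correct: counts towards UAS
                if g_rel == p_rel:
                    las_hits += 1  # head and label are correct: counts towards LAS
    if total == 0:
        raise ValueError("no tokens to score")
    return 100.0 * uas_hits / total, 100.0 * las_hits / total


# Toy data invented for illustration (not from the treebank):
# three tokens, all heads correct, one label wrong.
gold = [[(2, "nsubj"), (0, "root"), (2, "obj")]]
pred = [[(2, "nsubj"), (0, "root"), (2, "obl")]]
uas, las = attachment_scores(gold, pred)
print(f"UAS = {uas:.2f}%, LAS = {las:.2f}%")
```

Because a label can only count as correct when the head is also correct, LAS never exceeds UAS, which is consistent with the gap between the two scores reported above.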
Copyright (c) 2025 «ACTA NUUz»

This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.








