A HIERARCHICALLY DEPENDENCY-LINKED SYNTACTIC TREEBANK FOR UZBEK LITERARY TEXTS
This paper describes the construction of a hierarchical syntactic dependency treebank built from Uzbek literary texts, with the aim of developing this corpus further. Thirty high-quality sentences were selected as corpus material from "Boljon," a short story collection by the contemporary Uzbek author Shuhrat Matkarim. Two annotators manually annotated these sentences on the INCEpTION platform, covering lemmatization, morphological tagging, and syntactic dependency annotation. The result is the first syntactic dependency treebank for Uzbek narrative prose, and the syntactic features found in the corpus are analyzed.
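To make the three annotation layers concrete (lemma, morphological features, dependency relation), the following minimal sketch shows how one annotated sentence could be stored in the CoNLL-U format used by Universal Dependencies, and how the depth of each word in the dependency tree can be read off the head links. The sentence, its tags, and the helper names `Token`, `parse_conllu`, and `depth` are assumptions made for illustration; they are not taken from the corpus or from the authors' tooling.

```python
from dataclasses import dataclass

# Illustrative sentence ("I read a book."), NOT taken from the corpus.
# Tags and features are assumptions chosen to follow UD conventions.
# Columns: ID FORM LEMMA UPOS XPOS FEATS HEAD DEPREL DEPS MISC
SAMPLE = (
    "# text = Men kitob o‘qidim.\n"
    "1\tMen\tmen\tPRON\t_\tCase=Nom|Number=Sing|Person=1\t3\tnsubj\t_\t_\n"
    "2\tkitob\tkitob\tNOUN\t_\tCase=Nom|Number=Sing\t3\tobj\t_\t_\n"
    "3\to‘qidim\to‘qi\tVERB\t_\tNumber=Sing|Person=1|Tense=Past\t0\troot\t_\t_\n"
    "4\t.\t.\tPUNCT\t_\t_\t3\tpunct\t_\t_\n"
)


@dataclass
class Token:
    id: int
    form: str
    lemma: str
    upos: str
    feats: str
    head: int      # 0 marks the root of the sentence
    deprel: str


def parse_conllu(block: str) -> list[Token]:
    """Read the word lines of one CoNLL-U sentence into Token objects."""
    tokens = []
    for line in block.splitlines():
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and sentence-level comments
        cols = line.split("\t")
        if len(cols) != 10:
            raise ValueError(f"expected 10 columns, got {len(cols)}: {line!r}")
        if not cols[0].isdigit():
            continue  # skip multiword-token ranges and empty nodes
        tokens.append(
            Token(int(cols[0]), cols[1], cols[2], cols[3],
                  cols[5], int(cols[6]), cols[7])
        )
    return tokens


def depth(tokens: list[Token], token_id: int) -> int:
    """Count the head links between a token and the root (root has depth 0)."""
    heads = {t.id: t.head for t in tokens}
    steps = 0
    while heads[token_id] != 0:
        token_id = heads[token_id]
        steps += 1
        if steps > len(tokens):
            raise ValueError("cycle in head links; not a valid tree")
    return steps


tokens = parse_conllu(SAMPLE)
for t in tokens:
    print(t.id, t.form, t.lemma, t.upos, t.deprel, "depth", depth(tokens, t.id))
```

In this toy sentence the verb is the root and the other three tokens attach to it directly. Longer literary sentences would be expected to show deeper nesting, which a per-token depth count of this kind makes measurable.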
O‘zMU xabarlari Вестник НУУз ACTA NUUz FILOLOGIYA 1/5 2025
Copyright (c) 2025 «ACTA NUUz»

This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.











