A HIERARCHICAL DEPENDENCY-LINKED SYNTACTIC TREEBANK FOR UZBEK LITERARY TEXTS

Uzbek language, literary texts, syntactic connection, tree corpus, lemmatization, morphological marking, Universal Dependencies, INCEpTION platform, annotation

Authors

This paper describes the construction of a hierarchical syntactic treebank derived from Uzbek literary works, with the aim of further developing this corpus. Thirty high-quality sentences were extracted as corpus material from "Boljon," a short story collection by the modern Uzbek author Shuhrat Matkarim. Two annotators manually annotated these sentences on the INCEpTION platform, covering lemmatization, morphological tagging, and syntactic dependency annotation. As a result, the first syntactic dependency treebank for Uzbek narrative style was created, and the syntactic features within the corpus were analyzed.
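To make the three annotation layers concrete, the following minimal sketch shows how one annotated sentence might look in the CoNLL-U format used by Universal Dependencies, and how its dependency structure can be checked for well-formedness. It assumes the annotations are exported from INCEpTION as CoNLL-U; the sentence, its lemmas, features, and relations are a hypothetical illustration and are not taken from the corpus, and the helper names `parse_conllu` and `is_valid_tree` are invented for this example.

```python
# Hypothetical example sentence (NOT from the corpus): "Men kitob o'qidim."
# Columns: ID, FORM, LEMMA, UPOS, XPOS, FEATS, HEAD, DEPREL, DEPS, MISC
CONLLU = """\
# text = Men kitob o'qidim.
1\tMen\tmen\tPRON\t_\tNumber=Sing|Person=1|PronType=Prs\t3\tnsubj\t_\t_
2\tkitob\tkitob\tNOUN\t_\tNumber=Sing\t3\tobj\t_\t_
3\to'qidim\to'qi\tVERB\t_\tNumber=Sing|Person=1|Tense=Past\t0\troot\t_\tSpaceAfter=No
4\t.\t.\tPUNCT\t_\t_\t3\tpunct\t_\t_
"""


def parse_conllu(text):
    """Parse a single CoNLL-U sentence into a list of token dictionaries."""
    tokens = []
    for line in text.splitlines():
        if not line or line.startswith("#"):
            continue  # skip blank lines and sentence-level comments
        cols = line.split("\t")
        if len(cols) != 10:
            raise ValueError(f"expected 10 columns, got {len(cols)}: {line!r}")
        if not cols[0].isdigit():
            continue  # skip multiword-token ranges and empty nodes
        tokens.append({
            "id": int(cols[0]),
            "form": cols[1],
            "lemma": cols[2],    # lemmatization layer
            "upos": cols[3],     # morphological layer: part of speech
            "feats": cols[5],    # morphological layer: features
            "head": int(cols[6]),  # syntactic layer: governor (0 = root)
            "deprel": cols[7],   # syntactic layer: relation label
        })
    return tokens


def is_valid_tree(tokens):
    """Return True if there is exactly one root and every token reaches it."""
    heads = {t["id"]: t["head"] for t in tokens}
    if sum(1 for h in heads.values() if h == 0) != 1:
        return False
    for start in heads:
        seen = set()
        node = start
        while node != 0:
            if node in seen or node not in heads:
                return False  # cycle, or head pointing outside the sentence
            seen.add(node)
            node = heads[node]
    return True


tokens = parse_conllu(CONLLU)
for t in tokens:
    print(t["id"], t["form"], t["lemma"], t["upos"], t["head"], t["deprel"])
print("well-formed tree:", is_valid_tree(tokens))
```

A check of this kind is one simple way to catch structural errors, such as a missing root or a cyclic attachment, before analyzing the syntactic features of the annotated sentences.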