Social Sciences and Humanities

ANALYSIS OF THE EFFECTIVENESS OF RECOGNIZING UZBEK DIALECTS IN AUTOMATIC SPEECH RECOGNITION (ASR) SYSTEMS

Automatic speech recognition, ASR, Speech-to-Text, Uzbek language dialects, Word Error Rate (WER), Whisper Large, Kotib STT, dialectal corpus, fine-tuning

Authors

  • Xurmatoy MULLABOYEVA, Master's student, Tashkent State University of Uzbek Language and Literature, Tashkent, Uzbekistan

This article compares how Uzbek dialectal diversity affects the performance of automatic speech recognition (ASR) systems. The study is motivated by the drop in model performance when processing live dialectal speech. Using a statistical analysis of audio samples from the Karluk, Kipchak, and Oghuz dialect groups, it evaluates the Kotib STT, OmoN STT, Rubai STT, and Whisper Large models. The results indicate that recognition accuracy reaches 90–95% for the standard literary language but drops to 25–30% for highly variable dialects, primarily because of lexical errors caused by affixal reduction and vowel alternation. To mitigate this, the study argues for expanding multi-dialectal acoustic-linguistic corpora and fine-tuning models on regional linguistic characteristics.
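
The keywords name Word Error Rate (WER) as the evaluation metric, but the abstract does not say how the accuracy figures were derived from it. As a minimal sketch, assuming the standard word-level definition (substitutions, deletions, and insertions divided by the number of reference words) and assuming accuracy is read as 1 − WER, the metric could be computed as follows. The function name and the sample sentences are hypothetical illustrations, not material from the study's corpus.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words processed
    # so far (excluding the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,                      # deletion
                curr[j - 1] + 1,                  # insertion
                prev[j - 1] + substitution_cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


# Invented example: the final word loses part of its suffix, which counts
# as one substitution out of four reference words.
reference = "biz ertaga bozorga boramiz"
hypothesis = "biz ertaga bozorga borami"
wer = word_error_rate(reference, hypothesis)
print(f"WER = {wer:.2f}, accuracy = {1 - wer:.2f}")  # WER = 0.25, accuracy = 0.75
```

Because WER counts whole words, a single reduced affix or altered vowel makes the entire word an error, which is one plausible reason why dialectal speech would score much lower than the literary standard under this metric.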