• Title/Summary/Keyword: Urdu speakers

Search Result 2, Processing Time 0.019 seconds

Foreign Accents Classification of English and Urdu Languages, Design of Related Voice Data Base and A Proposed MLP based Speaker Verification System

  • Muhammad Ismail;Shahzad Ahmed Memon;Lachhman Das Dhomeja;Shahid Munir Shah
    • International Journal of Computer Science & Network Security
    • /
    • v.24 no.10
    • /
    • pp.43-52
    • /
    • 2024
  • A medium scale Urdu speakers' and English speakers' database with multiple accents and dialects has been developed to use in Urdu Speaker Verification Systems, English Speaker Verification Systems, accents and dialect verification systems. Urdu is the national language of Pakistan and English is the official language. Majority of the people are non-native Urdu speakers and non-native English in all regions of Pakistan in general and Gilgit-Baltistan region in particular. In order to design Urdu and English speaker verification systems for security applications in general and telephone banking in particular, two databases has been designed one for foreign accent of Urdu and another for foreign accent of English language. For the design of databases, voice data is collected from 180 speakers from GB region of Pakistan who could speak Urdu as well as English. The speakers include both genders (males and females) with different age groups ranging from 18 to 69 years. Finally, using a subset of the data, Multilayer Perceptron based speaker verification system has been designed. The designed system achieved overall accuracy rate of 83.4091% for English dataset and 80.0454% for Urdu dataset. It shows slight differences (4.0% with English and 7.4% with Urdu) in recognition accuracy if compared with the recently proposed multilayer perceptron (MLP) based SIS achieved 87.5% recognition accuracy

Building a text collection for Urdu information retrieval

  • Rasheed, Imran;Banka, Haider;Khan, Hamaid M.
    • ETRI Journal
    • /
    • v.43 no.5
    • /
    • pp.856-868
    • /
    • 2021
  • Urdu is a widely spoken language in the Indian subcontinent with over 300 million speakers worldwide. However, linguistic advancements in Urdu are rare compared to those in other European and Asian languages. Therefore, by following Text Retrieval Conference standards, we attempted to construct an extensive text collection of 85 304 documents from diverse categories covering over 52 topics with relevance judgment sets at 100 pool depth. We also present several applications to demonstrate the effectiveness of our collection. Although this collection is primarily intended for text retrieval, it can also be used for named entity recognition, text summarization, and other linguistic applications with suitable modifications. Ours is the most extensive existing collection for the Urdu language, and it will be freely available for future research and academic education.