• Title/Summary/Keyword: Tagalog

Implementation and Evaluation of an HMM-Based Speech Synthesis System for the Tagalog Language

  • Mesa, Quennie Joy; Kim, Kyung-Tae; Kim, Jong-Jin
    • MALSORI / v.68 / pp.49-63 / 2008
  • This paper describes the development and assessment of a hidden Markov model (HMM) based speech synthesis system for Tagalog, the most widely spoken indigenous language of the Philippines. Several aspects of the design process are discussed. To build the synthesizer, a speech database was recorded and phonetically segmented. The resulting speech corpus contains approximately 89 minutes of Tagalog speech organized in 596 spoken utterances. Contextual information was also determined. The quality of the synthesized speech was assessed in subjective tests with 25 native Tagalog speakers as respondents. Experimental results show that the new system obtains a mean opinion score (MOS) of 3.29, which indicates that it can produce highly intelligible, neutral Tagalog speech of stable quality even when only a small amount of speech data is used for HMM training.
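
As an illustration of how a MOS figure such as the one reported above is typically computed, here is a minimal Python sketch. It assumes the common 1-5 rating scale, which the abstract does not state, and the function name `mean_opinion_score` and the ratings are hypothetical, not data from the paper.

```python
from math import sqrt
from statistics import mean, stdev


def mean_opinion_score(ratings):
    """Return (MOS, half-width of an approximate 95% confidence interval).

    Assumes ratings on a 1-5 scale; the scale is an assumption here.
    """
    if not ratings:
        raise ValueError("ratings must not be empty")
    if any(r < 1 or r > 5 for r in ratings):
        raise ValueError("ratings must lie on the 1-5 scale")
    mos = mean(ratings)
    # Normal approximation; with a single rating no spread can be estimated.
    if len(ratings) > 1:
        half_width = 1.96 * stdev(ratings) / sqrt(len(ratings))
    else:
        half_width = 0.0
    return mos, half_width


# Hypothetical ratings from five listeners (illustrative only).
example_ratings = [3, 4, 3, 3, 4]
mos, ci = mean_opinion_score(example_ratings)
print(f"MOS = {mos:.2f} +/- {ci:.2f}")
```

Reporting the interval alongside the mean helps show how much a score from a small listener panel could vary.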

AutoCor: A Query Based Automatic Acquisition of Corpora of Closely-related Languages

  • Dimalen, Davis Muhajereen D.; Roxas, Rachel Edita O.
    • Proceedings of the Korean Society for Language and Information Conference / 2007.11a / pp.146-154 / 2007
  • AutoCor is a method for the automatic acquisition and classification of corpora of documents in closely-related languages. It extends and enhances CorpusBuilder, a system that automatically builds specific minority language corpora from a closed corpus; the extension is needed because some Tagalog documents retrieved by CorpusBuilder are actually documents in other closely-related Philippine languages. AutoCor uses the odds ratio query generation method and introduces the concept of common word pruning to differentiate between documents in Tagalog and those in closely-related Philippine languages. The performance of the system with and without pruning is compared, and common word pruning is found to improve the precision of the system.
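
The two ideas named in the abstract, odds ratio term selection and common word pruning, can be sketched roughly as follows. This is not AutoCor's implementation: the smoothing constant, the function names, and the interpretation of pruning as "drop words known to be shared with the related language before ranking" are all assumptions, and the tokens are placeholders rather than real words.

```python
import math


def doc_prob(term, docs, smoothing=0.5):
    """Smoothed fraction of documents (sets of tokens) containing the term."""
    hits = sum(1 for d in docs if term in d)
    return (hits + smoothing) / (len(docs) + 2 * smoothing)


def odds_ratio(term, relevant, non_relevant):
    """Log odds ratio of the term appearing in relevant vs. non-relevant docs."""
    p = doc_prob(term, relevant)
    q = doc_prob(term, non_relevant)
    return math.log(p * (1 - q) / ((1 - p) * q))


def select_query_terms(relevant, non_relevant, k, common_words=frozenset()):
    """Pick the k highest-scoring terms, skipping pruned common words.

    `common_words` is a hypothetical list of words shared by the target
    language and its close relatives (assumed pruning strategy).
    """
    vocab = set().union(*relevant) - set(common_words)
    ranked = sorted(
        vocab, key=lambda t: (-odds_ratio(t, relevant, non_relevant), t)
    )
    return ranked[:k]


# Placeholder tokens: "shared" occurs in both languages, "tgt_*" only in the
# target language, "oth_*" only in the related language.
relevant = [{"shared", "tgt_a", "tgt_b"}, {"shared", "tgt_a"}]
non_relevant = [{"shared", "oth_a"}, {"shared", "oth_b"}]

unpruned = select_query_terms(relevant, non_relevant, 3)
pruned = select_query_terms(relevant, non_relevant, 3, common_words={"shared"})
print(unpruned, pruned)
```

In this toy data the odds ratio already ranks the shared token last, so pruning only removes it from the candidate pool; with real corpora the effect on precision would have to be measured, as the paper does.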

A Study on the Vowel System Universals of Southeast Asian Languages: The Cases of Tagalog, Malay and Thai. (동남아시아 언어의 모음체계 보편성 연구 - 타갈로그어, 말레이어, 타이어를 대상으로 -)

  • Heo, Yong
    • Cross-Cultural Studies / v.48 / pp.391-417 / 2017
  • Southeast Asian languages are known for having a large number of vowel sounds, with an average of more than 20 vowel sounds in this language family. In addition, approximately 1,500 languages are spoken in this area, accounting for approximately 20% of the world's languages. For this reason, the vowel systems of Southeast Asian languages should be explored to determine the nature of vowel structures in human natural languages. In this study, we analyze, on the basis of descriptive and analytic universals, the vowel systems of three languages, Tagalog, Malay and Thai, which have only primary (normal) vowels and thus relatively simple structures. We also examine whether the six criteria of a tentative evaluation model, drawn from the previous literature, are appropriate for analyzing vowel system universals under the method of Greenbergian, or statistical, universals. Our findings are that (i) the three languages show a high level of conformity to the universals, with some exceptional cases such as the three-vowel system of Tagalog, and (ii) some of the six criteria, together with some cases of analytic universals, are not well suited to understanding language-specific universals that differ from those of other languages.