• Title/Summary/Keyword: Allophone

Isolated Word Recognition Using Allophone Unit Hidden Markov Model (변이음 HMM을 이용한 고립단어 인식)

  • Lee, Gang-Sung;Kim, Soon-Hyob
    • The Journal of the Acoustical Society of Korea / v.10 no.2 / pp.29-35 / 1991
  • In this paper, we discuss a method of recognizing isolated words in allophone units using hidden Markov models (HMMs). First, we construct an allophone lexicon by extracting allophones from training data and training allophone HMMs. To recognize isolated words with allophone HMMs, it is then necessary to construct a word dictionary that contains each word's allophone sequence and the inter-allophone transition probabilities. Allophone sequences are represented by allophone HMMs. To see the effect of the inter-allophone transition probability and to determine optimal values, we performed several experiments. We show that only a small amount of training data and a simple training procedure are needed to train word HMMs built from allophone sequences, and that performance is no worse than that of word-unit HMMs.
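The word dictionary described in this abstract can be illustrated with a minimal sketch. It assumes a simplified representation that is not taken from the paper: each allophone HMM is a left-to-right chain given only by its per-state self-loop probabilities, and the last state of every allophone leaves with a single shared inter-allophone transition probability. The names `build_word_hmm`, `allophone_hmms`, `lexicon`, and `inter_prob` are hypothetical.

```python
def build_word_hmm(allophone_hmms, sequence, inter_prob):
    """Concatenate left-to-right allophone HMMs into one word-level transition matrix."""
    self_loops = []
    for name in sequence:
        loops = list(allophone_hmms[name])
        # Assumption: the last state of each allophone exits with probability inter_prob.
        loops[-1] = 1.0 - inter_prob
        self_loops.extend(loops)

    n = len(self_loops)
    # One extra row and column for a non-emitting, absorbing exit state.
    trans = [[0.0] * (n + 1) for _ in range(n + 1)]
    for i, stay in enumerate(self_loops):
        trans[i][i] = stay              # remain in the same state
        trans[i][i + 1] = 1.0 - stay    # advance to the next state
    trans[n][n] = 1.0
    return trans


# Hypothetical allophone models: self-loop probability of each state.
allophone_hmms = {"k_init": [0.6, 0.5], "a": [0.7, 0.7, 0.6], "k_final": [0.5, 0.4]}
# Hypothetical word dictionary: word -> allophone sequence.
lexicon = {"kak": ["k_init", "a", "k_final"]}

word_hmm = build_word_hmm(allophone_hmms, lexicon["kak"], inter_prob=0.3)
```

Varying `inter_prob` and re-scoring test words would be one way to study its effect, in the spirit of the experiments the abstract mentions.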


A Study on Korean Allophone Recognition Using Hierarchical Time-Delay Neural Network (계층구조 시간지연 신경망을 이용한 한국어 변이음 인식에 관한 연구)

  • 김수일;임해창
    • Journal of the Korean Institute of Telematics and Electronics B / v.32B no.1 / pp.171-179 / 1995
  • In many continuous speech recognition systems, the phoneme is used as the basic recognition unit. However, coarticulation among neighboring phonemes makes it difficult to recognize phonemes consistently. This paper proposes the allophone as an alternative recognition unit. We classify each phoneme into three allophone groups according to the location of the phoneme within a syllable. As the recognition algorithm, we designed a time-delay neural network (TDNN). To recognize all Korean allophones, TDNNs are constructed in a modular fashion according to acoustic-phonetic features (e.g. voiced/unvoiced, the location of the phoneme within a word). Each TDNN is trained independently, and the TDNNs are then integrated hierarchically into a whole speech recognition system. In this study, we ran experiments on Korean plosives with a phoneme-based recognition system and an allophone-based recognition system. Experimental results show that allophone-based recognition is much less affected by coarticulation.
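As a rough illustration of labeling a phoneme by its location within a syllable, the sketch below tags each jamo of a Hangul syllable as initial, medial, or final, using Unicode canonical decomposition (NFD). The three labels are an assumption made for illustration and are not necessarily the three allophone groups used in the paper; `tag_positions` and `POSITION_RANGES` are hypothetical names.

```python
import unicodedata

# Unicode ranges of the conjoining jamo that NFD produces for modern Hangul syllables.
POSITION_RANGES = {
    "initial": (0x1100, 0x1112),  # leading consonants
    "medial": (0x1161, 0x1175),   # vowels
    "final": (0x11A8, 0x11C2),    # trailing consonants
}


def tag_positions(text):
    """Return (jamo, position) pairs for every Hangul jamo in text."""
    tagged = []
    for jamo in unicodedata.normalize("NFD", text):
        for label, (low, high) in POSITION_RANGES.items():
            if low <= ord(jamo) <= high:
                tagged.append((jamo, label))
                break
    return tagged


# The same consonant letter appears syllable-initially and syllable-finally in this
# syllable, so it receives two different position labels.
tags = tag_positions("각")
```

A position-dependent recognizer could then treat the two occurrences of the consonant as different recognition units.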


Acoustic Model Improvement and Performance Evaluation of the Variable Vocabulary Speech Recognition System (가변 어휘 음성 인식기의 음향모델 개선 및 성능분석)

  • 이승훈;김회린
    • The Journal of the Acoustical Society of Korea / v.18 no.8 / pp.3-8 / 1999
  • Previous variable-vocabulary speech recognition systems with context-independent acoustic modeling could not represent the effect of neighboring phonemes. To solve this problem, we use an allophone-based context-dependent acoustic model. This paper describes a method for effectively improving the acoustic model of the system. The acoustic model is improved with an allophone clustering technique that uses entropy as a similarity measure, and the optimal allophone model is found by varying the number of allophones. We evaluate the performance of the improved system on the Phonetically Optimized Words (POW) DB and the PC commands (PC) DB. As a result, the allophone model composed of six hundred allophones improved the recognition rate by 13% over the original context-independent model on the POW test DB.
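One way to picture clustering with entropy as the similarity measure is the greedy sketch below. It rests on assumptions not stated in the abstract: each allophone model is summarized by a vector of discrete counts, the cost of a merge is the increase in count-weighted entropy, and merging stops at a chosen number of clusters. All names and the toy counts are hypothetical.

```python
import math
from itertools import combinations


def entropy(counts):
    """Shannon entropy (bits) of the distribution given by a count vector."""
    total = sum(counts)
    return -sum(c / total * math.log2(c / total) for c in counts if c > 0)


def merge_cost(a, b):
    """Increase in count-weighted entropy caused by pooling clusters a and b."""
    merged = [x + y for x, y in zip(a, b)]
    return sum(merged) * entropy(merged) - sum(a) * entropy(a) - sum(b) * entropy(b)


def cluster(models, target):
    """Greedily merge the cheapest pair until only `target` clusters remain."""
    clusters = {name: list(counts) for name, counts in models.items()}
    while len(clusters) > target:
        x, y = min(
            combinations(sorted(clusters), 2),
            key=lambda pair: merge_cost(clusters[pair[0]], clusters[pair[1]]),
        )
        merged = [p + q for p, q in zip(clusters.pop(x), clusters.pop(y))]
        clusters[f"{x}+{y}"] = merged
    return clusters


# Toy count vectors for four context-dependent variants of one vowel.
models = {
    "a_after_k": [8, 1, 1],
    "a_after_g": [7, 2, 1],
    "a_before_n": [1, 1, 8],
    "a_before_m": [1, 2, 7],
}
result = cluster(models, target=2)
```

Running `cluster` with different `target` values and measuring recognition accuracy for each corresponds loosely to the search over the number of allophones that the abstract describes.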


Information Theoretic Approach to Middle Korean [ß] (정보이론 기반 중세국어 'ㅸ'의 음운론적 대립에 대한 연구)

  • Park, Sunwoo
    • Korean Linguistics / v.79 / pp.63-89 / 2018
  • This study explores the contrastive relations among the voiced bilabial fricative [β], the voiceless bilabial stop [p], and the glide [w] in the Middle Korean consonant system, based on a probabilistic model. Previous research on the voiced bilabial fricative [β] has proposed two influential arguments. One is that [β] was an independent phoneme; the other is that it was not an independent phoneme but an allophone of the voiceless bilabial stop [p] in Middle Korean. This study applies the Probabilistic Phonological Relationship Model (PPRM) to resolve the dichotomy between contrastive and allophonic relations. The analysis of contrastive entropy by PPRM suggests that the voiced bilabial fricative [β] was just an allophone of the voiceless bilabial stop [p] or the glide [w] in Middle Korean. Comparing the entropies between [p] and other consonants with the entropies between [β] and other consonants, a continuum defined in terms of entropy reveals that [β] in Middle Korean was more allophonic than phonemic.
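The entropy continuum mentioned in this abstract can be sketched as follows. The sketch assumes that the entropy for a pair of sounds is a frequency-weighted average of per-environment binary entropies, so that 0 means the choice between the two sounds is fully predictable from the environment (allophonic) and 1 means it is fully unpredictable (contrastive). The function names and the counts are hypothetical and are not data from the study.

```python
import math


def binary_entropy(p):
    """Entropy (bits) of a two-way choice where one option has probability p."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -(p * math.log2(p) + (1 - p) * math.log2(1 - p))


def systemic_entropy(env_counts):
    """Weighted average entropy over environments.

    env_counts maps an environment label to (count of sound X, count of sound Y).
    """
    total = sum(x + y for x, y in env_counts.values())
    return sum(
        (x + y) / total * binary_entropy(x / (x + y))
        for x, y in env_counts.values()
        if x + y > 0
    )


# Complementary distribution: each environment allows only one of the two sounds.
allophonic = systemic_entropy({"V_V": (40, 0), "#_": (0, 60)})
# Full overlap: both sounds are equally likely in every environment.
contrastive = systemic_entropy({"V_V": (20, 20), "#_": (30, 30)})
```

Pairs of sounds whose value falls near 0 sit at the allophonic end of the continuum, and pairs near 1 sit at the phonemic end.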

The phonetics and phonology of flapping in Yonbyon dialects (연변어 탄설음화 현상의 음성, 음운론적 분석)

  • Kang Hyunsook
    • MALSORI / no.37 / pp.1-12 / 1999
  • In this paper, we examine the allophones of the underlying segment /l/ in Korean dialects. In particular, we examine how underlying /l/ surfaces in the Korean dialect spoken in Yonbyon, China. We proceed in three steps. First, we carry out phonetic studies of the allophones of underlying /l/ in the Yonbyon dialect. Second, we compare the phonological environments of the allophones of underlying /l/ in the Yonbyon dialect with those in the South Korean dialect. Finally, we discuss the phonological implications of these allophones in terms of Feature Geometry and the Syllable Contact Law. Based on the phonetic study, we argue that the distinctive feature [sonorant] should be placed outside the root node and that the flap, an allophone of underlying /l/, should be understood as an obstruent, not a sonorant.


A Method of Scaling Time-Delay Neural Networks for Korean Allophone Recognition (한국어 변이음 인식을 위한 시간지연 신경망의 확장방법)

  • 김수일
    • Proceedings of the Acoustical Society of Korea Conference / 1994.06c / pp.229-234 / 1994
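  • This paper examines methods of scaling time-delay neural networks (TDNNs) to recognize Korean allophones and compares the recognition performance of each scaling method through experiments on recognizing the allophones of Korean plosives. First, to use allophones as the recognition unit for continuous speech recognition, we consider all the allophones of each phoneme and merge similar allophones, classifying them into three allophone groups. In recognition experiments on Korean plosives, the best performance was obtained by training small-scale TDNNs, divided according to acoustic-phonetic features, module by module and then integrating them hierarchically into a whole TDNN. We also confirmed that allophone-unit recognition can resolve the coarticulation problem that arises in phoneme-unit recognition, and we considered how the additional information provided by the allophone sequence output by allophone recognition can be used in phonological parsing.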


Study on Efficient Generation of Dictionary for Korean Vocabulary Recognition (한국어 음성인식을 위한 효율적인 사전 구성에 관한 연구)

  • Lee Sang-Bok;Choi Dae-Lim;Kim Chong-Kyo
    • Proceedings of the KSPS conference / 2002.11a / pp.41-44 / 2002
  • This paper addresses improving the speech recognition rate with an enhanced pronunciation dictionary. Modern large-vocabulary continuous speech recognition systems have pronunciation dictionaries. A pronunciation dictionary provides pronunciation information for each word in the vocabulary in phonemic units, which are modeled in detail by the acoustic models. But in most speech recognition systems based on hidden Markov models, actual pronunciation variations are disregarded. Without the pronunciation variations in the speech recognition system, the phonetic transcriptions in the dictionary do not match the actual occurrences in the database. In this paper, we propose adding an unvoicing rule for semivowels, one of the allophone rules, to the pronunciation dictionary. Experimental results show that the speech recognition system performs better with this dictionary than with existing pronunciation dictionaries.


HMnet Evaluation for Phonetic Environment Variations of Training Data in Speech Recognition

  • Kim, Hoi-Rin
    • The Journal of the Acoustical Society of Korea / v.15 no.4E / pp.28-36 / 1996
  • In this paper, we propose a new evaluation methodology that shows more clearly the performance of the allophone modeling algorithm generally used in large-vocabulary speech recognition. The proposed evaluation method shows the running characteristics and limitations of the modeling algorithm by testing how variation in the phonetic environments of the training data affects the recognition performance and the desirable number of free parameters to be estimated. From experimental results obtained with the method, we conclude that, in a vocabulary-independent recognition task, the phonetic diversity of the training data greatly affects the robustness of the model, and that it is necessary to develop a proper measure that determines the number of states, balancing the robustness and the precision of the HMnet, better than the conventional modeling efficiency.


Ortho-phonic Alphabet Creation by the Musical Theory and its Segmental Algorithm (악리론으로 본 정음창제와 정음소 분절 알고리즘)

  • Chin, Yong-Ohk;Ahn, Cheong-Keung
    • Speech Sciences / v.8 no.2 / pp.49-59 / 2001
  • Phoneme segmentation is a very difficult problem in speech processing because a segmentation algorithm has to cope with many kinds of allophones and coarticulation. As a result, systems for speech recognition and voice retrieval have a complex structure. To address this, we discuss the possibility of a new segmentation algorithm based on the rule of subtracting or adding a third in tripartitioning (삼분손익) of the twelve-tone temperament (12 율려), first proposed by Prof. T. S. Han. It is close to both oriental and western musical theory. He has also suggested three consonant and three vowel phonemes in Hunminjungum (훈민정음), invented by King Sejong in the 15th century. In this paper, we suggest naming this unit the ortho-phonic phoneme (OPP/정음소), which carries the meaning of 'absoluteness and independence'. OPP can also be applied to other languages and notations, for example IPA. Lastly, we find that this algorithm is consistently applicable to languages in general and is very useful for building voice recognition and retrieval systems.
