• Title/Summary/Keyword: Segmental features


English Phoneme Recognition using Segmental-Feature HMM (분절 특징 HMM을 이용한 영어 음소 인식)

  • Yun, Young-Sun
    • Journal of KIISE:Software and Applications / v.29 no.3 / pp.167-179 / 2002
  • In this paper, we propose a new acoustic model for characterizing segmental features, together with an algorithm based on the general framework of hidden Markov models (HMMs), in order to compensate for the weaknesses of the HMM assumptions. Because a single-frame feature cannot represent the temporal dynamics of speech signals effectively, the segmental features are represented as a trajectory of the observed vector sequence, modeled by a polynomial regression function. To apply the segmental features to pattern classification, we adopted the segmental HMM (SHMM), which is known as an effective method for representing the trend of speech signals. The SHMM separates the observation probability of a given state into extra- and intra-segmental variations, which capture long-term and short-term variability, respectively. To account for segmental characteristics in the acoustic model, we present the segmental-feature HMM (SFHMM), obtained by modifying the SHMM. The SFHMM represents the external and internal variations as the observation probability of the trajectory in a given state and the trajectory estimation error for the given segment, respectively. We conducted several experiments on the TIMIT database to establish the effectiveness of the proposed method and the characteristics of the segmental features. From the experimental results, we conclude that the proposed method, even though it has more parameters than a conventional HMM, is valuable for its flexible and informative feature representation and for the performance improvement it provides.
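
The idea of representing a segment by a polynomial trajectory plus an estimation error can be illustrated with a minimal sketch. It is not the paper's implementation: the function name `fit_trajectory`, the use of a single feature dimension, the normalized time axis, and the default order of 2 are all assumptions made for illustration.

```python
def fit_trajectory(frames, order=2):
    """Least-squares polynomial fit of one feature dimension over a segment.

    frames: list of floats, one value per frame of the segment
            (hypothetical input; real features would be vectors).
    Returns (coeffs, residual): coeffs[r] multiplies t**r, where t is the
    frame position normalized to [0, 1]; residual is the summed squared
    error between the frames and the fitted trajectory.
    """
    n = len(frames)
    k = order + 1
    if n < k:
        raise ValueError("need at least order + 1 frames")
    # Normalized time lets segments of different lengths share one axis.
    ts = [i / (n - 1) if n > 1 else 0.0 for i in range(n)]
    # Normal equations A c = b: A[r][s] = sum t^(r+s), b[r] = sum y * t^r.
    a = [[sum(t ** (r + s) for t in ts) for s in range(k)] for r in range(k)]
    b = [sum(y * t ** r for t, y in zip(ts, frames)) for r in range(k)]
    # Gaussian elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        b[col], b[pivot] = b[pivot], b[col]
        for row in range(col + 1, k):
            f = a[row][col] / a[col][col]
            for j in range(col, k):
                a[row][j] -= f * a[col][j]
            b[row] -= f * b[col]
    # Back substitution.
    coeffs = [0.0] * k
    for r in reversed(range(k)):
        s = sum(a[r][j] * coeffs[j] for j in range(r + 1, k))
        coeffs[r] = (b[r] - s) / a[r][r]
    fitted = [sum(c * t ** r for r, c in enumerate(coeffs)) for t in ts]
    residual = sum((y - f) ** 2 for y, f in zip(frames, fitted))
    return coeffs, residual
```

In this sketch the coefficients play the role of the trajectory (the "trend" of the segment) and the residual plays the role of the trajectory estimation error; how the SFHMM turns these into probabilities is not specified in the abstract and is not modeled here.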

Combination Tandem Architecture with Segmental Features for Robust Speech Recognition (강인한 음성 인식을 위한 탠덤 구조와 분절 특징의 결합)

  • Yun, Young-Sun; Lee, Yun-Keun
    • MALSORI / no.62 / pp.113-131 / 2007
  • Previous studies have reported that recognition systems based on segmental features show better results than systems based on conventional features. Separately, various studies have combined neural networks and hidden Markov models within a single system, in the expectation that such a system may combine the advantages of both. Influenced by these studies, the tandem approach was introduced, which uses a neural network as the classifier and hidden Markov models as the decoder. In this paper, we applied the trend information of segmental features to the tandem architecture and used the posterior probabilities output by the neural network as inputs to the recognition system. Experiments were performed on the Aurora database to examine the potential of the trend-feature-based tandem architecture. The results show that the proposed system performs better in very low SNR environments. Consequently, we argue that trend information in a tandem architecture can be used in addition to traditional MFCC features.


Acoustic Analysis for Natural Pronunciation Programs

  • Lim Un
    • MALSORI / no.44 / pp.1-14 / 2002
  • Because accuracy and fluency are the essence of English speaking, both are very important in English teacher training and in-service English training programs. To achieve accuracy and fluency, the causes and phenomena of unnatural pronunciation have to be diagnosed. Consequently, the problematic and unnatural pronunciation of Korean elementary and secondary English teachers should be analyzed using acoustic analysis tools such as CSL, Multi-speech, and Praat. In addition, an attempt was made to pinpoint the causes of unnatural pronunciation. Next, a procedure and steps were proposed for in-service training programs that would cultivate fluency and accuracy. In the case of elementary teachers, unnatural pronunciation was found frequently in both segmental and suprasegmental features. Therefore, segmental features should be emphasized at the beginning of pronunciation training courses, followed by suprasegmental features. In the case of secondary teachers, unnatural pronunciation was found frequently in suprasegmental features. Therefore, segmental and suprasegmental features have to be addressed at the same time. In other words, word-level features should be the first focus for elementary English teachers, while features at the word level and beyond the word level should be trained at the same time for secondary English teachers.


Component dynamics in miscible polymer blends: A review of recent findings

  • Watanabe, Hiroshi; Urakawa, Osamu
    • Korea-Australia Rheology Journal / v.21 no.4 / pp.235-244 / 2009
  • Miscible polymer blends still have heterogeneity in their component chain concentration at the segmental length scale because of chain connectivity (which results in the self-concentration of the segments of the respective chains) as well as dynamic fluctuations over various length scales. As a result, the blend components experience different dynamic environments and exhibit different temperature dependences in their segmental relaxation rates. This type of dynamic heterogeneity often results in a broad glass transition (sometimes seen as two separate transitions), a broad distribution of the local (segmental) relaxation modes, and thermo-rheological complexity of this distribution. Furthermore, the dynamic heterogeneity also affects the global dynamics in miscible blends if the component chains have a large dynamic asymmetry. Thus, the superficially simple miscible blends exhibit interesting dynamic behavior. This article gives a brief summary of the features of the segmental and global dynamics in those blends.

Unilateral segmental odontomaxillary hypoplasia: an unusual case report

  • Pandey, Sushma; Pai, Keerthilatha M.; Nayak, Ajay G.; Vineetha, Ravindranath
    • Imaging Science in Dentistry / v.41 no.1 / pp.39-42 / 2011
  • Facial asymmetry is not an uncommon occurrence in day-to-day dental practice. It can be caused by various etiologic factors ranging from facial trauma to serious hereditary conditions. Here, we report a rare case of non-syndromic facial asymmetry in a young female who was born with this condition but was not aware of the progression of the asymmetry. No relevant family history was recognized. She was also deficient in both deciduous and permanent teeth in the corresponding region of the maxilla. Hence, the cause of this asymmetry was believed to be segmental odontomaxillary hypoplasia of the left maxilla, accompanied by agenesis of the left maxillary premolars and molars and disuse atrophy of the corresponding facial musculature. This report briefly discusses the comparative features of segmental odontomaxillary hypoplasia, hemimaxillofacial dysplasia, and segmental odontomaxillary dysplasia, and explains the differences between segmental odontomaxillary hypoplasia and the other two conditions.

Synthesis and Evaluation of Prosodically Exaggerated Utterances

  • Yoon, Kyu-Chul
    • Phonetics and Speech Sciences / v.1 no.3 / pp.73-85 / 2009
  • This paper introduces a technique for synthesizing and evaluating human utterances with exaggerated or atypical prosody. Prosody exaggeration can be implemented by manipulating the fundamental frequency (F0) contour, the segmental durations, or the intensity contour of an utterance. Two or more of these three prosodic elements can be exaggerated at the same time. Algorithms for synthesis and evaluation are suggested. Learner utterances exaggerated in each of the three prosodic features were evaluated against their original native versions in terms of the differences in their F0 contours, segmental durations, and intensity contours. The measure of difference was the Euclidean distance between the matching points in their F0 and intensity contours. The measure was calculated after the exaggerated learner utterances had been aligned by segment and rendered identical to their native versions in terms of segmental durations. For the evaluation of the segmental durations, no prior modifications were made to the durations, and the same measure was used. The results of the pilot experiment suggest the viability of this measure for evaluating learner utterances with atypical prosody against their native versions.
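
The Euclidean distance between matching contour points can be sketched as follows. This is only an illustration under stated assumptions: the function name `contour_distance` is hypothetical, and both contours are assumed to have already been aligned and sampled at the same points, as the abstract describes for the duration-matched utterances.

```python
import math


def contour_distance(learner, native):
    """Euclidean distance between two contours sampled at matching points.

    learner, native: equal-length lists of floats (for example F0 values in
    Hz or intensity values in dB). Equal length is an assumption standing in
    for the alignment step, which is not reproduced here.
    """
    if len(learner) != len(native):
        raise ValueError("contours must be aligned to the same number of points")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(learner, native)))
```

A smaller value means the learner contour lies closer to the native one; identical contours give a distance of zero.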


SWAPPING NATIVE AND NON-NATIVE SPEAKERS' PROSODY USING THE PSOLA ALGORITHM

  • Yoon Kyu-Chul
    • Proceedings of the KSPS conference / 2006.05a / pp.77-81 / 2006
  • This paper presents a technique for imposing the prosodic features of a native speaker's utterance onto the same sentence uttered by a non-native speaker. Three acoustic aspects of the prosodic features were considered: the fundamental frequency (F0) contour, the segmental durations, and the intensity contour. The fundamental frequency contour and the segmental durations of the native speaker's utterance were imposed on the non-native speaker's utterance by using the PSOLA (pitch-synchronous overlap and add) algorithm [1] implemented in Praat [2]. The intensity contour transfer was also done in Praat. The technique of transferring one or more of these prosodic features is elaborated, and its implications for language education are discussed.


The Role of Prosody in Dialect Synthesis and Authentication

  • Yoon, Kyu-Chul
    • Phonetics and Speech Sciences / v.1 no.1 / pp.25-31 / 2009
  • The purpose of this paper is to examine the viability of synthesizing the Masan dialect from the Seoul dialect and to examine the role of prosody in the authentication of the synthesized Masan dialect. The synthesis was performed by transferring one or more of the prosodic features of the Masan utterance onto the Seoul utterance. The hypothesis is that, given an utterance composed of phonemes shared by both dialects, the more prosodic features of the Masan utterance are transferred onto the Seoul utterance, the more the Seoul utterance will be identified as an authentic Masan utterance. The prosodic features involved were the fundamental frequency contour, the segmental durations, and the intensity contour. The synthesized Masan utterances were evaluated by thirteen native speakers of the Masan dialect. The results showed that the fundamental frequency contour and the segmental durations had main effects on the perceptual shift from the Seoul to the Masan dialect.


Focal Segmental Glomerulosclerosis in a Child with Prader-Willi Syndrome : A Case of Obesity-associated Focal Segmental Glomerulosclerosis

  • Cho Hee-Yeon; Chung Dae-Lim; Kang Ju-Hyung; Ha Il-Soo; Cheong Hae-Il; Choi Yong
    • Childhood Kidney Diseases / v.8 no.2 / pp.244-249 / 2004
  • Obesity-associated focal segmental glomerulosclerosis (OB-FSGS) is known to progress to advanced renal insufficiency, and its clinicopathological features include obesity, FSGS lesions with glomerulomegaly, and nephrotic-range proteinuria without edema. A 14-year-old girl with Prader-Willi syndrome showed nephrotic-range proteinuria without hypoalbuminemia or edema. The renal biopsy revealed focal segmental glomerulosclerosis together with glomerular hypertrophy and an increased mesangial matrix. We report here a case of OB-FSGS as one of the renal problems of Prader-Willi syndrome, and we conclude that Prader-Willi syndrome is one of the possible disease entities that can lead to renal insufficiency through obesity.


An Implementation of the Baseline Recognizer Using the Segmental K-means Algorithm for the Noisy Speech Recognition Using the Aurora DB (Aurora DB를 이용한 잡음 음성 인식실험을 위한 Segmental K-means 훈련 방식의 기반인식기의 구현)

  • Kim Hee-Keun; Chung Young-Joo
    • MALSORI / no.57 / pp.113-122 / 2006
  • Recently, many studies have addressed speech recognition in noisy environments. In particular, the Aurora DB was built as a common database for comparing various feature extraction schemes. In general, however, the recognition models as well as the features have to be modified for effective noisy speech recognition. Because the structure of the HTK is very complex, it is not easy to modify the recognition engine. In this paper, we implemented a baseline recognizer based on the segmental K-means algorithm whose performance is comparable to that of the HTK despite the simplicity of its implementation.
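
Segmental K-means training alternates between segmenting the observations into states and re-estimating each state's parameters from the frames assigned to it. The sketch below shows that loop in a heavily simplified form and is not the recognizer described in the paper: it assumes one-dimensional frames, a strict left-to-right state order with no skips, states described by a mean only, and squared distance in place of log-likelihood. The names `segment` and `segmental_kmeans` are hypothetical.

```python
def segment(frames, means):
    """Viterbi-style dynamic programming: assign frames to states in
    left-to-right order, minimizing total squared distance to state means."""
    n, k = len(frames), len(means)
    inf = float("inf")
    cost = [[inf] * k for _ in range(n)]
    back = [[0] * k for _ in range(n)]
    cost[0][0] = (frames[0] - means[0]) ** 2  # must start in the first state
    for t in range(1, n):
        for s in range(k):
            stay = cost[t - 1][s]
            move = cost[t - 1][s - 1] if s > 0 else inf
            if move < stay:
                best, back[t][s] = move, s - 1
            else:
                best, back[t][s] = stay, s
            if best < inf:
                cost[t][s] = best + (frames[t] - means[s]) ** 2
    # Trace back from the last state at the last frame.
    path = [k - 1]
    for t in range(n - 1, 0, -1):
        path.append(back[t][path[-1]])
    path.reverse()
    return path


def segmental_kmeans(frames, k, iterations=10):
    """Alternate re-estimation of state means and re-segmentation."""
    n = len(frames)
    if n < k:
        raise ValueError("need at least one frame per state")
    # Start from a uniform segmentation of the frames into k states.
    path = [min(t * k // n, k - 1) for t in range(n)]
    means = [0.0] * k
    for _ in range(iterations):
        for s in range(k):
            members = [x for x, p in zip(frames, path) if p == s]
            means[s] = sum(members) / len(members)
        new_path = segment(frames, means)
        if new_path == path:  # segmentation is stable: converged
            break
        path = new_path
    return means, path
```

For example, `segmental_kmeans([1.0, 1.0, 5.0, 5.0, 5.0, 5.0, 9.0, 9.0], 3)` moves the first state boundary one frame earlier than the uniform initial segmentation and then stops once the segmentation no longer changes.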
