• Title/Abstract/Keywords: Phonetic Approach

Unseen Model Prediction using an Optimal Decision Tree

  • 김성탁;김회린
    • 대한음성학회지:말소리 / No. 45 / pp.117-126 / 2003
  • Decision tree-based state tying has been proposed in recent years as the most popular approach for clustering the states of context-dependent hidden Markov models in speech recognition. The aims of state tying are to reduce the number of free parameters and to predict the state probability distributions of unseen models. When doing state tying, however, the size of the decision tree is very important for word-independent recognition. In this paper, we try to construct an optimized decision tree based on the average of the feature vectors in the state pool and the number of seen models. We observed that the proposed optimal decision tree is effective in predicting the state probability distributions of unseen models.

Teaching English Pronunciation for International Communication

  • Park, Joo-Kyung
    • 대한음성학회:학술대회논문집 / 대한음성학회 July 2000 conference proceedings / pp.36-43 / 2000
  • Koreans' interest in and concern with learning English are at their peak as more actions and transactions in our daily life are carried out in English. Even though we are experiencing a big transition from the conventional grammar-translation method to communicative language teaching, little effort has been made to set new goals and objectives, norms and standards, or to develop new instructional methods for teaching pronunciation for international communication. This lecture will introduce a new approach to teaching English pronunciation for international communication and suggest how to implement it in Korean ELT classrooms. It will also address the necessity of research on Korean learners of English, focusing on their perception and production of English sounds for international intelligibility and identity.

Noise Reduction Using MMSE Estimator-based Adaptive Comb Filtering

  • 박정식;오영환
    • 대한음성학회지:말소리 / No. 60 / pp.181-190 / 2006
  • This paper describes a speech enhancement scheme that leads to significant improvements in recognition performance when used in the ASR front-end. The proposed approach is based on adaptive comb filtering and an MMSE-related parameter estimator. While adaptive comb filtering reduces noise components remarkably, it is rarely effective in reducing non-stationary noises. Furthermore, due to the uniformly distributed frequency response of the comb-filter, it can cause serious distortion to clean speech signals. This paper proposes an improved comb-filter that adjusts its spectral magnitude to the original speech, based on the speech absence probability and the gain modification function. In addition, we introduce the modified comb filtering-based speech enhancement scheme for ASR in mobile environments. Evaluation experiments carried out using the Aurora 2 database demonstrate that the proposed method outperforms conventional adaptive comb filtering techniques in both clean and noisy environments.

On-line model compensation using noise masking effect for robust speech recognition

  • 정규준;조훈영;오영환
    • 대한음성학회:학술대회논문집 / 대한음성학회 May 2003 conference proceedings / pp.215-218 / 2003
  • In this paper, we apply PMC (parallel model combination) to an on-line speech recognition system. As a representative model-based noise compensation technique, PMC compensates for environmental mismatch by combining pretrained clean speech models with noise information estimated in real time. This approach is very effective at compensating for extreme environmental mismatch, but its heavy computational cost makes it unsuitable for on-line systems. To reduce the computational cost and apply PMC on-line, we use the noise masking effect (the energy in a frequency band is dominated either by clean speech energy or by noise energy) in the process of model compensation. Experiments on artificially produced noisy speech data confirm that the proposed technique is fast and effective for on-line model compensation.
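
The noise masking idea in this abstract can be illustrated with a small sketch. The abstract gives no formulas, so the code below is only an assumption about what "dominated either by clean speech energy or by noise energy" could look like in practice: a per-band maximum over log energies, compared against the exact log of the summed linear energies. The function names and the toy band values are hypothetical and are not taken from the paper.

```python
import math


def masked_log_means(speech_log_energy, noise_log_energy):
    """Per-band max approximation (assumed form of noise masking):
    each band keeps whichever of speech or noise has more energy."""
    if len(speech_log_energy) != len(noise_log_energy):
        raise ValueError("speech and noise must have the same number of bands")
    return [max(s, n) for s, n in zip(speech_log_energy, noise_log_energy)]


def exact_log_sum(speech_log_energy, noise_log_energy):
    """Reference value: log of the sum of the linear-domain energies."""
    return [
        math.log(math.exp(s) + math.exp(n))
        for s, n in zip(speech_log_energy, noise_log_energy)
    ]


# Toy log energies for three frequency bands (illustrative values only).
clean = [2.0, 5.0, 1.0]
noise = [3.0, 1.0, 1.5]

masked = masked_log_means(clean, noise)
exact = exact_log_sum(clean, noise)
```

Under this assumption the approximation avoids the exponentials and logarithms of the exact combination, and its error per band is never larger than log 2, which is reached only when speech and noise energies are equal.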

Rich Transcription Generation Using Automatic Insertion of Punctuation Marks

  • 김지환
    • 대한음성학회지:말소리 / No. 61 / pp.87-100 / 2007
  • A punctuation generation system which combines prosodic information with acoustic and language model information is presented. Experiments were first conducted on reference text transcriptions. In these experiments, prosodic information was shown to be more useful than language model information. When these information sources were combined, an F-measure of up to 0.7830 was obtained for adding punctuation to a reference transcription. This method of punctuation generation can also be applied to the 1-best output of a speech recogniser. The 1-best output is first time-aligned, and prosodic features are generated from the time alignment information. As in the approach used for reference transcriptions, the best sequence of punctuation marks for the 1-best output is found using the prosodic feature model and a language model trained on texts that contain punctuation marks.
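
For readers unfamiliar with the metric quoted above, here is a minimal sketch of how an F-measure for punctuation insertion is commonly computed. It assumes the balanced form (the harmonic mean of precision and recall); the abstract does not say which weighting the paper used, and the function name and the counts below are hypothetical.

```python
def f_measure(num_correct, num_hypothesized, num_reference):
    """Balanced F-measure: harmonic mean of precision and recall.

    num_correct      -- punctuation marks inserted at the right place
    num_hypothesized -- punctuation marks the system inserted in total
    num_reference    -- punctuation marks in the reference text
    """
    if num_hypothesized == 0 or num_reference == 0:
        return 0.0
    precision = num_correct / num_hypothesized
    recall = num_correct / num_reference
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Illustrative counts only: 8 correct marks out of 10 inserted, 16 in the reference.
score = f_measure(num_correct=8, num_hypothesized=10, num_reference=16)
```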

An Analysis of Homorganic Cluster Lengthening in Late Old English

  • 권영국
    • 영어영문학 / Vol. 55, No. 4 / pp.719-744 / 2009
  • This paper aims to reexamine Homorganic Cluster Lengthening in Late Old English, whereby OE short vowels were lengthened before specific consonant clusters such as /-ld, -nd, -mb, -rd, -rð, -ng, -rz/. As for the motivation for this apparently odd sound change, I propose that it was the result of phonologization of the phonetic lengthening of syllables containing resonants homorganic with a following voiced obstruent. Adopting Luick's (1898) view of "resonant+voiced homorganic obstruent" as phonologically a single coda, I show that Homorganic Cluster Lengthening is in fact a natural sound change that can be explained by postulating a few quantity-related universal constraints within the framework of Optimality Theory. The fact that the constraints and their ranking as posited in this paper can also account for Pre-Cluster Shortening points to the validity of my approach for the analysis of other quantity changes in Middle English.

Improved Decision Tree-Based State Tying In Continuous Speech Recognition System

  • 김동화;;;김형순;김영호
    • 한국음향학회지 / Vol. 18, No. 6 / pp.49-56 / 1999
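  • Decision tree-based state tying is widely used in many HMM-based continuous speech recognition systems, both for robust and accurate context-dependent acoustic modeling and for synthesizing models that did not appear during training. The standard method for constructing a phonetic decision tree uses only one-level pruning with single-Gaussian triphone models. In this paper, we propose two new approaches, a two-level decision tree and a multi-mixture decision tree, to improve recognition performance through more refined acoustic modeling. The two-level decision tree performs two-level pruning for state tying and mixture weight tying; by using the second level, tied states can also use different mixture weights according to the similarity of their phonetic contexts. In the second proposed method, the phonetic decision tree is continually updated along with the training process, that is, mixture splitting and re-estimation. To construct the multi-mixture decision tree, multi-mixture Gaussian models are used together with single-Gaussian models. In continuous speech recognition experiments on BN-96 and WSJ5k data, the proposed methods reduced the word error rate compared with a system using the standard decision tree while keeping the number of tied states similar.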

Mieko Han and her Works on Korean Phonetics

  • 고도흥
    • 음성과학 / Vol. 1 / pp.213-223 / 1997
  • This paper presents a general review of Mieko S. Han, who made a significant contribution to the study of Korean phonetics during the 1960s and early 1970s. As both a single and joint author, Dr. Han published papers that are important in both quantity and quality and that are still cited among Korean phoneticians today. Before the work of Dr. M. Han, a professor in the Department of East Asian Languages & Cultures at USC, there were only a few phonetics-related publications in Korea, most of which were papers or books based on a non-experimental traditional approach. It is known that traditionalism and structuralism coexisted in the field of Korean linguistics. It was, however, fortunate that we had two important phoneticians (M. Han and Chin-W Kim) abroad at that time. Mieko Han's concern was to investigate the experimental characteristics of the system of Korean vowels and consonants using a spectrograph, which was the single most important tool for analysing phonetic data at that time. Dr. Han conducted her experimental studies on Korean phonetics, mostly funded by the Office of Naval Research, in terms of duration, fundamental frequency, Voice Onset Time (VOT), intensity, and so on. This paper aims to re-appreciate Dr. Han's specific contribution to the study of Korean phonetics, since she played an important role as a pioneer of early Korean phonetics. Further, Dr. Han's works are highly recommended as a first step for graduate students who seriously wish to specialize in Korean phonetics.

A Study on Regression Class Generation of MLLR Adaptation Using State Level Sharing

  • 오세진;성우창;김광동;노덕규;송민규;정현열
    • 한국음향학회지 / Vol. 22, No. 8 / pp.727-739 / 2003
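  • In this paper, we introduce MLLR (Maximum Likelihood Linear Regression) adaptation into an HM-Net (Hidden Markov Network) speech recognition system in order to apply HM-Net to various tasks and to represent speaker characteristics effectively, and we propose a regression class generation method based on an improved HM-Net training algorithm. The proposed method uses the state-level sharing produced by contextual-domain state splitting in the PDT-SSS (Phonetic Decision Tree-based Successive State Splitting) algorithm. That is, the context information contained in the adaptation speaker's speech data is split at each state in the contextual domain to determine the phonetic environments to be adapted. The proposed method can therefore freely control the many adaptation parameters (means, variances), which are determined by the context information from a new speaker and by the amount of adaptation speech. To verify the effectiveness of the proposed method, we performed recognition experiments on the Korean Language Engineering Center (KLE) 452 data and on continuous speech related to flight reservation (YNU200). Phoneme recognition, word recognition, and continuous speech recognition improved by 34-37%, 9%, and 20% on average, respectively. In addition, when recognition performance was compared across amounts of adaptation data, the system using the proposed method showed improved recognition rates even with a small amount of adaptation data, consistent with the characteristics of MLLR adaptation. We thus confirmed that the proposed regression class generation method is effective for HM-Net speech recognition systems that use MLLR adaptation.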
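
As background for the regression classes mentioned above, the sketch below shows the standard MLLR mean update, in which every Gaussian mean in a regression class is moved by one shared affine transform, mu' = A mu + b. It does not reproduce the paper's PDT-SSS-based class generation or the variance adaptation the abstract mentions. The class name, the transform values, and the toy means are hypothetical.

```python
def adapt_mean(mean, A, b):
    """Apply the affine MLLR mean transform mu' = A @ mu + b to one Gaussian mean."""
    return [
        sum(a_ij * m_j for a_ij, m_j in zip(row, mean)) + b_i
        for row, b_i in zip(A, b)
    ]


def adapt_regression_class(means, A, b):
    """All Gaussians assigned to one regression class share the same (A, b)."""
    return [adapt_mean(mean, A, b) for mean in means]


# One hypothetical regression class with two 2-dimensional Gaussian means.
regression_class = {
    "A": [[2.0, 0.0], [0.0, 1.0]],
    "b": [0.5, -0.5],
    "means": [[1.0, 2.0], [3.0, 4.0]],
}

adapted = adapt_regression_class(
    regression_class["means"], regression_class["A"], regression_class["b"]
)
```

Because the transform is shared, a class with many Gaussians can be adapted from little speaker data, which is why the way the classes are formed matters for the result.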

The Government Approach to the Empty Nucleus

  • 허용
    • 대한음성학회지:말소리 / No. 19-20 / pp.58-87 / 1990
  • According to Government Phonology, all phonological positions save the domain's head must be licensed in order to appear in the syllable structure. A non-nuclear head is licensed by the following nucleus, and the nuclei with phonetic content are licensed through government by the nuclear head of the domain at the level of the nuclear projection. Therefore, the theory of Government Phonology claims that words always end with a nucleus. With regard to the licensing of empty nuclei, Kaye (1990a) proposes the 'Empty Category Principle' (ECP) and its sub-theory of 'Projection Government'. Government Phonology claims that a nucleus which dominates a vowel that regularly undergoes elision in certain contexts is underlyingly empty. This underlyingly empty nucleus is not manifested phonetically when it is properly governed by an unlicensed nucleus (i.e., a nucleus filled with a full vowel). It is when proper government fails to apply that the empty nucleus is phonetically interpreted. The purpose of this paper is to present a principled account of the process of $[i] \Leftrightarrow \emptyset$ alternation in Korean. Following Kaye's proposal, we assume that [i] of Korean is underlyingly empty. This position is pronounced as [i] if it is unlicensed, and is not phonetically realized if it is licensed. Empty nuclei are divided into two categories: domain-internal and domain-final. Firstly, we consider the question of why Korean has few words ending with [i]. On this point, the ECP states that domain-final empty nuclei are not pronounced if the language licenses domain-final empty nuclei. Whether a final empty nucleus may occur in the structure is a matter of parametric variation. This property is seen from the fact that words may appear to end in consonants in a given language. Since Korean abounds with words ending in a consonant, it licenses domain-final empty nuclei. Therefore, it is quite natural that Korean has few words ending with [i]. Secondly, word-internal empty nuclei of Korean respect proper government and inter-onset government. That is, an empty nucleus in word-internal position will be pronounced with the vowel [i] if either proper government or inter-onset government fails to apply. Inter-onset government refers to the government established between two onsets across an empty nucleus. Thirdly, we consider words ending with [i], which seem to be exceptions to final licensing. Most of them are either mono-syllabic verbs (for instance, [s'i-] 'to write') or derived adjectives ending with [p'i] (for instance, [kip'i-] 'be happy'). As for the former, the 'inaccessibility for proper government' applies because the empty nucleus appears in the first syllable. In the latter case, domain-final empty nuclei are pronounced as [i] because of government-licensing. That is, the final empty nucleus is pronounced in order to license the preceding onset, which dominates negatively charmed segments that an empty nucleus of Korean cannot license.
