Search | Korea Science

A Study on Continuous Digits Speech Recognition using Probabilistic Models (확률적 모델을 이용한 연속 숫자음 인식에 관한 연구)

Lee Ju-Sung;Lee Seong-Kwon;Kim Soon-Hyob
- Proceedings of the Acoustical Society of Korea Conference / autumn / pp.109-112 / 1999
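This study concerns Korean continuous speech recognition using phoneme-unit CHMMs (Continuous Hidden Markov Models). Continuous digit recognition was performed so that telephone calls could be dialed by voice in a laboratory environment. The initial models were built from the ETRI 445 data using ML (Maximum Likelihood) estimation, and maximum a posteriori (MAP) estimation was used for adaptation. For continuous digit recognition, a pronunciation dictionary was written with the acoustic characteristics of Korean spoken digits in mind, and multiple entries per word were registered to cover all cases of the syllable-based Korean digits. Adaptation was performed on a database of 21 kinds of 7-digit strings, composed to account for liaison between adjacent digits, together with the same digits segmented into syllable units. To verify the effectiveness of this approach, recognition experiments were run on a list of 35 kinds of 4-digit continuous strings prepared by ETRI.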
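The abstract names MAP estimation for adaptation but does not say which parameters were adapted or how. As a minimal sketch, assuming the common case of adapting only Gaussian mean vectors, the update interpolates between the prior (ML-trained) mean and the adaptation data. The function name, the `tau` prior weight, and the per-frame `occupancies` are illustrative assumptions, not details from the paper.

```python
def map_adapt_mean(prior_mean, frames, occupancies, tau):
    """MAP update of one Gaussian mean (illustrative sketch).

    prior_mean  : mean vector from the initial ML-trained model
    frames      : adaptation feature vectors
    occupancies : per-frame posterior weight of this Gaussian (0..1)
    tau         : prior weight (assumed); larger values trust the prior more
    """
    if tau <= 0:
        raise ValueError("tau must be positive")
    total = sum(occupancies)
    dim = len(prior_mean)
    weighted_sum = [0.0] * dim
    for frame, weight in zip(frames, occupancies):
        for d in range(dim):
            weighted_sum[d] += weight * frame[d]
    # Interpolate between the prior mean and the weighted data sum.
    return [(tau * prior_mean[d] + weighted_sum[d]) / (tau + total)
            for d in range(dim)]
```

With no adaptation frames the function returns the prior mean unchanged, and as more frames arrive the estimate moves toward the data average.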

A Model of Speech Database in Korean in consideration of its segmental phonology (국어 분절음 특성에 맞는 음성 데이터 베이스의 모형)

김종미
- Proceedings of the Acoustical Society of Korea Conference / 1994.06c / pp.297-302 / 1994
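This paper proposes a model for a speech database suited to the segmental characteristics of Korean. A speech database should include 1) information on the intrinsic phonetic value of each sound, 2) information on adjacent sounds, and 3) probability information based on frequency. To meet these requirements, the model 1) labels the data by phonological unit and compiles the intrinsic-sound and adjacent-sound information, and 2) builds Phoneme Balanced Words from phonological rules and constraints, keeping permitted adjacent sounds and dropping those that are not permitted. Its advantages include 3) fairness, in that system evaluation gives more credit for prioritized recognition and synthesis of high-frequency sounds and phoneme sequences, and 4) economy, in that avoiding redundancy and bias in the phonological functions of the data during collection reduces the data volume.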

A Study on the Phonemic Analysis for Korean Speech Segmentation (한국어 음소분리에 관한 연구)

Lee, Sou-Kil;Song, Jeong-Young
- The Journal of the Acoustical Society of Korea / v.23 no.4E / pp.134-139 / 2004
It is generally known that accurate segmentation is essential in speech recognition for both isolated words and continuous utterances. Techniques are being developed to classify voiced and unvoiced sounds, and to classify plosives and fricatives, but a method for accurately recognizing phonemes has not yet been scientifically established. In this study we therefore analyze the Korean language using the classification of 'Hunminjeongeum' and contemporary phonetics. Using the frequency band, Mel band, and Mel cepstrum, we extract notable features of the phonemes from Korean speech and segment the speech into phoneme units in order to normalize them. Finally, through analysis and verification, we intend to set up a phonemic segmentation system that can be applied to both isolated words and continuous utterances.

Text Independent Speaker Identification Using Separate Matrix Quantization (분할 매트릭스 부호화를 이용한 문장 독립형 화자인식 시스템)

경연정;이황수
- The Journal of the Acoustical Society of Korea / v.17 no.5 / pp.69-72 / 1998
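This paper proposes using MQ (Matrix Quantization) in a text-independent speaker identification system. To raise the recognition rate, it also proposes SMQ (Separated Matrix Quantization), a modification of MQ. The conventional VQ-distortion method generally performs well but cannot exploit a speaker's dynamic characteristics. Because MQ and SMQ can exploit these characteristics, they can model how a speaker's features change over time. MQ groups several frames together to form a matrix codebook, and SMQ further splits the basic MQ codebook by cepstral order: the cepstral orders are divided into low, middle, and high ranges, and a matrix codebook is built for each range. Recognition experiments were run on text-independent speech data. For the MQ model, the matrix size was varied from a short phoneme length up to a syllable unit. For the SMQ model, experiments used partial order ranges to assess the usefulness of each range. The results confirmed that MQ and SMQ outperform VQ.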
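The difference between frame-level VQ distortion and block-level MQ distortion can be illustrated with a short sketch. It assumes squared Euclidean distance, non-overlapping frame blocks, and codebooks that are already trained; none of these details are given in the abstract, and all function names are hypothetical. Setting `block_len` to 1 reduces the computation to ordinary VQ distortion, and `split_orders` shows the kind of cepstral-order split that SMQ applies before building a codebook for each range.

```python
def squared_distance(block_a, block_b):
    """Sum of squared differences between two equally shaped frame blocks."""
    return sum((a - b) ** 2
               for frame_a, frame_b in zip(block_a, block_b)
               for a, b in zip(frame_a, frame_b))

def make_blocks(frames, block_len):
    """Group consecutive frames into non-overlapping blocks (matrices)."""
    return [frames[i:i + block_len]
            for i in range(0, len(frames) - block_len + 1, block_len)]

def mq_distortion(frames, codebook, block_len):
    """Average distance from each block to its nearest matrix codeword."""
    blocks = make_blocks(frames, block_len)
    if not blocks:
        raise ValueError("not enough frames for one block")
    return sum(min(squared_distance(block, codeword) for codeword in codebook)
               for block in blocks) / len(blocks)

def split_orders(frames, start, stop):
    """Keep cepstral orders start..stop-1 of every frame (SMQ-style split)."""
    return [frame[start:stop] for frame in frames]

def identify_speaker(frames, codebooks, block_len):
    """Pick the speaker whose codebook gives the lowest average distortion."""
    return min(codebooks,
               key=lambda speaker: mq_distortion(frames, codebooks[speaker], block_len))
```

Because each codeword spans several frames, two speakers who produce the same set of frames in a different temporal order get different distortions, which a frame-level codebook cannot distinguish.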

Phonological Awareness Activities Using Story Books : Effects on Reading, Self-Concept, and Learning Motivation in an After-School Program for 1st and 2nd Grade Low Income Children (동화를 이용한 음운인식활동이 저소득층 초등 방과후 교실 1, 2 학년 아동의 읽기, 학습동기 및 자아개념에 미치는 영향)

Lee, Jeehyun;Kim, Youjung;Lee, Jung A
- Korean Journal of Child Studies / v.27 no.5 / pp.123-141 / 2006
The phonemic awareness program comprised 45 activities, built around a storybook, that emphasized various speech sounds and letter names. The subjects were thirty low-income 1st and 2nd grade children (15 in the experimental group and 15 in the control group) attending an after-school program in Seoul. Pre- and post-tests assessed the children's reading, self-concept, and learning motivation. Over a 12-week period, the experimental group had rich opportunities to work with and discuss sounds, syllables, phonemes, and Korean alphabet names during storybook reading, games, and play, while the control group received worksheets, subject tutoring, and homework guidance. Results showed that the phonemic activities were an effective and useful way to enhance children's reading ability, self-concept, and learning motivation.

A comparison of techniques for measuring intelligibility of dysarthric speech : toward phonetic intelligibility testing in dysarthria. (뇌성마비 성인의 음소대조 낱말명료도와 문장명료도)

Kim Soo-Jin
- Proceedings of the KSPS conference / 2002.11a / pp.141-144 / 2002
The relation between word intelligibility and sentence intelligibility was tested in adults with cerebral palsy (athetoid type). Intelligibility is an important evaluation measure in the diagnosis and therapy of dysarthric patients. To develop a one-syllable phonetic contrast intelligibility test using specific phonetic contrasts, its correlation with sentence intelligibility was tested to establish validity. Pearson's simple correlation coefficient was .83, indicating a high correlation. In addition, a comparison of the range and standard deviation of the scores given by seven evaluators for each subject showed that, for patients of moderate intelligibility, word intelligibility was more reliable than sentence intelligibility.
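For readers unfamiliar with the statistic, Pearson's correlation coefficient between two score lists can be computed as in the sketch below. The function name is an assumption for illustration, and the study's per-subject scores are not reproduced here.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation between two equally long lists of scores."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of equal length, at least 2 items")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance and variances, each left unnormalized (the 1/n factors cancel).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant scores")
    return cov / math.sqrt(var_x * var_y)
```

A value near 1 means that subjects who score high on one measure tend to score high on the other, which is how the reported .83 is read.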

A phoneme duration modeling in a speech recognition system based on decision tree state tying (결정트리기반 음성인식 시스템에서의 음소지속시간 사용방법)

Koo Myoun-Wan;Kim Ho-Kyoung
- Proceedings of the KSPS conference / 2002.11a / pp.197-200 / 2002
In this paper, we propose a phoneme duration modeling method for a speech recognition system based on decision tree state tying. We assume that phone duration follows a Gamma distribution. In training mode, we model the mean and variance of each state duration in a context-independent phone model based on decision tree state tying. In recognition mode, we obtain the mean and variance of each context-dependent phone duration from the state duration information obtained during training. We compare the proposed method with conventional methods, and it performs well against them.
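One way to turn a duration mean and variance into a Gamma score is a method-of-moments fit, sketched below. The abstract does not say how the authors combine state statistics into phone statistics, so the summation in `phone_duration_moments` assumes independent state durations and is purely illustrative, as are all function names.

```python
import math

def gamma_params_from_moments(mean, variance):
    """Method-of-moments fit: shape = mean^2 / var, scale = var / mean."""
    if mean <= 0 or variance <= 0:
        raise ValueError("mean and variance must be positive")
    return mean * mean / variance, variance / mean

def gamma_log_pdf(duration, shape, scale):
    """Log density of Gamma(shape, scale) at the given duration."""
    if duration <= 0:
        return float("-inf")
    return ((shape - 1.0) * math.log(duration)
            - duration / scale
            - math.lgamma(shape)
            - shape * math.log(scale))

def phone_duration_moments(state_means, state_variances):
    """Phone-level mean and variance from per-state values.

    Assumes (hypothetically) that state durations are independent,
    so both means and variances add.
    """
    return sum(state_means), sum(state_variances)
```

In a decoder, the resulting log density would typically be added to the acoustic score of a phone hypothesis, usually with a tunable weight.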

Morpheme Graph Generation with HMM based Continuous Speech Recognition (HMM에 기반한 연속음성인식에서의 형태소 그래프 생성)

Choi, Joon-Ki;Lee, Geun-Bae;Lee, Jong-Hyeok
- Annual Conference on Human and Language Technology / 1997.10a / pp.500-504 / 1997
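This paper defines a morpheme graph and uses it both as the output of Korean continuous speech recognition and as a knowledge representation for Korean natural language processing. It also introduces the Tree-Trellis search algorithm as an efficient way to generate the morpheme graph during continuous speech recognition. The Korean continuous speech recognizer is HMM-based, and the search algorithm likewise assumes an HMM phoneme recognizer. The experimental database is a 3,000-word-class trade consultation database produced by the Communications Laboratory at KAIST (Korea Advanced Institute of Science and Technology).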

Generating Pronunciation Lexicon for Continuous Speech Recognition Based on Observation Frequencies of Phonetic Rules (음소변동규칙의 발견빈도에 기반한 음성인식 발음사전 구성)

Na, Min-Soo;Chung, Min-Hwa
- MALSORI / no.64 / pp.137-153 / 2007
The pronunciation lexicon of a continuous speech recognition system should contain enough pronunciation variations to build a search space large enough to contain a correct path, yet its size needs to be constrained for effective decoding and lower perplexity. This paper describes a procedure for selecting the pronunciation variations to include in the lexicon based on the frequencies of the corresponding phonetic rules observed in the training corpus. The likelihood that a phonetic rule applies is estimated from the observation frequency of the rule and is used to control the construction of the pronunciation lexicon. Experiments with various pronunciation lexica show that the proposed method helps improve speech recognition performance.
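The frequency-based selection described above can be sketched as follows. It assumes the likelihood of a rule is its relative frequency (times applied divided by times its context occurred) and that variants are kept when that likelihood meets a threshold. The abstract does not give the exact estimator or selection criterion, so the function names, the threshold, and the rule labels in the usage example are all hypothetical.

```python
def rule_likelihoods(applied_counts, applicable_counts):
    """Relative frequency of each rule: applied / applicable contexts."""
    return {rule: applied_counts.get(rule, 0) / total
            for rule, total in applicable_counts.items() if total > 0}

def select_variants(variants, likelihoods, threshold):
    """Keep canonical forms plus variants whose rule is likely enough.

    variants: list of (pronunciation, rule_name) pairs, where rule_name
    is None for the canonical pronunciation.
    """
    return [pron for pron, rule in variants
            if rule is None or likelihoods.get(rule, 0.0) >= threshold]

# Usage with made-up counts and rule labels (not data from the paper).
likelihoods = rule_likelihoods(
    {"nasalization": 90, "h_deletion": 5},
    {"nasalization": 100, "h_deletion": 100},
)
lexicon_entry = select_variants(
    [("k u k m u l", None),
     ("k u ng m u l", "nasalization"),
     ("k u m u l", "h_deletion")],
    likelihoods,
    threshold=0.5,
)
```

Raising the threshold shrinks the lexicon and the search space, while lowering it admits rarer variants, which is the trade-off the abstract describes.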

A Study on Neural Networks for Korean Phoneme Recognition (한국어 음소 인식을 위한 신경회로망에 관한 연구)

최영배
- Proceedings of the Acoustical Society of Korea Conference / 1992.06a / pp.61-65 / 1992
This paper presents a study on neural networks for phoneme recognition and performs phoneme recognition using a TDNN (Time Delay Neural Network). It also proposes a new training algorithm for speech recognition with neural networks that is suited to large-scale TDNNs. Because phoneme recognition is indispensable for continuous speech recognition, the paper uses a TDNN to obtain accurate phoneme recognition results. The proposed training algorithm lets the TDNN converge to an optimal state regardless of the number of phonemes to be recognized. Recognition experiments on three phoneme classes show a recognition rate of 9.1%. The paper also shows that the proposed algorithm is an efficient method for achieving high performance and reducing convergence time.

Search Result 529, Processing Time 0.022 seconds
