Integrated Search | Korea Science

User Adjustment Post-Process Using Neural Network In Isolated Word Speech Recognition

김영진;김은주;김명원
- 한국정보과학회:학술대회논문집 / 한국정보과학회 2005년도 가을 학술발표논문집 Vol.32 No.2 (2) / pp.736-738 / 2005
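Noise-robust speech recognition techniques have recently been studied as interfaces for personal mobile devices such as PDAs and PMPs. Work in this area has applied post-processing that uses either recognizer-independent information, such as error patterns, sequential patterns, semantic information, and contextual information, or heterogeneous information that differs in nature from language, such as visual information. However, when ambient noise lowers the recognizer's recognition rate before post-processing, the recognition rate of post-processing methods based on recognizer-independent information drops with it. This paper therefore proposes user-adaptive post-processing as a way to reduce the noise-induced degradation of the recognizer's pre-post-processing recognition rate. The data used for user-adaptive post-processing are the recognizer's output values for the user's utterances, and these output values are the per-word similarity scores computed by the speaker-independent model. Applying the results of the speaker-independent model to user-adaptive post-processing reduced the recognizer's errors by 58.7%.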

A Study on an Isolated-Word Recognition System Using a Back-Propagation Neural Network

김중태
- 한국통신학회논문지 / Vol.15 No.9 / pp.738-744 / 1990
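This paper proposes a real-time storage method for speech signals and a sampled-data method improved over the conventional one, and studies an isolated-word speech recognition system that uses the back-propagation learning algorithm of a neural network. Recognition and error rates under the conventional and the new sampled-data methods are compared as the number of nodes in each layer is varied. The study obtained a recognition rate of 95.1%.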

Hidden Markov Models Containing Durational Information of States

조정호;홍재근;김수중
- 대한전자공학회논문지 / Vol.27 No.4 / pp.636-644 / 1990
Hidden Markov models (HMMs) are known to be a useful representation of speech signals and are used in a wide variety of speech systems. For speech recognition applications, it is desirable to incorporate into the model durational information of states, which corresponds to the phonetic duration of speech segments. In this paper we propose duration-dependent HMMs that appropriately include durational information of states for the left-to-right model. Reestimation formulae for the parameters of the proposed model are derived and their convergence is verified. Finally, the performance of the proposed models is verified by applying them to an isolated-word, speaker-independent speech recognition system.

Speech Recognition of Multi-Syllable Words Using Soft Computing Techniques

이종수;윤지원
- 정보저장시스템학회논문집 / Vol.6 No.1 / pp.18-24 / 2010
The performance of speech recognition depends mainly on uncertain factors such as the speaker's condition and environmental effects. The present study deals with the recognition of a number of multi-syllable isolated Korean words using soft computing techniques such as a back-propagation neural network, a fuzzy inference system, and a fuzzy neural network. Feature patterns for speech recognition are analyzed with thirty 12th-order frames that are normalized by linear predictive coding and cepstrums. Using four recognizer models, experiments are conducted for both single speakers and multiple speakers. In this study, the recognizer that combines fuzzy logic with a back-propagation neural network and the fuzzy neural network recognizer show better recognition performance.

Endpoint Detection of Speech Signal Using Wavelet Transform

석종원;배건성
- 한국음향학회지 / Vol.18 No.6 / pp.57-64 / 1999
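This paper studies an algorithm that efficiently detects the start and end points of noisy speech. To this end, we propose a new detection parameter that takes the energy distribution in the wavelet domain into account and can detect speech even in noisy environments. The proposed endpoint-detection parameter is defined as the sum of the standard deviation of the third coarse scale and the weighted standard deviation of the first detail scale in the wavelet domain. To evaluate the proposed endpoint detector, we compared the accuracy of the detected start and end points with a conventional method at various SNRs, and also ran recognition experiments with an HMM speech recognition system.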
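The parameter described in this abstract (standard deviation of the third coarse scale plus a weighted standard deviation of the first detail scale) can be illustrated with a minimal sketch. The abstract does not name the wavelet, the weight value, or the frame length, so the Haar wavelet, the `weight` default, and the function names `haar_step` and `endpoint_parameter` below are all assumptions made for illustration, not the authors' implementation.

```python
import math
from statistics import pstdev


def haar_step(x):
    """One level of a Haar DWT (assumed wavelet): returns (coarse, detail)."""
    n = len(x) - len(x) % 2  # drop a trailing odd sample, if any
    s = math.sqrt(2.0)
    coarse = [(x[i] + x[i + 1]) / s for i in range(0, n, 2)]
    detail = [(x[i] - x[i + 1]) / s for i in range(0, n, 2)]
    return coarse, detail


def endpoint_parameter(frame, weight=1.0):
    """Std of the 3rd coarse scale + weight * std of the 1st detail scale.

    `weight` is a hypothetical default; the abstract gives no value.
    The frame needs at least 8 samples so that the third scale is non-empty.
    """
    c1, d1 = haar_step(frame)
    c2, _ = haar_step(c1)
    c3, _ = haar_step(c2)
    return pstdev(c3) + weight * pstdev(d1)


# Hypothetical frames: pure silence and a simple sinusoid.
silence = [0.0] * 64
tone = [math.sin(0.3 * i) for i in range(64)]
print(endpoint_parameter(silence), endpoint_parameter(tone))
```

A detector built on this value would presumably compare it against a threshold frame by frame; that decision rule is not described in the abstract and is left out here.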

Optimal Learning Rates in Gradient Descent Training of Multilayer Perceptrons

오상훈
- 한국콘텐츠학회논문지 / Vol.4 No.3 / pp.99-105 / 2004
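This paper proposes optimal learning rates that speed up the training of multilayer perceptrons. The learning rates are expressed as a learning rate for the weights connected to a single neuron and a learning rate for setting virtual target values at the hidden layer. As a result, the optimal learning rate of the hidden-layer weights is the product of a component that assigns virtual hidden-layer target values and a component that seeks to minimize the hidden-layer error function. The effectiveness of the proposed method was confirmed through simulations of isolated-word recognition and handwritten digit recognition problems.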

A Novel Integration Scheme for Audio Visual Speech Recognition

Pham, Than Trung;Kim, Jin-Young;Na, Seung-You
- 한국음향학회지 / Vol.28 No.8 / pp.832-842 / 2009
Automatic speech recognition (ASR) has been successfully applied to many real human-computer interaction (HCI) applications; however, its performance tends to decrease significantly in noisy environments. Audio-visual speech recognition (AVSR), which uses an acoustic signal and lip motion, has recently attracted more attention because of its robustness to noise. In this paper, we describe our novel integration scheme for AVSR based on a late integration approach. First, we introduce a robust reliability measurement for the audio and visual modalities using model-based information and signal-based information. The model-based sources measure the confusability of the vocabulary, while the signal is used to estimate the noise level. Second, the output probabilities of the audio and visual speech recognizers are each normalized before the final integration step, which uses the normalized output space and the estimated weights. We evaluate the performance of our proposed method on a Korean isolated-word recognition system. The experimental results demonstrate the effectiveness and feasibility of our proposed system compared to conventional systems.
https://doi.org/10.7776/ASK.2009.28.8.832
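The late integration step described in this abstract (normalize each recognizer's outputs, then combine them with estimated weights) can be sketched as follows. The abstract does not specify the normalization or how the weights are estimated, so the softmax normalization, the single `audio_weight` parameter, and the function names `normalize` and `late_fusion` are assumptions made for illustration only.

```python
import math


def normalize(scores):
    """Softmax over per-word log scores (an assumed normalization)."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def late_fusion(audio_scores, visual_scores, audio_weight):
    """Weighted sum of normalized audio and visual outputs.

    `audio_weight` lies in [0, 1]; in the paper the weights are estimated
    from reliability measures, which this sketch does not model.
    Returns (index of the best word, fused scores).
    """
    pa = normalize(audio_scores)
    pv = normalize(visual_scores)
    fused = [audio_weight * a + (1.0 - audio_weight) * v
             for a, v in zip(pa, pv)]
    best = max(range(len(fused)), key=fused.__getitem__)
    return best, fused


# Hypothetical scores for a three-word vocabulary.
audio = [-1.0, -3.0, -2.0]   # audio recognizer favors word 0
visual = [-3.0, -1.0, -2.0]  # visual recognizer favors word 1
print(late_fusion(audio, visual, audio_weight=0.9)[0])  # clean audio: trust audio
print(late_fusion(audio, visual, audio_weight=0.1)[0])  # noisy audio: trust video
```

Lowering `audio_weight` as the estimated noise level rises shifts the decision toward the visual recognizer, which is the behavior a late integration scheme of this kind is meant to provide.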

Historical Consideration of Lignin Models for Native Lignin Structure

황병호
- 임산에너지 / Vol.23 No.1 / pp.45-68 / 2004
The word lignin is derived from the Latin word 'lignum', meaning wood. Lignin is a complex polymer consisting of coniferyl alcohol, sinapyl alcohol and p-coumaryl alcohol units, and it has an amorphous, three-dimensional network structure that is hard to hydrolyze with acid. Lignin is found in the cell walls of lignified plants. The mode of polymerization of these alcohols in the cell wall leads to a heterogeneous, branched and cross-linked polymer in which phenyl propane units are linked by carbon-carbon and carbon-oxygen bonds. The polymerization of the precursors p-coumaryl alcohol, coniferyl alcohol and sinapyl alcohol into lignin proceeds by enzymatic dehydrogenation. The reaction is initiated by an electron transfer that results in the formation of resonance-stabilized phenoxy radicals. The combination of these radicals produces a variety of dimers, trimers, oligomers and so on. Lignin research has been divided into basic and applied fields. The basic studies cover biosynthesis, chemical structure, distribution in the cell wall, and reactivity with reductants, oxidants and organic solvents. The applied research addresses the reactions of lignin in various pulp-making processes, including pulp bleaching, and their effect on pulp quality. Lignin will also be studied for the production of fine chemicals and polymer products and for conversion into an energy source like petroleum oil, because the amount of lignin produced in pulp-making processes is more than 51,000,000 tons per year worldwide. Both basic and applied research must emphasize the development of lignin utilization and of the pulping process. But this research cannot be completed without understanding the lignin structure and its functional groups. Therefore, this paper focuses on a chronological review of the lignin formulations that have been studied since 1948. The review is based on the monomer, dimer, trimer and tetramer phenyl propane unit structures that were isolated and identified by different methods from various woods.

HMM-based Speech Recognition using FSVQ, Fuzzy Concept and Doubly Spectral Feature

정의봉
- 한국컴퓨터산업학회논문지 / Vol.5 No.4 / pp.491-502 / 2004
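This paper studies speaker-independent isolated-word recognition and proposes a hidden Markov model (HMM) that uses FSVQ (first section vector quantization), fuzzy theory, and doubly spectral features. In the proposed method, the LPC cepstrum and the regression coefficients of the LPC cepstrum are used as the dual feature parameters. The training data are divided into several sections; after a codebook is built for the first section, multiple observation sequences are obtained from that codebook in descending order of probability by introducing the fuzzy concept. The observation sequences of the first section are then trained, and the word whose probability value is obtained in the same way is recognized. In addition to recognition experiments with the proposed method, recognition experiments with other methods were carried out under the same conditions and on the same data for comparison. The experimental results show that the proposed method achieves a higher recognition rate than the other methods.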

Development of Speech Recognition System based on User Context Information in Smart Home Environment

김종훈;심재호;송창우;이정현
- 한국콘텐츠학회논문지 / Vol.8 No.1 / pp.328-338 / 2008
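Most of the large-vocabulary speech recognition systems that currently show high recognition performance are isolated-word speech recognition systems. To widen the recognition range of such a system, the number of words to be searched must be increased. However, as the number of words to be searched grows, the speed and recognition performance of the system degrade. To address this problem, this paper defines the context information that affects speech recognition performance in a smart home environment and proposes a user location estimation method that uses inertial sensors and RFID (Radio Frequency Identification). In addition, word-model domains based on the context information are built for the speech recognition system, yielding a speech recognition system that outperforms existing systems. We confirmed that the proposed speech recognition system operates in a smart home environment without degradation of the recognition rate.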
https://doi.org/10.5392/JKCA.2008.8.1.328

Search results: 156 items; processing time: 0.026 seconds
