Integrated Search | Korea Science

Parts-based Feature Extraction of Speech Spectrum Using Non-Negative Matrix Factorization

박정원;김창근;허강인
- Institute of Electronics Engineers of Korea (IEEK): Conference Proceedings
- Proceedings of the IEEK 2003 Signal Processing Society Fall Conference
- pp.49-52
- 2003
This paper proposes a new speech feature parameter based on non-negative matrix factorization (NMF). Under a non-negativity constraint, NMF factorizes multi-dimensional data into a lower-dimensional representation, and the reduced data expose parts-based features of the input. We assess the usefulness of NMF for speech feature extraction by applying it to the outputs of a Mel-scaled filter bank. In the recognition experiments, the proposed feature parameter achieved better recognition performance than the commonly used MFCC (mel-frequency cepstral coefficients).
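
The factorization described in this abstract can be illustrated with a minimal sketch. It uses the standard multiplicative-update rules for NMF under a Frobenius-norm objective, which is an assumption: the abstract does not say which NMF variant or which rank the authors used. The input matrix `V` is a synthetic, hypothetical stand-in for Mel filter bank energies (20 bands by 100 frames).

```python
import numpy as np

def nmf(V, rank, n_iter=200, eps=1e-9, seed=0):
    """Factor a non-negative matrix V (bands x frames) as V ~ W @ H."""
    rng = np.random.default_rng(seed)
    n_bands, n_frames = V.shape
    W = rng.random((n_bands, rank)) + eps   # basis vectors (spectral "parts")
    H = rng.random((rank, n_frames)) + eps  # per-frame activations
    for _ in range(n_iter):
        # Multiplicative updates only rescale entries, so W and H stay non-negative.
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H

# Hypothetical data: a non-negative matrix of rank 5 standing in for
# Mel-scaled filter bank outputs.
rng = np.random.default_rng(1)
V = rng.random((20, 5)) @ rng.random((5, 100))

W, H = nmf(V, rank=5)
rel_err = np.linalg.norm(V - W @ H) / np.linalg.norm(V)
print(f"relative reconstruction error: {rel_err:.4f}")
```

In this sketch the columns of `H` would play the role of the reduced feature vectors, one per frame; how the paper derives its final feature parameter from the factorization is not stated in the abstract.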

New Parameter on Speech and EGG: Glottal Closure Delay Ratio

최종민;권택균;정은정;이명철;김광현;성명훈;박광석
- Journal of the Korean Society of Laryngology, Phoniatrics and Logopedics
- Vol. 18, No. 1
- pp.22-25
- 2007
Background and Objectives: Biomedical signals such as speech, electroglottograph (EGG), and airflow have commonly been used to diagnose laryngeal function, but in most cases these signals are analyzed separately. Here, we propose a new inter-channel parameter, the Glottal Closure Delay Ratio (GCDR), which is estimated from simultaneously measured speech and EGG. Materials and Method: Speech and EGG signals were recorded simultaneously from 13 normal subjects and 39 patients. The patient data comprised 16 cases of polyps and 23 cases of vocal fold palsy. For each pitch period, we calculated the time difference between the glottal closing instant on the EGG and the first maximum peak of the speech signal. The glottal closing instant was defined as the maximum peak of the first derivative of the EGG signal (dEGG). Results: The standard deviation and jitter were calculated from 20-30 GCDRs extracted from each recording, and they differed significantly between the normal and vocal fold paralysis groups. Conclusion: The GCDR may be the first index to reflect both speech and EGG characteristics, and the perturbation of this parameter differed significantly between the normal and vocal fold paralysis groups.

On a Detection of the ZCR-Parameter for Higher Formants of Speech Signals

유건수
- Acoustical Society of Korea: Conference Proceedings
- Proceedings of the Acoustical Society of Korea 1992 Conference, Vol. 11, No. 1
- pp.49-53
- 1992
In many applications such as speech analysis, speech coding, and speech recognition, the voiced-unvoiced decision must be made correctly for efficient processing. One of the parameters used for the voiced-unvoiced decision is the zero-crossing rate. However, the zero-crossing rate has not represented the information carried by the higher formants of speech signals.
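
As a point of reference for the parameter discussed in this abstract, the following is a minimal sketch of the conventional zero-crossing rate of one frame. It is not the higher-formant ZCR parameter the paper proposes, which the abstract does not define. The sampling rate and the two test tones are assumptions chosen for illustration.

```python
import numpy as np

def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    signs = np.sign(np.asarray(frame, dtype=float))
    signs[signs == 0] = 1  # treat exact zeros as positive
    return float(np.mean(signs[1:] != signs[:-1]))

fs = 8000                      # assumed sampling rate in Hz
t = np.arange(400) / fs        # one 50 ms frame
low = np.sin(2 * np.pi * 200 * t + 0.1)    # low-frequency, voiced-like tone
high = np.sin(2 * np.pi * 2000 * t + 0.1)  # high-frequency, unvoiced-like tone

zcr_low = zero_crossing_rate(low)
zcr_high = zero_crossing_rate(high)
print(zcr_low, zcr_high)
```

A frame dominated by high-frequency energy crosses zero more often, which is why a simple threshold on this value is often used as one cue in a voiced-unvoiced decision.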

On Detecting the Transition Segments of Speech Signals by the Energy Approximation Degree of the Synchronized Pitch

김종득;박형빈;김대호;배명진
- Institute of Electronics Engineers of Korea (IEEK): Conference Proceedings
- Proceedings of the IEEK 1998 Summer Conference
- pp.603-606
- 1998
In large-vocabulary and continuous speech recognition systems that use the phoneme as the recognition unit, segmentation is a necessary processing step. This paper proposes a new method based on a normalized AMDF. The suggested parameter represents the degree of sharpness at the valley point. The method can detect the boundaries between steady-state and transient regions in continuous speech without prior information about the speech signal.

On Improving the Effects of Varying the Window Length on Speech Energy Computation

배명진;안수길
- Journal of the Acoustical Society of Korea
- Vol. 9, No. 2
- pp.34-41
- 1990
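In the preprocessing of speech signals, the energy parameter is widely used because it reflects the changing characteristics of phonemes. However, because a window function is applied during extraction, the result is affected by the window length. This paper measures the effect of the window length and proposes a new energy extraction method that minimizes it. Because the effect of the window length is removed, the energy contour extracted by this method represents the changing characteristics of phonemes well. In addition, the computation requires only one subtraction, one addition, and two comparisons per sample.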
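
The per-sample cost quoted in this abstract resembles that of a recursive sliding-window update. The sketch below shows the conventional recursive short-time energy over a rectangular window, where each new value needs one addition and one subtraction. This is an assumption made for illustration, not the paper's method: the abstract does not give the proposed formula or say what the two comparisons are for. The test signal and `win_len` are hypothetical.

```python
import numpy as np

def sliding_energy(x, win_len):
    """Short-time energy over a rectangular window, updated recursively."""
    sq = np.asarray(x, dtype=float) ** 2
    energy = np.empty(len(sq) - win_len + 1)
    energy[0] = sq[:win_len].sum()
    for n in range(1, len(energy)):
        # Add the sample entering the window, subtract the one leaving it.
        energy[n] = energy[n - 1] + sq[n + win_len - 1] - sq[n - 1]
    return energy

# Hypothetical test signal: a sinusoid with a period of 20 samples.
x = np.sin(2 * np.pi * np.arange(200) / 20)
e = sliding_energy(x, win_len=40)

# Reference: direct summation of the squared samples in each window.
direct = np.convolve(x ** 2, np.ones(40), mode="valid")
print(np.allclose(e, direct))
```

Changing `win_len` changes both the scale and the smoothness of the contour, which is the window-length dependence the abstract sets out to reduce.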

Implementation of Real-time Wheel Order Recognition System Based on the Predictive Parameters for Speaker's Intention

Moon, Serng-Bae;Jun, Seung-Hwan
- Journal of the Korean Institute of Navigation and Port Research
- Vol. 35, No. 7
- pp.551-556
- 2011
This paper proposes an enhanced post-processing step that predicts the speaker's intention, in order to implement a real-time control module for a ship's autopilot using a speech recognition algorithm. A parameter was developed to predict the likeliest wheel order from the previous order, and it is expected to raise the recognition rate above that of the pre-recognition process, which depends on general-purpose speech recognition algorithms. The parameter values were assessed by five certified deck officers experienced in conning vessels. The entire wheel order recognition process was programmed on a TMS320C5416 DSP so that the system could recognize the speaker's orders and control the autopilot in real time. We conducted experiments to verify the usefulness of the proposed module. The results confirm that the post-recognition module provides recognition accuracy good enough to operate an autopilot through a speech recognition system.
https://doi.org/10.5394/KINPR.2011.35.7.551

On Detecting the Steady State Segments of Speech Waveform by Using the Normalized AMDF

배명진;김을제;안수길
- Journal of the Acoustical Society of Korea
- Vol. 10, No. 3
- pp.44-50
- 1991
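Continuous speech recognition requires a segmentation process that determines the phonetic boundaries of the speech signal. This paper proposes the normalized AMDF within a frame as a parameter for determining the transition segments of a speech signal. The proposed normalized AMDF represents the rate of change of the speech amplitude in that frame, and comparing it with the normalized AMDF of neighboring frames makes it possible to tell whether the current frame lies in a steady-state region or a transition region.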
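
For readers unfamiliar with the AMDF (average magnitude difference function), here is a minimal sketch of one frame's AMDF, normalized by its own maximum so that the overall amplitude scale cancels. The normalization is an assumption: the abstract does not give the paper's definition of the normalized AMDF. The frame, `period`, and `max_lag` are hypothetical.

```python
import numpy as np

def normalized_amdf(frame, max_lag):
    """AMDF for lags 1..max_lag, divided by its maximum value."""
    frame = np.asarray(frame, dtype=float)
    amdf = np.array([
        np.mean(np.abs(frame[lag:] - frame[:-lag]))
        for lag in range(1, max_lag + 1)
    ])
    peak = amdf.max()
    return amdf / peak if peak > 0 else amdf

# Hypothetical steady-state frame: a sinusoid with a 40-sample period.
period = 40
frame = np.sin(2 * np.pi * np.arange(200) / period)

d = normalized_amdf(frame, max_lag=60)
pitch_lag = int(np.argmin(d)) + 1  # lag of the deepest valley
print(pitch_lag)
```

In a steady-state frame the curve has a deep valley at the pitch period and changes little from one frame to the next, so a distance between the curves of adjacent frames is one plausible way to flag transitions.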

Implementation of Speech Recognition Security System Using Speaker Dependent Algorithm

김영현;문철홍
- Institute of Electronics Engineers of Korea (IEEK): Conference Proceedings
- Proceedings of the IEEK 2003 Signal Processing Society Fall Conference
- pp.65-68
- 2003
In this paper, a speech recognition system using a speaker-dependent algorithm is implemented on a PC, and the results are shown on an LDM display system that employs an Intel StrongARM SA-1110. The work aims to correct a shortcoming of earlier speech recognition systems, which were sometimes triggered by speech that was similar to, but not the same as, the reference. To address this defect, each utterance is input and processed twice. When references are created, variable start and end points are used to make the references efficient. The stored references and the new input are converted into feature parameters, LPC and MFCC, and DTW is executed on these feature parameters. The security system grants the user permission only when the DTW executions yield the same result.
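
The DTW (dynamic time warping) step mentioned in this abstract can be sketched as follows. This is the textbook dynamic-programming formulation with a Euclidean local distance and no path constraints, which is an assumption about details the abstract leaves out. The one-dimensional sequences below are hypothetical stand-ins for frame-wise LPC or MFCC vectors.

```python
import numpy as np

def dtw_distance(ref, test):
    """Accumulated DTW cost between two sequences of feature vectors."""
    ref = np.asarray(ref, dtype=float)
    test = np.asarray(test, dtype=float)
    n, m = len(ref), len(test)
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = np.linalg.norm(ref[i - 1] - test[j - 1])
            # Extend the cheapest of the three allowed predecessor paths.
            cost[i, j] = local + min(cost[i - 1, j],
                                     cost[i, j - 1],
                                     cost[i - 1, j - 1])
    return float(cost[n, m])

# Hypothetical feature sequences, one feature per frame.
reference = [[0], [1], [2], [1], [0]]
same_word_slower = [[0], [0], [1], [1], [2], [2], [1], [1], [0], [0]]
different_word = [[2], [2], [2], [2], [2]]

d_same = dtw_distance(reference, same_word_slower)
d_diff = dtw_distance(reference, different_word)
print(d_same, d_diff)
```

Because the warping absorbs differences in speaking rate, the slower repetition matches the reference at zero cost while the different pattern does not; a verification system would compare such a cost against a threshold.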

A Korean Speech Recognition Using Fuzzy Rule Base

송정영
- Journal of Engineering Research (공학논문집)
- Vol. 2, No. 1
- pp.13-21
- 1997
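This study recognizes continuous speech by treating the variability of the feature parameters as fuzzy variables, expressing them with membership functions, and then applying fuzzy inference. The feature parameters used are formant frequency, pitch, log energy, and zero-crossing rate. Using continuous Korean speech as data, we implement a speech recognition system and confirm the effectiveness of the approach through recognition experiments.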
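
To make the idea of membership functions and fuzzy inference concrete, here is a minimal sketch with a triangular membership function and a single rule combined with the `min` operator. The shape of the membership function, the rule, and every numeric range are hypothetical; the abstract does not describe the paper's actual rule base.

```python
def triangular(x, left, peak, right):
    """Triangular membership: 0 outside [left, right], 1 at peak."""
    if x <= left or x >= right:
        return 0.0
    if x <= peak:
        return (x - left) / (peak - left)
    return (right - x) / (right - peak)

# Hypothetical measurements for one frame.
first_formant_hz = 350.0
zcr = 0.08

# Hypothetical fuzzy sets "low first formant" and "low ZCR".
f1_is_low = triangular(first_formant_hz, 200.0, 300.0, 500.0)
zcr_is_low = triangular(zcr, 0.0, 0.05, 0.20)

# Hypothetical rule: IF F1 is low AND ZCR is low THEN the frame is vowel-like.
# The fuzzy AND is taken as the minimum of the two membership degrees.
rule_strength = min(f1_is_low, zcr_is_low)
print(rule_strength)
```

A full system would evaluate many such rules per frame and pick the phoneme class whose rules fire most strongly.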

A Study on the Segmentation of Speech Signal into Phonemic Units

이의천;이강성;김순협
- Journal of the Acoustical Society of Korea
- Vol. 10, No. 4
- pp.5-11
- 1991
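This study proposes a method for segmenting speech signals into phonemic units. The proposed segmentation system is speaker-independent and can segment speech into phonemic units without prior information about the signal. A two-stage segmentation algorithm is applied: the input speech signal is first divided into purely voiced segments and segments that are not purely voiced, and each segment is then subdivided into phonemic units. The parameters used are a voiced-sound detection parameter, a time-variation parameter of the zeroth-order LPC cepstrum coefficient, and a ZCR parameter. To demonstrate the usefulness of the proposed algorithm, it was tested on a vocabulary consisting of isolated words and continuous speech; the segmentation rate for the 507 phonemes contained in the vocabulary was 91.7%.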

Search results: 373 items; processing time: 0.021 s
