Search | Korea Science

Compensation Method for Improvement of Speech Recognition in Wireless Communication Network (무선 통신망에서 음성인식률 개선을 위한 보상기법 연구)

Seo Jin-Ho;Park Ho-Chong
- Proceedings of the Acoustical Society of Korea Conference / autumn / pp.65-68 / 2004
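Advances in mobile communication technology have led to an explosive increase in mobile phone use, and many services are now offered over mobile networks. In a speech recognition service on a mobile network, the speech signal fed to the recognizer passes through a speech coder in the network, which distorts the signal and degrades recognition performance. This paper proposes a compensation method to improve speech recognizer performance in wireless communication environments. Previously proposed methods depend on the speech data, whereas this paper proposes a data-independent approach that raises the recognition rate by compensating the spectrum of the input signal damaged by the speech coder and by correcting the Cepstrum. Specifically, the spectrum distorted by the speech coder is compensated in stages, and the Cepstrum computed from the distorted signal is then corrected on that basis. As a result, the recognition rate rose from $64.88\%$ for the distorted speech signal to $79.73\%$ with the proposed compensation method, an improvement of $14.85$ percentage points.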

Estimation of speech feature vectors and enhancement of speech recognition performance using lip information (입술정보를 이용한 음성 특징 파라미터 추정 및 음성인식 성능향상)

Min So-Hee;Kim Jin-Young;Choi Seung-Ho
- MALSORI / no.44 / pp.83-92 / 2002
Speech recognition performance is severely degraded in noisy environments. One approach to this problem is audio-visual speech recognition. In this paper, we discuss experimental results for bimodal speech recognition based on speech feature vectors enhanced using lip information. We try various speech features, such as linear prediction coefficients, cepstrum, and log area ratio, for transforming lip information into speech parameters. The experimental results show that the cepstrum parameter is the best feature in terms of recognition rate. We also present desirable weighting values for the audio and visual information as a function of signal-to-noise ratio.
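The abstract does not say how the audio and visual streams are combined, so the following is only a minimal sketch of one common choice, a weighted sum of per-class scores. The function names, the score dictionaries, and the weight values are hypothetical and are not taken from the paper.

```python
def fuse_scores(audio_scores, visual_scores, audio_weight):
    """Weighted sum of per-class scores; audio_weight must lie in [0, 1]."""
    if not 0.0 <= audio_weight <= 1.0:
        raise ValueError("audio_weight must be in [0, 1]")
    return {
        label: audio_weight * audio_scores[label]
        + (1.0 - audio_weight) * visual_scores[label]
        for label in audio_scores
    }


def recognize(audio_scores, visual_scores, audio_weight):
    """Return the class label with the highest fused score."""
    fused = fuse_scores(audio_scores, visual_scores, audio_weight)
    return max(fused, key=fused.get)


# Hypothetical log-likelihoods for two candidate words.
audio = {"yes": -12.0, "no": -10.0}
visual = {"yes": -4.0, "no": -9.0}

clean_choice = recognize(audio, visual, audio_weight=0.9)  # trust audio more
noisy_choice = recognize(audio, visual, audio_weight=0.2)  # trust lips more
```

In a scheme like this, the audio weight would typically be lowered as the signal-to-noise ratio drops, which is the kind of dependence the abstract reports.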

Comparison & Analysis of Speech/Music Discrimination Features through Experiments (실험에 의한 음성·음악 분류 특징의 비교 분석)

Lee, Kyung-Rok;Ryu, Shi-Woo;Gwark, Jae-Young
- Proceedings of the Korea Contents Association Conference / 2004.11a / pp.308-313 / 2004
In this paper, we compare and analyze speech/music discrimination performance for combinations of feature parameters. Audio signals are classified into three classes (speech, music, and speech with music). Three types of features are used in the experiments: Mel-cepstrum, energy, and zero-crossings. We then compare the feature combinations to find the one that gives the best speech/music discrimination performance. The best result is achieved using Mel-cepstrum, energy, and zero-crossings in a single feature vector (speech: 95.1%, music: 61.9%, speech & music: 55.5%).
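Two of the three features named above are simple to compute per frame. The sketch below shows one conventional definition of each; the frame length, the normalization, and the function names are assumptions, since the abstract does not give the exact definitions used in the experiments.

```python
def short_time_energy(frame):
    """Mean squared amplitude of one frame of samples."""
    return sum(x * x for x in frame) / len(frame)


def zero_crossings(frame):
    """Count sign changes between consecutive samples in one frame."""
    return sum(
        1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0)
    )


# Hypothetical 8-sample frame that alternates in sign.
frame = [1.0, -1.0, 1.0, -1.0, 1.0, -1.0, 1.0, -1.0]
features = (short_time_energy(frame), zero_crossings(frame))
```

A Mel-cepstrum vector for the same frame would be appended to these two values to form the single feature vector the abstract describes.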

Implementation and Performance Analysis of a Speaker Verification System (화자 확인 시스템의 설계 제작 및 성능 분석)

권석규;이병기
- Journal of the Korean Institute of Telematics and Electronics B / v.30B no.3 / pp.1-9 / 1993
This paper discusses issues in the design and implementation of a real-time automatic speaker verification system, as well as the performance analysis of the implemented system. The system employs TI's TMS320C25 digital signal processor and high-speed SRAMs. It is designed to be used stand-alone as well as via handshaking with an IBM PC. The speech parameters used for speaker verification are PARCOR and LPC-cepstrum coefficients, and the decision logic is based on the generalized weighted distance concept. The implemented system showed an error rate of 5.3% with the PARCOR coefficients and 4.7% with the LPC-cepstrum coefficients.
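LPC-cepstrum coefficients are usually obtained from the LPC coefficients by a short recursion rather than by an FFT. The sketch below shows that standard recursion, assuming the all-pole model H(z) = 1 / (1 - sum_k a_k z^-k) and omitting the gain term; the sign convention and the function name are assumptions, and the abstract does not describe how the system itself computes them.

```python
def lpc_to_cepstrum(a, n_ceps):
    """Convert LPC coefficients to cepstrum coefficients c_1..c_n_ceps.

    a[k-1] holds a_k for the assumed model H(z) = 1 / (1 - sum_k a_k z^-k).
    """
    p = len(a)
    c = [0.0] * (n_ceps + 1)  # c[0] is unused (gain term omitted)
    for n in range(1, n_ceps + 1):
        # Direct term a_n exists only while n is within the LPC order.
        acc = a[n - 1] if n <= p else 0.0
        # Recursive term: sum over k of (k/n) * c_k * a_(n-k).
        for k in range(max(1, n - p), n):
            acc += (k / n) * c[k] * a[n - k - 1]
        c[n] = acc
    return c[1:]


# Single-pole check: for a_1 = 0.5 the cepstrum is 0.5**n / n.
single_pole = lpc_to_cepstrum([0.5], 4)
```

A speaker verification decision would then compare such a vector with a stored reference using some weighted distance, the details of which are not given in the abstract.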

Faults Detection in Hub Bearing with Minimum Variance Cepstrum (최소 분산 켑스트럼을 이용한 자동차 허브 베어링 결함 검출)

박춘수;최영철;김양한;고을석
- Proceedings of the Korean Society for Noise and Vibration Engineering Conference / 2004.05a / pp.593-596 / 2004
Hub bearings not only support the body of a car but also allow the wheels to rotate freely. Excessive radial or axial load, among many other causes, can create defects in each component and make them grow. We therefore want to detect vibration and noise from defects in the outer race, inner race, or ball elements of a hub bearing as early as possible. How early the faults can be detected depends on how well the detection algorithm extracts the fault information from the measured signal. Fortunately, the bearing signal contains a periodic impulse train. This information allows us to find the faults regardless of how much noise contaminates the signal. This paper presents the basic signal processing idea and experimental results that demonstrate how well the method performs.
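To illustrate why a periodic impulse train is convenient for detection, the sketch below computes an ordinary real cepstrum of a toy signal and looks for its largest peak. This is not the minimum variance cepstrum named in the title; the signal, its period, the frame length, and the function names are all assumptions made for illustration.

```python
import cmath
import math


def dft(x):
    """Naive O(N^2) discrete Fourier transform, adequate for short frames."""
    n = len(x)
    return [
        sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def real_cepstrum(x):
    """Inverse DFT of the log magnitude spectrum of x."""
    n = len(x)
    log_mag = [math.log(abs(v)) for v in dft(x)]
    # log_mag is real, so the real part of its forward DFT divided by n
    # equals the real part of its inverse DFT.
    return [v.real / n for v in dft(log_mag)]


# Toy signal (not measured data): decaying impulses every 8 samples.
N, PERIOD = 64, 8
signal = [0.6 ** (t // PERIOD) if t % PERIOD == 0 else 0.0 for t in range(N)]

ceps = real_cepstrum(signal)
peak_quefrency = max(range(1, N // 2 + 1), key=lambda q: ceps[q])
```

For this toy signal the largest cepstral peak falls at a quefrency of 8 samples, which is the impulse period; a real bearing signal would add noise and structural resonances on top of this.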

Korean Vowel Recognition using Peripheral Auditory Model (말초 청각 계통 모델을 이용한 한국어 모음 인식)

Yun, Tae-Seong;Baek, Seung-Hwa;Park, Sang-Hui
- Journal of Biomedical Engineering Research / v.9 no.1 / pp.1-10 / 1988
In this study, recognition experiments for Korean vowels are performed using a peripheral auditory model. For objective comparison, recognition experiments are also performed with LPC cepstrum coefficients extracted from the same speech data. The results are as follows. 1) The time and frequency responses of the auditory model show that important features of the input signal are contained in the responses of the inner ear and auditory nerve. 2) The recognition rate obtained with the auditory model output is higher than that obtained with LPC cepstrum coefficients. 3) The adaptation phenomenon of the auditory nerve provides useful characteristics for discriminating vowel signals.

Source localization of impact noise on an indoor unit of air-conditioner (에어컨 실내기에서 발생하는 충격 소음원의 위치 추정)

최영철;김양한;이종구;김구영
- Proceedings of the Korean Society for Noise and Vibration Engineering Conference / 2003.11a / pp.324-329 / 2003
An air conditioner has various noise sources, such as cooling fan noise, pumping noise, flow noise, and impact noise. Among these, impact noise is the most unpleasant, because it is produced in the indoor unit of the air conditioner. To control the noise effectively, we must first identify its sources. Identifying an impact noise source requires the measurements to be carried out simultaneously, so we use a beamforming method, which needs fewer measurement points than the intensity method or acoustic holography. The objective of this paper is to estimate the location of the impact source. This can be achieved using the minimum variance cepstrum, which is able to detect impulses embedded in noise. In this study, a modified beamforming method based on the cepstrum domain is proposed. The method is then applied to air conditioner noise sources that produce impact noise.

Classification of Pathological Voice Signal with Severe Noise Component

Li, Ta-O;Jo, Cheol-Woo
- Speech Sciences / v.10 no.4 / pp.107-115 / 2003
In this paper, we try to classify pathological voice signals with a severe noise component based on two parameters: the spectral slope and the ratio of the energies in the harmonic and noise components (HNR). The spectral slope is obtained by a curve fitting method, and the HNR is computed in the cepstrum quefrency domain. Speech data from normal speakers and patients are collected, diagnosed, and divided into three classes (normal, relatively less noisy, and severely noisy). The means and standard deviations of the spectral slope and the HNR are computed and compared across the three kinds of data to characterize the severely noisy pathological voice signals and distinguish them from the others.
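The abstract does not specify which curve is fitted to obtain the spectral slope. As a minimal sketch, assuming a straight line fitted by least squares to (frequency, level) points, the slope could be computed as follows; the function name and the sample data are hypothetical.

```python
def fit_slope(xs, ys):
    """Least-squares slope and intercept of ys against xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Hypothetical spectrum samples: frequency in kHz, level in dB.
freqs_khz = [0.0, 1.0, 2.0, 3.0]
levels_db = [10.0, 8.0, 6.0, 4.0]
slope_db_per_khz, intercept_db = fit_slope(freqs_khz, levels_db)
```

The fitted slope would then serve as one of the two classification parameters, alongside the HNR.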

Voice conversion using low dimensional vector mapping (낮은 차원의 벡터 변환을 통한 음성 변환)

Lee, Kee-Seung;Doh, Won;Youn, Dae-Hee
- Journal of the Korean Institute of Telematics and Electronics S / v.35S no.4 / pp.118-127 / 1998
In this paper, we propose a voice personality transformation method that makes one person's voice sound like another person's voice. To transform the voice personality, the vocal tract transfer function is used as the transformation parameter. Compared with previous methods, the proposed method obtains high-quality transformed speech with low computational complexity. Conversion between vocal tract transfer functions is implemented by a linear mapping based on soft clustering. In this process, the mean LPC cepstrum coefficients and the mean-removed LPC cepstrum, modeled by a low-dimensional vector, are used as transformation parameters. To evaluate the performance of the proposed method, mapping rules are generated from 61 Korean words uttered by two male speakers and one female speaker. These rules are then applied to 9 sentences uttered by the same speakers, and objective evaluation and subjective listening tests are performed on the transformed speech.

EMG Pattern Recognition based on Evidence Accumulation for Prosthesis Control

Lee, Seok-Pil;Park, Sang-Hui
- Journal of Electrical Engineering and Information Science / v.2 no.6 / pp.20-27 / 1997
We present a method of electromyographic (EMG) pattern recognition that identifies motion commands for the control of a prosthetic arm by evidence accumulation with multiple parameters. Integral absolute value, variance, autoregressive (AR) model coefficients, linear cepstrum coefficients, and an adaptive cepstrum vector are extracted as feature parameters from several time segments of the EMG signals. Pattern recognition is carried out through the evidence accumulation procedure using the distances measured from reference parameters. A fuzzy mapping function is designed to transform the distances for use in the evidence accumulation method. Results are presented to support the feasibility of the suggested approach for EMG pattern recognition.
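The two simplest features in the list above can be sketched per time segment as follows. This assumes non-overlapping segments, integral absolute value defined as the sum of absolute sample values, and population variance; the abstract gives none of these details, and the function names and sample signal are hypothetical.

```python
from statistics import pvariance


def integral_absolute_value(segment):
    """Sum of absolute sample values over one segment."""
    return sum(abs(x) for x in segment)


def segment_features(signal, segment_length):
    """Return (IAV, variance) for each non-overlapping full segment."""
    features = []
    for start in range(0, len(signal) - segment_length + 1, segment_length):
        segment = signal[start:start + segment_length]
        features.append(
            (integral_absolute_value(segment), pvariance(segment))
        )
    return features


# Hypothetical EMG samples: a weak burst followed by a stronger one.
emg = [1, -1, 1, -1, 2, -2, 2, -2]
per_segment = segment_features(emg, segment_length=4)
```

Each segment's feature tuple would then be compared with reference parameters to produce the distances that feed the evidence accumulation step.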

Search Result 274, Processing Time 0.028 seconds
