• Title/Abstract/Keywords: Speech Signal

Adaptive Noise Suppression System Based on Human Auditory Model

  • 최재승
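    • Proceedings of the Korea Institute of Information and Communication Engineering Conference / Korea Institute of Maritime Information and Communication Sciences 2008 Spring Conference A / pp. 421-424 / 2008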
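  • This paper proposes a noise suppression system, based on an auditory model and adapted to the noise environment, for enhancing speech degraded by various background noises. The proposed system first detects the voiced and unvoiced sections, then applies adaptive auditory-mechanism processing to each input frame. Finally, it removes the noise signal using a neural network that incorporates amplitude and phase components, and then enhances the speech. Experiments using a signal-to-noise ratio evaluation confirm that the system is effective for speech signals degraded by various noises.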
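The abstract evaluates the system by signal-to-noise ratio but does not state the exact formula. A minimal sketch, assuming the common waveform-domain definition (clean-signal power over the power of the residual error); the function name `snr_db` and the sample values are illustrative, not taken from the paper:

```python
import math

def snr_db(clean, processed):
    """SNR in dB between a clean reference and a processed signal.

    Assumed definition: 10 * log10(sum(clean^2) / sum((clean - processed)^2)).
    """
    signal_power = sum(s * s for s in clean)
    noise_power = sum((s - p) ** 2 for s, p in zip(clean, processed))
    if noise_power == 0.0:
        return math.inf          # processed signal matches the reference exactly
    if signal_power == 0.0:
        return -math.inf         # silent reference: ratio is undefined, report -inf
    return 10.0 * math.log10(signal_power / noise_power)

# Hypothetical four-sample signals: the residual error is 10% of the amplitude.
clean = [1.0, -1.0, 1.0, -1.0]
enhanced = [0.9, -0.9, 0.9, -0.9]
improvement = snr_db(clean, enhanced)   # about 20 dB
```

Comparing `snr_db(clean, noisy)` with `snr_db(clean, enhanced)` gives the SNR gain of an enhancement stage under this definition.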

Split Model Speech Analysis Techniques for Wideband Speech Signal

  • Park YoungHo;Ham MyungKyu;You KwangBock;Bae MyungJin
    • Proceedings of the Acoustical Society of Korea Conference / Acoustical Society of Korea 1999 Conference Proceedings, Vol. 18, No. 1 / pp. 20-23 / 1999
  • This paper develops the Split Model Analysis Algorithm, which generates a wideband speech signal from the spectral information of a narrowband signal. The algorithm separates the 10th-order LPC model into five cascade-connected 2nd-order models. Using the simpler 2nd-order models avoids the complicated nonlinear relationships between the model parameters and all the poles of the LPC model. The relationship between the model parameters and the corresponding analog poles is proved and applied to each 2nd-order model. The wideband speech signal is obtained by changing only the sampling rate.
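The idea of a 10th-order all-pole model as a cascade of five 2nd-order sections can be illustrated as follows. This is a sketch under stated assumptions, not the paper's algorithm: it uses the standard mapping z = exp(s / fs) from an analog pole pair s = -πB ± j2πF to a digital resonator, and the resonance values in `RESONANCES` are hypothetical.

```python
import math

def section_from_analog_pole(freq_hz, bandwidth_hz, fs_hz):
    """Denominator [1, a1, a2] of a 2nd-order all-pole section.

    Maps the analog pole pair s = -pi*B +/- j*2*pi*F to z = exp(s / fs).
    """
    r = math.exp(-math.pi * bandwidth_hz / fs_hz)   # pole radius
    theta = 2.0 * math.pi * freq_hz / fs_hz         # pole angle in radians
    return [1.0, -2.0 * r * math.cos(theta), r * r]

def poly_mul(p, q):
    """Multiply two polynomials in z^-1 given as coefficient lists."""
    out = [0.0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def cascade(sections):
    """Multiply 2nd-order sections into one higher-order polynomial A(z)."""
    poly = [1.0]
    for sec in sections:
        poly = poly_mul(poly, sec)
    return poly

# Hypothetical (frequency, bandwidth) pairs in Hz for a narrowband model.
RESONANCES = [(500, 60), (1500, 90), (2500, 120), (3300, 150), (3800, 200)]
FS_HZ = 8000

sections = [section_from_analog_pole(f, b, FS_HZ) for f, b in RESONANCES]
lpc_poly = cascade(sections)   # 11 coefficients of a 10th-order A(z)
```

Each section depends on only one pole pair, so its two coefficients have a closed-form link to one analog frequency and bandwidth, whereas the coefficients of `lpc_poly` mix all ten poles.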

Split Model Speech Analysis Techniques for Speech Signal Enhancement

  • Park, Young-Ho;You, Kwang-Bock;Bae, Myung-Jin
    • Proceedings of the Institute of Electronics Engineers of Korea Conference / Institute of Electronics Engineers of Korea 1999 Fall Conference Proceedings / pp. 1135-1138 / 1999
  • This paper develops the Split Model Analysis Algorithm, which generates a wideband speech signal from the spectral information of a narrowband signal. The algorithm separates the 10th-order LPC model into five cascade-connected 2nd-order models. Using the simpler 2nd-order models avoids the complicated nonlinear relationships between the model parameters and all the poles of the LPC model. The relationship between the model parameters and the corresponding analog poles is proved and applied to each 2nd-order model. The wideband speech signal is obtained by changing only the sampling rate.

Noise Reduction Using the Standard Deviation of the Time-Frequency Bin and Modified Gain Function for Speech Enhancement in Stationary and Nonstationary Noisy Environments

  • Lee, Soo-Jeong;Kim, Soon-Hyob
    • The Journal of the Acoustical Society of Korea / Vol. 26, No. 3E / pp. 87-96 / 2007
  • In this paper we propose a new noise reduction algorithm for stationary and nonstationary noisy environments. Our algorithm classifies the speech and noise signal contributions in time-frequency bins, and is not based on a spectral algorithm or a minimum statistics approach. It relies on calculating the ratio of the standard deviation of the noisy power spectrum in time-frequency bins to its normalized time-frequency average. We show that good quality can be achieved for the enhanced speech signal by choosing appropriate values for $\delta_t$ and $\delta_f$. The proposed method greatly reduces the noise while providing enhanced speech with lower residual noise and somewhat higher mean opinion score (MOS), background intrusiveness (BAK), and signal distortion (SIG) scores than conventional methods.

Speech Feature Extraction for Isolated Word in Frequency Domain

  • 조영훈;박은명;강홍석;박원배
    • Proceedings of the Institute of Electronics Engineers of Korea Conference / Institute of Electronics Engineers of Korea 2000 Summer Conference Proceedings (4) / pp. 81-84 / 2000
  • This paper proposes a new technique for extracting the features of the speech signal of an isolated word by analysis in the frequency domain. The technique can be applied efficiently to a limited speech domain. To extract the features of the speech signal, the number of peaks is counted and the frequency of a peak is used. The difference between the maximum peak and the second peak is also considered in order to distinguish among the words in the limited domain. By applying this process hierarchically, the features of the speech signal can be extracted more quickly.
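The three cues named in the abstract (peak count, peak frequency, and the gap between the two largest peaks) can be sketched as below. Several details are assumptions, since the abstract does not give them: the naive DFT, the `floor_ratio` threshold for ignoring tiny peaks, and reading "difference" as a magnitude difference. The test frame is a hypothetical two-tone signal.

```python
import cmath
import math

def magnitude_spectrum(frame):
    """Naive DFT magnitudes for bins 0 .. N/2 (fine for short frames)."""
    n = len(frame)
    return [abs(sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n)
                    for t in range(n)))
            for k in range(n // 2 + 1)]

def peak_features(frame, fs_hz, floor_ratio=0.1):
    """Count spectral peaks and describe the two largest ones."""
    mag = magnitude_spectrum(frame)
    floor = floor_ratio * max(mag)          # assumed threshold, not from the paper
    peaks = [k for k in range(1, len(mag) - 1)
             if mag[k] > mag[k - 1] and mag[k] >= mag[k + 1] and mag[k] >= floor]
    peaks.sort(key=lambda k: mag[k], reverse=True)   # largest peak first
    bin_hz = fs_hz / len(frame)
    return {
        "num_peaks": len(peaks),
        "main_peak_hz": peaks[0] * bin_hz if peaks else None,
        "top_two_gap": mag[peaks[0]] - mag[peaks[1]] if len(peaks) > 1 else None,
    }

# Hypothetical frame: tones at 500 Hz and 1200 Hz, both on exact DFT bins.
FS_HZ = 6400
N = 64
frame = [math.sin(2 * math.pi * 500 * t / FS_HZ)
         + 0.5 * math.sin(2 * math.pi * 1200 * t / FS_HZ) for t in range(N)]
features = peak_features(frame, FS_HZ)
```

A hierarchical classifier in the spirit of the abstract could test `num_peaks` first and consult the other two values only when the count does not separate the candidate words.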

Pitch Modification based on a Voice Source Model

  • 최용진;여수진;김진영;성굉모
    • Speech Sciences / Vol. 3 / pp. 132-147 / 1998
  • Previously developed methods for pitch modification have not been based on a voice source model. Therefore, the synthesized speech often sounds unnatural although it may be highly intelligible. The purpose of this paper is to analyze how the voice source signal changes with the pitch period and to establish a pitch-modification rule based on the result of this analysis. Using the excitation waveform, we examine how the intervals of the closing phase, closed phase, and open phase change as the pitch increases. Compared with previous methods that operate directly on the speech signal, the pitch modification method based on a voice source model shows high intelligibility and naturalness. This study might benefit applications in speaker identification and voice color conversion. The proposed method will therefore provide high-quality synthetic speech.

A Novel Integration Scheme for Audio Visual Speech Recognition

  • Pham, Than Trung;Kim, Jin-Young;Na, Seung-You
    • The Journal of the Acoustical Society of Korea / Vol. 28, No. 8 / pp. 832-842 / 2009
  • Automatic speech recognition (ASR) has been successfully applied to many real human-computer interaction (HCI) applications; however, its performance tends to decrease significantly in noisy environments. Audio-visual speech recognition (AVSR), which uses an acoustic signal and lip motion, has recently attracted more attention because of its robustness to noise. In this paper, we describe our novel integration scheme for AVSR based on a late integration approach. First, we introduce a robust reliability measurement for the audio and visual modalities using model-based information and signal-based information. The model-based sources measure the confusability of the vocabulary, while the signal is used to estimate the noise level. Second, the output probabilities of the audio and visual speech recognizers are each normalized before the final integration step, which uses the normalized output space and the estimated weights. We evaluate the performance of our proposed method on a Korean isolated-word recognition system. The experimental results demonstrate the effectiveness and feasibility of our proposed system compared to conventional systems.
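Late integration in general can be sketched as a weighted combination of normalized recognizer outputs. The code below is a generic illustration, not the paper's scheme: the softmax normalization, the log-linear combination, the fixed `audio_weight` values (the paper estimates weights from reliability measures), and the two-word scores are all assumptions.

```python
import math

def normalize(log_likelihoods):
    """Softmax over per-word log-likelihoods, giving values that sum to 1."""
    top = max(log_likelihoods.values())
    exps = {w: math.exp(v - top) for w, v in log_likelihoods.items()}
    total = sum(exps.values())
    return {w: e / total for w, e in exps.items()}

def late_integration(audio_ll, visual_ll, audio_weight):
    """Combine normalized audio and visual scores with a weight in [0, 1]."""
    pa, pv = normalize(audio_ll), normalize(visual_ll)
    # Log-linear fusion; assumes no normalized probability underflows to zero.
    combined = {w: audio_weight * math.log(pa[w])
                   + (1.0 - audio_weight) * math.log(pv[w]) for w in pa}
    return max(combined, key=combined.get), combined

# Hypothetical scores: audio slightly prefers "yes", video clearly prefers "no".
audio_ll = {"yes": -10.0, "no": -10.5}
visual_ll = {"yes": -8.0, "no": -5.0}

word_clean, _ = late_integration(audio_ll, visual_ll, audio_weight=0.9)
word_noisy, _ = late_integration(audio_ll, visual_ll, audio_weight=0.2)
```

With a high audio weight the decision follows the audio recognizer ("yes"); lowering the weight, as a noise-level estimate might, lets the visual recognizer decide ("no").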

A Study of Energy Parameter without Windowing Influence in Speech Signal

  • 조태수;신동성;배명진
    • Proceedings of the Institute of Electronics Engineers of Korea Conference / Institute of Electronics Engineers of Korea 2001 Summer Conference Proceedings (4) / pp. 277-280 / 2001
  • Preprocessing is a very important step in speech signal processing. It influences, among other things, the compression rate in speech coding and the recognition rate in speech recognition. In this paper, we propose a method that minimizes the influence of the window by using the pitch period and start points. The proposed method is applicable to voiced-speech detection and word labeling.

The Pitch Detection Using Variable LPF

  • 백금란
    • Proceedings of the Acoustical Society of Korea Conference / Acoustical Society of Korea 1993 Conference Proceedings, Vol. 12, No. 1 / pp. 88-92 / 1993
  • In speech signal processing, the pitch must be detected accurately. The pitch extraction algorithms proposed so far have difficulty detecting pitch over a wide range of speech signals. We therefore propose a new algorithm that uses G-peak extraction. The method finds the maximum zero-crossing interval (MZI) in each frame and convolves it with the speech signal; this is equivalent to passing the speech signal through a variable LPF. As a result, the method obtains the pitch with improved detection accuracy and at high speed.
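One possible reading of the MZI step is sketched below; the abstract leaves the details open, so this is an interpretation rather than the author's algorithm. It assumes the MZI is the longest gap in samples between consecutive sign changes, and that "convolving" means smoothing the frame with a rectangular window of that length, which acts as a low-pass filter whose cutoff varies from frame to frame. The test frame is a hypothetical sinusoid.

```python
import math

def zero_crossing_indices(frame):
    """Indices n where the sign of frame[n] differs from frame[n - 1]."""
    return [n for n in range(1, len(frame))
            if (frame[n - 1] >= 0) != (frame[n] >= 0)]

def max_zero_crossing_interval(frame):
    """Longest distance in samples between consecutive zero crossings."""
    idx = zero_crossing_indices(frame)
    if len(idx) < 2:
        return len(frame)   # no usable crossings: fall back to the frame length
    return max(b - a for a, b in zip(idx, idx[1:]))

def variable_lowpass(frame):
    """Causal moving average whose length is the frame's MZI."""
    width = max_zero_crossing_interval(frame)
    smoothed = []
    for n in range(len(frame)):
        window = frame[max(0, n - width + 1): n + 1]
        smoothed.append(sum(window) / len(window))
    return width, smoothed

# Hypothetical frame: a sinusoid with a 20-sample period, offset by half a
# sample so that no sample is exactly zero.
frame = [math.sin(2 * math.pi * (n + 0.5) / 20) for n in range(100)]
mzi, smoothed = variable_lowpass(frame)   # mzi is 10 samples (half a period)
```

In this reading, a longer MZI (lower pitch) produces a longer averaging window and hence a lower cutoff, which is what makes the filter "variable".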

Analysis of Speech Signals Depending on the Microphone and Microphone Distance

  • Son, Jong-Mok
    • The Journal of the Acoustical Society of Korea / Vol. 17, No. 4E / pp. 41-47 / 1998
  • The microphone is the first link in a speech recognition system. Depending on its type and mounting position, the microphone can significantly distort the spectrum and affect the performance of the speech recognition system. In this paper, the characteristics of the speech signal for different microphones and microphone distances are investigated in both the time and frequency domains. In the time-domain analysis, the average signal-to-noise ratio is measured on the database we collected, for each microphone and microphone distance. Mel-frequency spectral coefficients and the mel-frequency cepstrum are computed to examine the spectral characteristics. The analysis results are discussed together with our findings, and the results of recognition experiments are given.
