• Title/Abstract/Keywords: Cepstrum Analysis


Analysis and Recognition of Korean Fricatives and Affricates

  • 정석재;정현열;이무영
    • 한국음향학회지 / Vol. 10, No. 5 / pp. 27-35 / 1991
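  • As a basic study toward a small-scale speech recognition system that uses the phoneme as its basic recognition unit, we analyzed the characteristics of the fricatives (/ㅅ, ㅆ, ㅎ/) and affricates (/ㅈ, ㅉ, ㅊ/) using duration, mean pattern, and variance ratio, extracted the parameters effective for discrimination within each phoneme group, and carried out recognition experiments. Analysis of the distributions of duration, mean pattern, and variance ratio showed that fricatives and affricates can be discriminated using cepstrum coefficients of only about 6 dimensions, and that, for information along the time axis, it is optimal to use the features of about 14 frames from the onset of the speech as recognition parameters. In the recognition experiments, the recognition rate by manner of pronunciation was higher than the recognition rate for the individual phonemes within the phoneme groups classified by manner of articulation, which shows that discriminating among phonemes within the same group is the harder task. With the feature parameter length set to about 14 frames from the speech onset, the recognition rates were highest, averaging 81.1% by manner of articulation and 97.9% by manner of pronunciation. Increasing the feature parameter length beyond 14 frames produced no large change in the recognition rate, which agrees well with the analysis results.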

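
The feature layout described in this abstract (about 6 cepstrum coefficients per frame over the first 14 frames) can be sketched as follows. This is a minimal illustration, not the authors' front end: the abstract does not say how the cepstrum was computed, so the FFT-based real cepstrum, the frame length, the hop size, the Hamming window, and the choice to skip the zeroth coefficient are all assumptions, and `cepstral_features` is a hypothetical helper name.

```python
import numpy as np

def cepstral_features(signal, frame_len=256, hop=128, n_coeffs=6, n_frames=14):
    """Return an (n_frames, n_coeffs) array of low-order real cepstrum coefficients.

    frame_len, hop, and the Hamming window are illustrative assumptions.
    """
    window = np.hamming(frame_len)
    rows = []
    for i in range(n_frames):
        frame = signal[i * hop : i * hop + frame_len]
        if len(frame) < frame_len:
            break  # not enough samples left for a full frame
        # Real cepstrum: inverse FFT of the log magnitude spectrum.
        log_mag = np.log(np.abs(np.fft.rfft(frame * window)) + 1e-10)
        cepstrum = np.fft.irfft(log_mag, n=frame_len)
        # Skip c[0] (overall level) and keep the next n_coeffs values (assumption).
        rows.append(cepstrum[1 : n_coeffs + 1])
    return np.array(rows)

rng = np.random.default_rng(0)
segment = rng.standard_normal(4000)  # noise stand-in for the onset of a fricative
features = cepstral_features(segment)
print(features.shape)  # (14, 6)
```

Flattening `features` gives an 84-dimensional vector per utterance, which is the kind of fixed-length pattern a small phoneme classifier could consume.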

Analysis and parameter extraction algorithm of noisy motion blurred image

  • 최병철;최지웅;강문기
    • 한국방송∙미디어공학회: Conference Proceedings / 한국방송공학회 1998 Conference / pp. 87-90 / 1998
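  • Motion blur is the smearing of an image caused by relative motion between the camera and the subject. This paper presents a method for efficiently estimating the motion blur parameters through two newly proposed regions: a noise-dominant region, used to compute the noise variance, and a signal-dominant region, used to estimate the angle and length of the motion blur. In addition, a newly proposed least squares method (Least Mean Square) with variable weights makes it possible to estimate the direction of the locus of extrema precisely. Once the blur direction is obtained, the blur length can be found quickly with a one-dimensional cepstrum method. Using the information obtained in this way, actual degraded images could be restored effectively.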

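
The last step in this abstract, reading the blur length from a one-dimensional cepstrum, can be illustrated on a synthetic signal. The sketch below assumes a uniform (box) blur applied by circular convolution and no additive noise; the paper's noise handling and direction estimation are not reproduced, and `estimate_blur_length` is a hypothetical name. A box blur of length L puts nearly periodic zeros in the spectrum, which appear as a strong negative peak at quefrency L in the power cepstrum.

```python
import numpy as np

def estimate_blur_length(signal):
    """Estimate the length of a uniform blur from the 1-D power cepstrum."""
    n = len(signal)
    log_power = np.log(np.abs(np.fft.rfft(signal)) ** 2 + 1e-12)
    cepstrum = np.fft.irfft(log_power, n=n)
    # Ignore quefrencies 0 and 1, then take the most negative value.
    first = 2
    return first + int(np.argmin(cepstrum[first : n // 2]))

rng = np.random.default_rng(0)
n, true_length = 1024, 9
sharp = rng.standard_normal(n)           # stand-in for one image line along the blur direction

kernel = np.zeros(n)
kernel[:true_length] = 1.0 / true_length  # uniform motion-blur kernel (assumption)
blurred = np.fft.irfft(np.fft.rfft(sharp) * np.fft.rfft(kernel), n=n)

estimated = estimate_blur_length(blurred)
print(estimated)  # 9
```

For a real image, the same computation would be applied along the estimated blur direction, and the noise-dominant part of the spectrum would need to be handled before taking the logarithm.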

Nonplanar Vibration Phenomenon of the Quadrangle Cantilever Beam

  • 김명구;박철희;조종두;조호준
    • 한국소음진동공학회: Conference Proceedings / 한국소음진동공학회 2006 Spring Conference Proceedings / pp. 62-65 / 2006
  • In this paper, the nonlinear nonplanar vibration of a flexible rectangular cantilever beam is analyzed when one-to-one resonance occurs in the beam. The planar and nonplanar motions of the beam are analyzed in the time and frequency domains. In the frequency domain, an FFT analyzer is used to perform autospectrum and cepstrum analyses of the nonlinear response of the beam. In the time domain, an oscilloscope is used to investigate the phase difference between the planar and nonplanar motions and to perform torus analysis in phase space. Through this analysis, the main frequencies of the superharmonic, subharmonic, and super-subharmonic motions that arise in the nonplanar motion due to one-to-one resonance are investigated. Analysis of the phase difference between the planar and nonplanar motions shows that it varies in time.


Introduction to the Spectrum and Spectrogram

  • 진성민
    • 대한후두음성언어의학회지 / Vol. 19, No. 2 / pp. 101-106 / 2008
  • Once the speech signal has been put into a form suitable for storage and analysis by computer, several different operations can be performed. Filtering, sampling, and quantization are the basic operations in digitizing a speech signal. The waveform can be displayed, measured, and even edited, and spectra can be computed using methods such as the Fast Fourier Transform (FFT), Linear Predictive Coding (LPC), cepstrum, and filtering. The digitized signal can also be used to generate spectrograms. The spectrograph provides major advantages for the study of speech. The author therefore introduces the basic techniques for acoustic recording and digital signal processing and the principles of the spectrum and spectrogram.


Flattening Techniques for Pitch Detection

  • 김종국;조왕래;배명진
    • 대한전자공학회: Conference Proceedings / 대한전자공학회 2002 Summer Conference Proceedings (4) / pp. 381-384 / 2002
  • In speech signal processing, exact pitch detection is very important for speech recognition, synthesis, and analysis. However, detecting the pitch from a speech signal is very difficult because of the effects of formants and transition amplitude. In this paper, we therefore propose a pitch detection method using spectrum flattening techniques. Spectrum flattening eliminates the effects of formants and transition amplitude. In the time domain, positive center clipping is applied to emphasize the pitch period with a glottal component from which the vocal tract characteristic has been removed. A rough formant envelope is computed by peak-fitting the spectrum of the original speech signal in the frequency domain. As a result, we obtain a flattened harmonics waveform from the algebraic difference between the spectrum of the original speech signal and the smoothed formant envelope. Finally, we obtain a residual signal from which the vocal tract element has been removed. The performance was compared with the LPC, cepstrum, and ACF methods. With this algorithm, the accuracy of pitch detection improved and the gross error rate was reduced in voiced speech regions and in transition regions where the phoneme changes.

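
Two ingredients named in this abstract, positive center clipping and the autocorrelation (ACF) baseline, can be sketched together. This is not the proposed flattening method: the abstract does not define its clipping rule, so the form used here (keep only the part of the waveform above a positive threshold) and the 0.6 threshold ratio are assumptions, as are the helper names and the 50 to 400 Hz search range.

```python
import numpy as np

def center_clip_positive(frame, ratio=0.6):
    """Keep only the part of the frame above ratio * max; everything else becomes 0."""
    threshold = ratio * np.max(frame)
    return np.where(frame > threshold, frame - threshold, 0.0)

def acf_pitch(frame, fs, f0_min=50.0, f0_max=400.0):
    """Pick the autocorrelation peak inside the plausible pitch-lag range."""
    acf = np.correlate(frame, frame, mode="full")[len(frame) - 1 :]
    lo, hi = int(fs / f0_max), int(fs / f0_min)
    lag = lo + int(np.argmax(acf[lo : hi + 1]))
    return lag, fs / lag

# Synthetic voiced frame: five harmonics of 125 Hz sampled at 8 kHz.
fs, f0, n = 8000, 125.0, 512
t = np.arange(n) / fs
voiced = sum(np.cos(2 * np.pi * h * f0 * t) / h for h in range(1, 6))

lag, estimate = acf_pitch(center_clip_positive(voiced), fs)
print(lag, estimate)  # 64 125.0
```

Clipping leaves one narrow pulse per pitch period, so the autocorrelation has little energy at lags other than multiples of the period, which is why clipping is commonly used to reduce formant-induced errors.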

On a Pitch Alteration Technique by Cepstrum Analysis of Flatten Excitation Spectrum

  • 조왕래;함명규;배명진
    • 한국음향학회지 / Vol. 17, No. 8 / pp. 82-87 / 1998
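  • Speech synthesis can be classified by synthesis method into waveform coding, source coding, and hybrid coding. Synthesis based on waveform coding is particularly suitable for high-quality synthesis. However, because waveform-coding synthesis processes the signal without separating the excitation and filter components, it is not well suited to syllable-level or phoneme-level synthesis. A pitch alteration method is therefore needed that changes the source pitch so that waveform coding can be applied to synthesis by rule. In this paper, we propose a method that alters the pitch using the properties of the cepstrum in order to minimize spectral distortion. The method separates the excitation spectrum from the filter spectrum in the frequency domain, converts the excitation spectrum into an excitation cepstrum, alters the pitch by inserting or deleting zeros, and then reconstructs the pitch-altered spectrum in the spectral domain. To evaluate the proposed method, we measured the spectral distortion rate: the average spectral distortion rate stayed at or below 2.29%, and the subjective sound quality was also good, averaging 3.74.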


Verification and estimation of a posterior probability and probability density function using vector quantization and neural network

  • 고희석;김현덕;이광석
    • 대한전기학회논문지 / Vol. 45, No. 2 / pp. 325-328 / 1996
  • In this paper, we propose a method for estimating the posterior probability and the PDF (probability density function) using a feed-forward neural network and VQ (vector quantization) codebooks. We estimate the posterior probability and the probability density function, which together with the well-known Mel cepstrum compose a new parameter, and verify the performance on five vowels taken from syllables using an NN (neural network) and a PNN (probabilistic neural network). The new parameter gave its best result with the probabilistic neural network, with an average recognition rate of 83.02%.


Engine Fault Diagnosis Using Sound Source Analysis Based on Hidden Markov Model

  • 레찬수;이종수
    • 한국통신학회논문지 / Vol. 39A, No. 5 / pp. 244-250 / 2014
  • The most serious engine faults are those that occur within the engine. Traditional engine fault diagnosis is highly dependent on the engineer's technical skills and has a high failure rate. Neural networks and support vector machines have been proposed for use in a diagnosis model. In this paper, noisy sound from faulty engines is represented by Mel frequency cepstrum coefficients, zero crossing rate, mean square, and fundamental frequency features, which are used in a hidden Markov model for diagnosis. Our experimental results indicate that the proposed method performs the diagnosis with a high accuracy rate of about 98% for all eight fault types.
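
Two of the simpler features listed in this abstract, zero crossing rate and mean square, can be computed per frame as sketched below. The definitions used here (fraction of adjacent sample pairs whose sign differs, and the average of the squared samples) are common ones and are assumptions about what the paper computes; the MFCC, fundamental frequency, and hidden Markov model stages are not shown.

```python
import numpy as np

def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    signs = np.signbit(frame)
    return float(np.mean(signs[1:] != signs[:-1]))

def mean_square(frame):
    """Average power of the frame."""
    return float(np.mean(np.square(frame)))

# One second of a 100 Hz tone at 8 kHz; the small phase offset avoids exact zeros.
fs = 8000
t = np.arange(fs) / fs
tone = np.sin(2 * np.pi * 100 * t + 0.1)

zcr = zero_crossing_rate(tone)
ms = mean_square(tone)
print(round(zcr * (fs - 1)))  # 200 sign changes in one second
```

In a diagnosis pipeline these scalars would typically be appended to the cepstral coefficients of the same frame to form one observation vector.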

Characteristics of voice quality on clear versus casual speech in individuals with Parkinson's disease

  • 신희백;심희정;정훈;고도흥
    • 말소리와 음성과학 / Vol. 10, No. 2 / pp. 77-84 / 2018
  • The purpose of this study is to examine the acoustic characteristics of Parkinsonian speech, with respect to different utterance conditions, by employing acoustic/auditory-perceptual analysis. The subjects of the study were 15 patients (M=7, F=8) with Parkinson's disease who were asked to read out sentences under different utterance conditions (clear/casual). The sentences read out by each subject were recorded, and the recorded speech was subjected to cepstrum and spectrum analysis using Analysis of Dysphonia in Speech and Voice (ADSV). Additionally, auditory-perceptual evaluation of the recorded speech was conducted with respect to breathiness and loudness. Results indicate that in the case of clear speech, there was a statistically significant increase in the cepstral peak prominence (CPP), and a decrease in the L/H ratio SD (ratio of low to high frequency spectral energy SD) and CPP F0 SD values. In the auditory-perceptual evaluation, a decrease in breathiness and an increase in loudness were noted. Furthermore, CPP was found to be highly correlated to breathiness and loudness. This provides objective evidence of the immediate usefulness of clear speech intervention in improving the voice quality of Parkinsonian speech.
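
The cepstral peak prominence (CPP) measure reported in this abstract can be illustrated with a simplified sketch. This is not the ADSV algorithm: the version below takes the magnitude of the cepstrum of the dB spectrum, fits a straight line over an assumed pitch-lag range (70 to 300 Hz), and reports how far the largest peak rises above that line. The function name, window, and search range are assumptions made for illustration.

```python
import numpy as np

def cepstral_peak_prominence(frame, fs, f0_min=70.0, f0_max=300.0):
    """Simplified CPP: cepstral peak height above a regression line, plus the implied F0."""
    windowed = frame * np.hanning(len(frame))
    log_mag_db = 20.0 * np.log10(np.abs(np.fft.rfft(windowed)) + 1e-12)
    cepstrum = np.abs(np.fft.irfft(log_mag_db, n=len(frame)))
    lags = np.arange(int(fs / f0_max), int(fs / f0_min) + 1)
    peak_lag = int(lags[np.argmax(cepstrum[lags])])
    # Straight-line fit to the cepstrum over the searched lags.
    slope, intercept = np.polyfit(lags, cepstrum[lags], 1)
    prominence = cepstrum[peak_lag] - (slope * peak_lag + intercept)
    return float(prominence), fs / peak_lag

fs, n, f0 = 8000, 1024, 125.0
t = np.arange(n) / fs
rng = np.random.default_rng(0)

# Strongly periodic frame (harmonics of 125 Hz plus faint noise) versus pure noise.
periodic = sum(np.cos(2 * np.pi * h * f0 * t) / h for h in range(1, 32))
periodic = periodic + 0.01 * rng.standard_normal(n)
noise = rng.standard_normal(n)

cpp_periodic, f0_estimate = cepstral_peak_prominence(periodic, fs)
cpp_noise, _ = cepstral_peak_prominence(noise, fs)
print(f0_estimate, cpp_periodic > cpp_noise)
```

A more periodic, less breathy voice yields a taller cepstral peak, which is consistent with the abstract's finding that CPP rises in clear speech.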

Effective Combination of Temporal Information and Linear Transformation of Feature Vector in Speaker Verification

  • 서창우;조미화;임영환;전성채
    • 말소리와 음성과학 / Vol. 1, No. 4 / pp. 127-132 / 2009
  • The feature vectors used in conventional speaker recognition (SR) systems may be highly correlated with their neighbors. To improve SR performance, many researchers have adopted linear transformation methods such as principal component analysis (PCA). In general, the linear transformation of the feature vectors is based on the concatenated form of the static features and their dynamic features. However, a linear transformation based on both the static and dynamic features is more complex than one based on the static features alone because of the high order of the features. To overcome these problems, we propose an efficient method that applies a linear transformation and the temporal information of the features so as to reduce complexity and improve performance in speaker verification (SV). The proposed method first performs a linear transformation using PCA coefficients. The delta parameters that carry the temporal information are then obtained from the transformed features. The proposed method requires a covariance matrix only 1/4 the size of the one needed when the static and dynamic features are combined for the PCA coefficients. In addition, the delta parameters are extracted from the linearly transformed features after the dimension of the static features has been reduced. Compared with PCA and conventional methods in terms of equal error rate (EER) in SV, the proposed method shows better performance while requiring less storage space and complexity.

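
The ordering described in this abstract (PCA on the static features first, delta parameters computed afterwards from the reduced features) can be sketched as follows. The delta here is a simple two-point central difference with edge padding, which is an assumption; the paper's delta window, feature type, and dimensions are not given in the abstract, and `pca_then_delta` is a hypothetical name.

```python
import numpy as np

def pca_then_delta(static, n_components):
    """Project static features with PCA, then append deltas of the projected features."""
    centered = static - static.mean(axis=0)
    cov = np.cov(centered, rowvar=False)            # D x D, static features only
    eigvals, eigvecs = np.linalg.eigh(cov)
    order = np.argsort(eigvals)[::-1][:n_components]
    projected = centered @ eigvecs[:, order]        # T x n_components
    # Central-difference delta with repeated edge frames (assumption).
    padded = np.pad(projected, ((1, 1), (0, 0)), mode="edge")
    delta = (padded[2:] - padded[:-2]) / 2.0
    return np.hstack([projected, delta]), cov.shape

rng = np.random.default_rng(0)
static = rng.standard_normal((200, 12))             # 200 frames of 12 static coefficients
combined, cov_shape = pca_then_delta(static, n_components=4)
print(combined.shape, cov_shape)  # (200, 8) (12, 12)
```

With D static coefficients the covariance matrix has D × D entries, whereas concatenating static and delta features before PCA would need (2D) × (2D) = 4D² entries, which matches the 1/4 figure quoted in the abstract.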