• Title/Summary/Keyword: acoustic parameter (음향 파라미터)


A Variable Parameter Model based on SSMS for an On-line Speech and Character Combined Recognition System (음성 문자 공용인식기를 위한 SSMS 기반 가변 파라미터 모델)

  • 석수영;정호열;정현열
    • The Journal of the Acoustical Society of Korea / v.22 no.7 / pp.528-538 / 2003
  • A SCCRS (Speech and Character Combined Recognition System) is developed for mobile devices such as PDAs (Personal Digital Assistants). In SCCRS, feature extraction is carried out separately for speech and for handwritten characters, but recognition is performed in a common engine. The recognition engine is essentially a CHMM (Continuous Hidden Markov Model) with a variable parameter topology, chosen to minimize the number of model parameters and to reduce recognition time. To generate a context-independent variable parameter model, we propose SSMS (Successive State and Mixture Splitting), which determines appropriate numbers of mixtures and states by splitting in the mixture domain and in the time domain. The recognition results show that the proposed SSMS method reduces the total number of GOPDDs (Gaussian Output Probability Density Distributions) by up to 40.0% compared with the conventional fixed-parameter model, at the same recognition performance in the speech recognition system.

A Study on the Dynamic Feature of Phoneme for Word Recognition (단어인식을 위한 음소의 동적 특징에 관한 검토)

  • 김주곤
    • Proceedings of the Acoustical Society of Korea Conference / 1997.06a / pp.35-39 / 1997
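  • In this study, to improve the recognition accuracy of a Korean word recognition system that uses the phoneme as its basic recognition unit, we ran phoneme and word recognition experiments with dynamic features that carry each phoneme's time-direction information, namely regression coefficients and feature parameters obtained with the K-L (Karhunen-Loeve) transform (hereafter K-L coefficients), and confirmed their effectiveness. We first extracted, for plosives, the static feature parameter (mel-cepstrum) and the dynamic feature parameters (regression coefficients and K-L coefficients) and ran phoneme recognition experiments. The recognition rates were 39.84% with the mel-cepstrum, 48.52% with the regression coefficients, and 52.40% with the K-L coefficients. With these results as a reference, experiments on combined feature parameters gave 47.17% for the mel-cepstrum with K-L coefficients, 60.11% for the mel-cepstrum with regression coefficients, 60.35% for the K-L coefficients with regression coefficients, and 58.13% for the mel-cepstrum, K-L coefficients, and regression coefficients together. The combination of the two dynamic features (K-L coefficients and regression coefficients) and the combination of the mel-cepstrum with regression coefficients thus gave the highest recognition rates. Extending the experiments to words yielded higher recognition rates than the conventional feature parameters, confirming the effectiveness of the dynamic parameters.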

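The regression coefficients mentioned in the abstract above are commonly computed as the least-squares slope of each cepstral coefficient over a short window of frames. The following is a minimal sketch of that common formulation, not the paper's own procedure: the function name `delta_features`, the ±2-frame window, and the edge handling (repeating the first and last frame) are all assumptions made for illustration.

```python
def delta_features(frames, half_window=2):
    """Least-squares slope of each feature dimension over +/- half_window frames.

    frames: list of equal-length feature vectors (e.g. mel-cepstra), one per frame.
    Edge frames are repeated so the output has the same number of frames.
    """
    if not frames:
        return []
    n_frames = len(frames)
    dim = len(frames[0])
    # Normalizer of the regression formula: 2 * sum(k^2).
    norm = 2 * sum(k * k for k in range(1, half_window + 1))
    deltas = []
    for t in range(n_frames):
        row = []
        for d in range(dim):
            acc = 0.0
            for k in range(1, half_window + 1):
                ahead = frames[min(t + k, n_frames - 1)][d]
                behind = frames[max(t - k, 0)][d]
                acc += k * (ahead - behind)
            row.append(acc / norm)
        deltas.append(row)
    return deltas


# A feature that rises by 0.5 per frame has slope 0.5 away from the edges.
ramp = [[0.5 * t] for t in range(10)]
print(delta_features(ramp)[5])
```

Appending each delta vector to its static vector gives a combined static-plus-dynamic feature of the kind compared in the abstract.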

Pattern Recognition for the Target Signal Using Acoustic Scattering Feature Parameter (표적신호 음향산란 특징파라미터를 이용한 패턴인식에 관한 연구)

  • 주재훈;신기철;김재수
    • The Journal of the Acoustical Society of Korea / v.19 no.4 / pp.93-100 / 2000
  • Target signal feature parameters are very important for classifying targets with active sonar. Two highly correlated broadband pulses separated by time T have a time separation pitch (TSP) of 1/T Hz, which equals the trough-to-trough or peak-to-peak spacing of their spectrum. In this study, the TSP information that characterizes each target signal was extracted effectively with the FFT. The extracted TSP feature parameters were then applied to a pattern recognition algorithm to classify targets and to analyze their properties.

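The 1/T relation in the abstract above can be checked numerically: a pulse plus an identical copy delayed by T has a power spectrum proportional to 2 + 2cos(2πfT), whose peaks are 1/T Hz apart. The sketch below is an illustration under stated assumptions, not the paper's processing chain: the pulses are idealized unit impulses, the sample rate and delay are made-up values, and a naive DFT stands in for the FFT so the example needs only the standard library.

```python
import cmath


def power_spectrum(signal):
    """Naive DFT power spectrum (O(N^2)); adequate for a short illustration."""
    n = len(signal)
    spectrum = []
    for k in range(n // 2 + 1):
        acc = 0j
        for i, x in enumerate(signal):
            acc += x * cmath.exp(-2j * cmath.pi * k * i / n)
        spectrum.append(abs(acc) ** 2)
    return spectrum


def peak_spacing_hz(spectrum, sample_rate, n):
    """Mean spacing in Hz between local maxima of the power spectrum."""
    peaks = [k for k in range(1, len(spectrum) - 1)
             if spectrum[k] > spectrum[k - 1] and spectrum[k] > spectrum[k + 1]]
    gaps = [b - a for a, b in zip(peaks, peaks[1:])]
    return (sum(gaps) / len(gaps)) * sample_rate / n


sample_rate = 8000.0          # Hz (assumed value)
n = 256                       # record length in samples
delay = 16                    # pulse separation in samples, T = 16 / 8000 = 2 ms
signal = [0.0] * n
signal[10] = 1.0              # first broadband pulse (unit impulse)
signal[10 + delay] = 1.0      # identical second pulse, T seconds later

tsp = peak_spacing_hz(power_spectrum(signal), sample_rate, n)
print(tsp, 1.0 / (delay / sample_rate))  # measured spacing vs. 1/T
```

With real echoes the two pulses are only partially correlated, so the spectral ripple is shallower and peak picking would need smoothing or a cepstral estimate instead.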

Estimation of Speaker Recognition Parameter using Lyapunov Dimension (Lyapunov 차원을 이용한 화자식별 파라미터 추정)

  • Yoo, Byong-Wook;Kim, Chang-Seok
    • The Journal of the Acoustical Society of Korea / v.16 no.4 / pp.42-48 / 1997
  • This paper evaluates speaker recognition and speech recognition performance using the correlation dimension and the Lyapunov dimension. In this method, speech is regarded as chaos, in which a random-looking signal arises from a deterministic generating system. When reconstructing the strange attractor with Takens' embedding theorem, we derived accurate correlation and Lyapunov dimensions by searching for the important orbit in the AR-model power spectrum. We then examined how useful the correlation dimension and the Lyapunov dimension, which characterize the reconstructed attractor, are for speech recognition and speaker recognition. The results show that these dimensions are more useful for speaker recognition than for speech recognition, and that speaker recognition with the Lyapunov dimension achieved a higher recognition rate than speaker recognition with the correlation dimension.


Parameter Generation Algorithm for LSTM-RNN-based Speech Synthesis (LSTM-RNN 기반 음성합성을 위한 파라미터 생성 알고리즘)

  • Park, Sangjun;Hahn, Minsoo
    • Proceedings of the Korean Society of Broadcast Engineers Conference / 2017.06a / pp.105-106 / 2017
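  • This paper proposes a method that applies a maximum-likelihood parameter generation algorithm to improve the accuracy and naturalness of the acoustic parameter sequence output by an artificial neural network. Dynamic feature vectors as well as static feature vectors are used as network outputs, and precomputed parameter variances are used in parameter generation. Applying the means and variances of the estimated static and dynamic feature vectors to the EM algorithm yields the maximum-likelihood parameter estimate. By using the dynamic feature vectors and variances together during parameter generation, the proposed algorithm improves naturalness along the time axis. MCD and the RMSE of F0 were measured for objective evaluation, and a preference test was conducted for subjective evaluation. The results verify that both objective and subjective performance improve over the conventional algorithm.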

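To make the role of the dynamic features and variances in the abstract above concrete, here is a sketch of the widely used closed-form maximum-likelihood parameter generation for a single feature dimension. It is an assumption-laden illustration, not the paper's algorithm: the abstract mentions EM, whereas this sketch solves the normal equations directly for fixed variances, and the function name `mlpg`, the delta window 0.5 * (c[t+1] - c[t-1]), and the zero-padded edges are all choices made here.

```python
def mlpg(static_mean, delta_mean, static_var, delta_var):
    """Closed-form ML trajectory for one feature dimension.

    Solves (W' P W) c = W' P mu, where W stacks the identity (static rows)
    and the delta window 0.5 * (c[t+1] - c[t-1]), and P holds 1/variance.
    Frames outside the sequence are treated as zero (assumed edge rule).
    """
    n = len(static_mean)
    # Each row of W as a sparse {column: weight} dict, with its target and precision.
    rows = []
    for t in range(n):
        rows.append(({t: 1.0}, static_mean[t], 1.0 / static_var[t]))
        window = {}
        if t + 1 < n:
            window[t + 1] = 0.5
        if t - 1 >= 0:
            window[t - 1] = -0.5
        rows.append((window, delta_mean[t], 1.0 / delta_var[t]))
    # Accumulate the normal equations A c = b.
    a = [[0.0] * n for _ in range(n)]
    b = [0.0] * n
    for weights, mean, precision in rows:
        for i, wi in weights.items():
            b[i] += wi * precision * mean
            for j, wj in weights.items():
                a[i][j] += wi * precision * wj
    # Gaussian elimination without pivoting (A is symmetric positive definite).
    for i in range(n):
        for r in range(i + 1, n):
            factor = a[r][i] / a[i][i]
            if factor != 0.0:
                for col in range(i, n):
                    a[r][col] -= factor * a[i][col]
                b[r] -= factor * b[i]
    # Back substitution.
    c = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(a[i][j] * c[j] for j in range(i + 1, n))
        c[i] = (b[i] - tail) / a[i][i]
    return c


# A step in the static means is smoothed when the delta means say "no change".
step = [0.0, 0.0, 0.0, 1.0, 1.0, 1.0]
smooth = mlpg(step, [0.0] * 6, [1.0] * 6, [0.1] * 6)
print([round(x, 3) for x in smooth])
```

The smaller the delta variance relative to the static variance, the more strongly the generated trajectory follows the predicted dynamics, which is how the variances trade frame-wise accuracy against smoothness over time.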

Discrimination Between Natural and Artificial Seismic Sounds by Using 20 MSVQ Algorithm (20 MSVQ 알고리즘을 이용한 자연 및 인공 지진음 식별)

  • Yoon, Sang-Hoon;Song, Young-Hwan;Bae, Myung-Jin
    • The Journal of the Acoustical Society of Korea / v.28 no.3 / pp.251-259 / 2009
  • This paper proposes a technique for discriminating natural from artificial seismic sounds by applying the 20 MSVQ algorithm to data measured with a hydrophone. Spectrum band energy and MFCC were used as representative parameters for the discrimination, and the orders of the characteristic parameters were determined experimentally. Using the 20 MSVQ algorithm with these two parameters, MFCC achieved a success rate of 99.9% and the spectrum band energy parameter 83.9%. The results verify that the proposed method discriminates seismic sounds very accurately.

A Study on the Characteristics of the Parameters for the Statistical Analysis of Vibration Signal by Using Bearing Wear Test (베어링 마모시험을 이용한 진동신호의 통계적 파라미터 특성연구)

  • Jun, Oh-Sung;Hwang, Cheol-Ho;Yoon, Byung-Ok;Eun, Hee-Joon
    • The Journal of the Acoustical Society of Korea / v.8 no.1 / pp.5-12 / 1989
  • This paper is concerned with the characteristics of statistical parameters of the vibration signal from a bearing as its operating conditions change and as faults spread. The rms, kurtosis, crest factor, probability of exceedance, and probability density function were chosen as the statistical parameters. To characterize each of them, vibration signals were recorded from a four-ball tester at different loads, operating speeds, and times. The values of the statistical parameters for each frequency band were calculated after A/D conversion and digital filtering of the recorded signals. Unlike the rms value, statistical parameters such as kurtosis were found to be almost unchanged when operating conditions such as load and speed change. This suggests that these statistical parameters may be used to track the development of faults independently of the operating conditions. Indeed, the statistical parameters deviated considerably from their respective normal values when faults developed in the samples under load, confirming the suggestion.

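The statistical parameters named in the abstract above have standard textbook definitions, sketched below. This is an illustration rather than the paper's processing: the function name `vibration_statistics`, the 3-sigma exceedance threshold, and the synthetic sine-wave input are assumptions, and pure amplitude scaling is only a crude stand-in for a change in load or speed.

```python
import math


def vibration_statistics(samples, threshold_sigmas=3.0):
    """Return rms, kurtosis, crest factor, and probability of exceedance."""
    n = len(samples)
    mean = sum(samples) / n
    centered = [x - mean for x in samples]
    variance = sum(x * x for x in centered) / n
    sigma = math.sqrt(variance)
    rms = math.sqrt(sum(x * x for x in samples) / n)
    # Kurtosis as the normalized fourth central moment (3.0 for a Gaussian).
    kurtosis = (sum(x ** 4 for x in centered) / n) / (variance ** 2)
    # Crest factor: peak amplitude relative to rms.
    crest_factor = max(abs(x) for x in samples) / rms
    # Fraction of samples farther than threshold_sigmas * sigma from the mean.
    exceedance = sum(1 for x in centered if abs(x) > threshold_sigmas * sigma) / n
    return {
        "rms": rms,
        "kurtosis": kurtosis,
        "crest_factor": crest_factor,
        "exceedance": exceedance,
    }


# The same waveform at two amplitudes: rms scales, the shape statistics do not.
sine = [math.sin(2 * math.pi * 10 * i / 1000) for i in range(1000)]
low = vibration_statistics(sine)
high = vibration_statistics([3.0 * x for x in sine])
print(low["rms"], high["rms"])
print(low["kurtosis"], high["kurtosis"])
```

Because kurtosis and crest factor are dimensionless ratios, they ignore overall signal level and respond instead to impulsive content, which is consistent with the abstract's observation that they stay nearly constant across operating conditions while rms does not.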

Application of Non-linear Acoustic Effect for Evaluation of Degradation of 2.25Cr-1Mo Steel (2.25Cr-1Mo 강의 열화도 평가를 위한 비선형 음향효과 응용법)

  • Choi, Y.H.;Jhang, K.Y.;Park, I.K.;Kim, H.M.
    • Journal of the Korean Society for Nondestructive Testing / v.22 no.2 / pp.170-176 / 2002
  • The nonlinear acoustic effect has been considered an effective tool for evaluating material degradation. In this paper, the applicability of the nonlinear acoustic effect to the evaluation of degraded 2.25Cr-1Mo steel is investigated. First, artificial aging was performed to simulate the microstructural degradation of 2.25Cr-1Mo steel arising from long exposure at $540^{\circ}C$. Second, the ultrasonic nonlinear parameter was measured quantitatively with the bispectrum and the power spectrum. The nonlinear acoustic parameter obtained from the bispectrum was found to be clearly sensitive to the aging time.

Speaker Verification Performance Improvement Using Weighted Residual Cepstrum (가중된 예측 오차 파라미터를 사용한 화자 확인 성능 개선)

  • 위진우;강철호
    • The Journal of the Acoustical Society of Korea / v.20 no.5 / pp.48-53 / 2001
  • In speaker verification based on LPC analysis, the prediction residues are ignored and only the LPCC (LPC cepstrum) is used to compose feature vectors. In this study, the LPCC and the RCEP (residual cepstrum) extracted from the residues are used as feature parameters for speaker verification in various environments. We propose a weighting function that enlarges inter-speaker variation by weighting the pitch, a speaker-inherent vector, contained in the residual cepstrum. Simulation results show that the average speaker verification rate improves by 6% when the RCEP and LPCC are used together, and by 2.45% when the proposed weighted RCEP is used with the LPCC, compared with no weighting.


Implementation of Demisyllable database for formant synthesizer (포만트 합성기용 반음절 세트의 구축에 관한 연구)

  • 이정석
    • Proceedings of the Acoustical Society of Korea Conference / 1992.06a / pp.81-84 / 1992
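  • This paper discusses the construction of a demisyllable database for a formant synthesizer and the extraction of the parameters it requires. Because a formant synthesizer needs many driving parameters, saving storage requires choosing an appropriate synthesis unit and representing it efficiently. This study describes the extraction of formant trajectories, which strongly affect the quality of the synthesized speech, and the construction of the database.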
