• 제목/요약/키워드: Speech analysis

검색결과 1,592건 처리시간 0.024초

동시발화에 나타나는 발화 속도 변이 분석 (Speech Rate Variation in Synchronous Speech)

  • 김미란;남호성
    • 말소리와 음성과학
    • /
    • 제4권4호
    • /
    • pp.19-27
    • /
    • 2012
  • When two speakers read a text together, the produced speech has been shown to reduce a high degree of variability (e.g., pause duration and placement, and speech rate). This paper provides a quantitative analysis of speech rate variation exhibited in synchronous speech by examining the global and local patterns in two dialects of Mandarin Chinese (Taiwan and Shanghai). We analyzed the speech data in terms of mean speech rate and the reference of "Just Noticeable difference (JND)" within a subject and across subjects. Our findings show that speakers show lower and less variable speech rates when they read a text synchronously than when they read alone. This global pattern is observed consistently across speakers and dialects maintaining the unique local variation patterns of speech rate for each dialect. We conclude that paired speakers lower their speech rates and decrease the variability in order to ensure the synchrony of their speech.

음성 하모닉스 스펙트럼의 피크-피팅을 이용한 피치검출에 관한 연구 (A Study on the Pitch Detection of Speech Harmonics by the Peak-Fitting)

  • 김종국;조왕래;배명진
    • 음성과학
    • /
    • 제10권2호
    • /
    • pp.85-95
    • /
    • 2003
  • In speech signal processing, it is very important to detect the pitch exactly in speech recognition, synthesis and analysis. If we exactly pitch detect in speech signal, in the analysis, we can use the pitch to obtain properly the vocal tract parameter. It can be used to easily change or to maintain the naturalness and intelligibility of quality in speech synthesis and to eliminate the personality for speaker-independence in speech recognition. In this paper, we proposed a new pitch detection algorithm. First, positive center clipping is process by using the incline of speech in order to emphasize pitch period with a glottal component of removed vocal tract characteristic in time domain. And rough formant envelope is computed through peak-fitting spectrum of original speech signal infrequence domain. Using the roughed formant envelope, obtain the smoothed formant envelope through calculate the linear interpolation. As well get the flattened harmonics waveform with the algebra difference between spectrum of original speech signal and smoothed formant envelope. Inverse fast fourier transform (IFFT) compute this flattened harmonics. After all, we obtain Residual signal which is removed vocal tract element. The performance was compared with LPC and Cepstrum, ACF. Owing to this algorithm, we have obtained the pitch information improved the accuracy of pitch detection and gross error rate is reduced in voice speech region and in transition region of changing the phoneme.

  • PDF

음원신호 추출을 위한 주파수영역 응용모델에 기초한 독립성분분석 (Independent Component Analysis Based on Frequency Domain Approach Model for Speech Source Signal Extraction)

  • 최재승
    • 한국전자통신학회논문지
    • /
    • 제15권5호
    • /
    • pp.807-812
    • /
    • 2020
  • 본 논문은 여러 음원신호가 혼합된 환경에서 목적으로 하는 음원신호만을 분리하기 위하여 마이크로폰을 사용한 블라인드 음원분리 알고리즘을 제안한다. 제안하는 알고리즘은 독립성분분석 방법을 기반으로 한 주파수영역 표현모델이다. 따라서 2 음원에 대한 주파수영역 독립성분분석의 실제 환경에서의 유효성 검증을 목적으로, 음원의 종류를 변경하여 주파수영역 독립성분분석을 실행하여 음원분리를 실시하여 그 향상효과를 검증한다. 파형에 의한 실험결과로부터 원래의 파형과 비교하여 2채널의 음원신호를 깨끗하게 분리할 수 있음을 명확히 하였다. 또한 목표 신호 대 간섭 에너지비율을 사용하여 비교한 실험 결과로부터 본 논문에서 제안한 알고리즘의 음원분리 성능이 기존의 알고리즘에 비하여 성능이 향상되었다는 것을 알 수 있었다.

음성인식기를 이용한 발음오류 자동분류 결과 분석 (Performance Analysis of Automatic Mispronunciation Detection Using Speech Recognizer)

  • 강효원;이상필;배민영;이재강;권철홍
    • 대한음성학회:학술대회논문집
    • /
    • 대한음성학회 2003년도 10월 학술대회지
    • /
    • pp.29-32
    • /
    • 2003
  • This paper proposes an automatic pronunciation correction system which provides users with correction guidelines for each pronunciation error. For this purpose, we develop an HMM speech recognizer which automatically classifies pronunciation errors when Korean speaks foreign language. And, we collect speech database of native and nonnative speakers using phonetically balanced word lists. We perform analysis of mispronunciation types from the experiment of automatic mispronunciation detection using speech recognizer.

  • PDF

목소리 특성과 음성 특징 파라미터의 상관관계와 SVM을 이용한 특성 분류 모델링 (Correlation analysis of voice characteristics and speech feature parameters, and classification modeling using SVM algorithm)

  • 박태성;권철홍
    • 말소리와 음성과학
    • /
    • 제9권4호
    • /
    • pp.91-97
    • /
    • 2017
  • This study categorizes several voice characteristics by subjective listening assessment, and investigates correlation between voice characteristics and speech feature parameters. A model was developed to classify voice characteristics into the defined categories using SVM algorithm. To do this, we extracted various speech feature parameters from speech database for men in their 20s, and derived statistically significant parameters correlated with voice characteristics through ANOVA analysis. Then, these derived parameters were applied to the proposed SVM model. The experimental results showed that it is possible to obtain some speech feature parameters significantly correlated with the voice characteristics, and that the proposed model achieves the classification accuracies of 88.5% on average.

음성 부호기용 채널 부호화기의 구현 및 성능 분석 (Channel Coder Implementation and Performance Analysis for Speech Coding: Considering bit Importance of Speech Information-part III)

  • 강법주;김선영;김상천;김영식
    • 대한전자공학회논문지
    • /
    • 제27권4호
    • /
    • pp.484-490
    • /
    • 1990
  • In speech coding scheme, because information bits have different error sensitivities over channel errors, the channel coder for combining with speech coding should be realized by the variable coding rate considering the bit importance of speech information bits. In realizing the 4 kbps channel coder for 12kbps speech, this paper have chosen the channel coding method by analyzing the hard-decision post-decoding error rate of RCPC(Rate Compatible Punctured Convolutional) codes and bit error sensitivity of 12 kbps speech. Under the coherent QPSK and Rayleigh fading channel, the performance analysis has showed that 10dB gain was obtained in speech SEGSNR by 4-level uneuqal error protection, which was compared with the caseof no channel coding at 7dB channel SNR.

  • PDF

Simulink를 이용한 음원모델 시뮬레이터 구현 (Implementation of Voice Source Simulator Using Simulink)

  • 조철우;김재희
    • 말소리와 음성과학
    • /
    • 제3권2호
    • /
    • pp.89-96
    • /
    • 2011
  • In this paper, details of the design and implementation of a voice source simulator using Simulink and Matlab are discussed. This simulator is an implementation by model-based design concept. Voice sources can be analyzed and manipulated through various factors by choosing options from GUI input and selecting pre-defined blocks or user created ones. This kind of simulation tool can simplify the procedure of analyzing speech signals for various purposes such as voice quality analysis, pathological voice analysis, and speech coding. Also, basic analysis functions are supported to compare the original signal and the manipulated ones.

  • PDF

Analysis of Speech Signals Depending on the Microphone and Micorphone Distance

  • Son, Jong-Mok
    • The Journal of the Acoustical Society of Korea
    • /
    • 제17권4E호
    • /
    • pp.41-47
    • /
    • 1998
  • Microphone is the first link in the speech recognition system. Depending on its type and mounting position, the microphone can significantly distort the spectrum and affect the performance of the speech recognition system. In this paper, characteristics of the speech signal for different microphones and microphone distances are investigated both in time and frequency domains. In the time domain analysis, the average signal-to-noise ration is measure ration is measured for the database we collected depending on the microphones and microphone distances. Mel-frequency spectral coefficients and mel-frequency cepstrum are computed to examine the spectral characteristics. Analysis results are discussed with our findings, and the result of recognition experiments is given.

  • PDF

문단낭독 시 속삭임 발화와 정상 발화의 공기역학적 특성 (Aerodynamic Characteristics of Whispered and Normal Speech during Reading Paragraph Tasks)

  • 표화영
    • 말소리와 음성과학
    • /
    • 제6권3호
    • /
    • pp.57-62
    • /
    • 2014
  • The present study was performed to investigate and discuss the aerodynamic characteristics of whispered and normal speech during reading paragraph tasks. 39 normal females(18-23 yrs.) read 'Autumn' paragraph with whispered and normal phonation. Their readings were recorded and analyzed by 'Running Speech' in Phonatory Aerodynamic System(PAS) instrument. As results, during whispered speech, the total duration was longer and the numbers of inspiration were more frequently shown than normal speech. The Peak expiratory and inspiratory rate were higher in normal speech, but the expiratory and inspiratory volume were higher in whispered speech. By correlation analysis, both whispered and normal speech showed significantly high correlation between total duration and expiratory/inspiratory airflow duration; numbers of inspiration and inspiratory airflow duration; expiratory and inspiratory volume. These results show that whispered speech needs more respiratory effort but shows poorer aerodynamic efficacy during phonation than normal speech.

음성 신호의 다구간 에너지 차를 이용한 새로운 프리엠퍼시스 방법에 관한 연구 (A Study on a New Pre-emphasis Method Using the Short-Term Energy Difference of Speech Signal)

  • 김동준;김주리
    • 대한전기학회논문지:시스템및제어부문D
    • /
    • 제50권12호
    • /
    • pp.590-596
    • /
    • 2001
  • The pre-emphasis is an essential process for speech signal processing. Widely used two methods are the typical method using a fixed value near unity and te optimal method using the autocorrelation ratio of the signal. This study proposes a new pre-emphasis method using the short-term energy difference of speech signal, which can effectively compensate the glottal source characteristics and lip radiation characteristics. Using the proposed pre-emphasis, speech analysis, such as spectrum estimation, formant detection, is performed and the results are compared with those of the conventional two pre-emphasis methods. The speech analysis with 5 single vowels showed that the proposed method enhanced the spectral shapes and gave nearly constant formant frequencies and could escape the overlapping of adjacent two formants. comparison with FFT spectra had verified the above results and showed the accuracy of the proposed method. The computational complexity of the proposed method reduced to about 50% of the optimal method.

  • PDF