• Title/Summary/Keyword: Speech spectrum

Speech Detection using Speech Spectrum Clustering (음성스펙트럼의 클러스터링을 이용한 음성검출기법 개선)

  • 김태영;김남수;김태정
    • Proceedings of the IEEK Conference
    • /
    • 2000.09a
    • /
    • pp.149-152
    • /
    • 2000
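  • In this study, we seek to improve an existing speech detection technique based on statistical theory by means of the proposed speech spectrum modeling technique. Unlike existing methods, speech is regarded not as a single model but as a combination of several class models. To estimate each class model, we propose a technique similar to the clustering used in source coding, and we propose two detection techniques that build on it. The first performs a likelihood ratio test (LRT) for each class and then combines the results; the second derives a mixture model from the class models and performs the LRT with it. In experiments, both proposed methods performed far better than the existing method despite a relatively small increase in computation.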
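
The two detection schemes described above can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: it assumes the usual complex-Gaussian model for each frequency bin, represents each class by a hypothetical vector of speech variances, and fuses the per-class tests by taking the maximum, which is an assumed rule since the abstract does not say how the results are combined.

```python
import math

def class_log_lr(power, noise_var, speech_var):
    """Frame log-likelihood ratio for one speech class (complex-Gaussian bins)."""
    total = 0.0
    for p, n, s in zip(power, noise_var, speech_var):
        xi = s / n       # a priori SNR implied by this class model
        gamma = p / n    # a posteriori SNR of the observed bin
        total += gamma * xi / (1.0 + xi) - math.log(1.0 + xi)
    return total

def detect_per_class(power, noise_var, class_vars, threshold=0.0):
    """Scheme 1: one LRT per class, fused by taking the best class (assumed rule)."""
    return max(class_log_lr(power, noise_var, sv) for sv in class_vars) > threshold

def detect_mixture(power, noise_var, class_vars, weights, threshold=0.0):
    """Scheme 2: a single LRT against a weighted mixture of the class models."""
    scores = [class_log_lr(power, noise_var, sv) for sv in class_vars]
    top = max(scores)  # log-sum-exp keeps the mixture ratio numerically stable
    log_lr = top + math.log(sum(w * math.exp(s - top) for w, s in zip(weights, scores)))
    return log_lr > threshold

# Hypothetical 4-bin example: two classes with energy in different bins.
noise_var = [1.0, 1.0, 1.0, 1.0]
class_vars = [[10.0, 10.0, 0.1, 0.1], [0.1, 0.1, 10.0, 10.0]]
weights = [0.5, 0.5]
speech_frame = [12.0, 9.0, 1.0, 1.0]   # bin powers |X_k|^2
noise_frame = [1.0, 1.0, 1.0, 1.0]
```

In practice the class variances would come from the clustering step and the threshold would be tuned on data; both are placeholders here.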

A Study on Estimation of Formant and Articulatory Motion using RLSL Adaptive Linear Prediction Filter (RLSL 적응선형예측필터를 이용한 형성음 및 조음운동 궤적 추정에 관한 연구)

  • Kim, Dong-Jun;Song, Young-Soo;Yoon, Tae-Sung;Park, Sang-Hui
    • Proceedings of the KOSOMBE Conference
    • /
    • v.1992 no.05
    • /
    • pp.163-166
    • /
    • 1992
  • In this study, formant and articulatory motion trajectories are extracted from Korean diphthongs using the RLSL adaptive linear prediction filter. This enables the spectral transitions of the speech signal to be extracted accurately. The study shows that the RLSL algorithm is superior to the Levinson algorithm, especially in the transition parts of speech.

Individual differences in autistic traits and variability in production patterns: a case of affricates by young Seoul Korean speakers

  • Kang, Soyoung;Kong, Eun Jong;Seo, Misun
    • Phonetics and Speech Sciences
    • /
    • v.7 no.1
    • /
    • pp.125-131
    • /
    • 2015
  • The current study explores whether speaker variability in the fronted articulations of Seoul Korean affricates can be explained by cognitive differences, as measured by individual autistic traits. The goal was to explore Yu's (2010; 2013) proposal that individual differences in cognitive style can be an important factor in speakers' use of sound variants. The spectral peak frequencies (SPF) of affricates relative to those of fricatives, reported in Kong et al. (2014), were used to represent acoustically the relative degree of anterior place of constriction. When these individual SPFs were related to Autism-Spectrum Quotient (AQ) scores (Baron-Cohen et al., 2001), a correlation was found for the male speakers but not for the female speakers, such that speakers with more anterior affricate productions had lower AQ scores. The discussion considers how these findings are in line with Yu's proposal.

A Study on Human Evaluators Using the Evaluation Model of English Pronunciation (영어 발음 평가 모델을 활용한 수동 평가자 연구)

  • Yoon, Kyuchul
    • Phonetics and Speech Sciences
    • /
    • v.5 no.4
    • /
    • pp.109-119
    • /
    • 2013
  • The purpose of this paper is to show the tendency of evaluators in the pronunciation evaluation of English utterances. The tendency was visualized using the evaluation model of English pronunciation proposed in [1]. One hundred fifty female university students and four evaluators participated in the study. Students read eight English sentences aloud as evaluators evaluated English pronunciation by their own criteria. The models based on their pronunciation evaluation proved to be efficient in showing their evaluation tendency in terms of the fundamental frequency, intensity, segmental durations, and segmental spectra as compared to those of the five native speakers of English chosen for building the models. However, human evaluators were not always consistent in their evaluation and sometimes gave conflicting scores to the same students.

A new method of Extracting the Filter Characteristics of the Nasal Cavity Using Homorganic Nasal-Stop Sequences: A Preliminary Report (동기관음의 스펙트럼 차이를 이용한 비강 특성 산출: 예비 연구)

  • Park, Han-Sang
    • MALSORI
    • /
    • no.53
    • /
    • pp.17-35
    • /
    • 2005
  • This study provides a new method of extracting the filter characteristics of the nasal cavity. Korean lenis stops are realized as voiced in homorganic nasal-lenis stop sequences between vowels. Since the only difference between the two members of homorganic nasal-lenis stop sequences, such as [mb], [nd], and [ŋg], is whether the passage to the nasal cavity is open or not, subtracting the LPC spectrum of the voiced stop from that of the preceding nasal yields the filter characteristics of the nasal cavity of an individual speaker, regardless of place of articulation. The results suggest that various attempts should be made to extract robust filter characteristics of the nasal cavity by varying the LPC coefficients and by paying particular attention to the speech samples. This study is significant in that it provides a preliminary report on a new method of extracting the filter characteristics of the nasal cavity.

The Computation Reduction Algorithm Independent of the Language for CELP Vocoders (각국 언어 특성에 독립적인 CELP 계열 보코더에서의 계산량 단축 알고리즘)

  • 민소연;배명진
    • Proceedings of the IEEK Conference
    • /
    • 2003.07e
    • /
    • pp.2451-2454
    • /
    • 2003
  • In this paper, we propose methods for reducing the computation of the LSP (line spectrum pairs) transformation that is mainly used in CELP vocoders. Four algorithms are proposed to decrease the computation time of the real root method. The first scheme reduces the LSP transformation time by using the mel scale. The second controls the search order according to the distribution characteristics of the LSP parameters. The third reduces the LSP transformation time by using voice characteristics. The fourth controls both the search interval and the search order according to the distribution characteristics of the LSP parameters. The schemes were evaluated in terms of search time, computational amount, transformed LSP parameters, SNR, MOS test, synthesized speech waveform, and spectrogram analysis. Search time was reduced by about 37.5%, 46.21%, 46.3%, and 51.29% on average, and the computational amount by about 44.76%, 49.44%, 47.03%, and 57.40%, while the transformed LSP parameters of the proposed methods were the same as those of the real root method.

Encoding of Speech Spectral Parameters Using Adaptive Vector-Scalar Quantization Methods for Mobile Communication Systems

  • Lee, In-Sung;Kim, Jong-Hark
    • The Journal of the Acoustical Society of Korea
    • /
    • v.17 no.4E
    • /
    • pp.35-40
    • /
    • 1998
  • In this paper, an efficient quantization method for line spectrum pairs (LSP) with a cascaded structure of a vector quantizer and a scalar quantizer is proposed. First, the input LSP parameters are vector-quantized using a codebook with a moderate number of entries. In the second stage of quantization, the components of the residual vector are individually quantized by the scalar quantizer. Exploiting the ordering property of LSP parameters and including interframe prediction improve the quantizer performance and remove the need for a stability check routine after quantization. The new vector-scalar hybrid quantizer, using 26 bits/frame, achieves transparent speech quality: the average spectral distortion is 1 dB, and fewer than 2% of frames have a spectral distortion above 2 dB. The performance of the proposed quantization method is also evaluated in the presence of transmission errors.
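
The cascaded structure described above can be sketched as follows. This is only an illustration under stated assumptions: the tiny codebook, the step size, and the number of scalar levels are hypothetical, and the ordering-property and interframe-prediction refinements mentioned in the abstract are left out.

```python
def vq_encode(vec, codebook):
    """Stage 1: index of the nearest codebook entry (squared Euclidean distance)."""
    return min(range(len(codebook)),
               key=lambda i: sum((v - c) ** 2 for v, c in zip(vec, codebook[i])))

def sq_encode(residual, step, levels):
    """Stage 2: uniform scalar quantization of each residual component."""
    half = levels // 2
    return [max(-half, min(half - 1, round(r / step))) for r in residual]

def encode(vec, codebook, step=0.02, levels=8):
    idx = vq_encode(vec, codebook)
    residual = [v - c for v, c in zip(vec, codebook[idx])]
    return idx, sq_encode(residual, step, levels)

def decode(idx, sq_indices, codebook, step=0.02):
    """Reconstruct: codebook entry plus the dequantized residual."""
    return [c + q * step for c, q in zip(codebook[idx], sq_indices)]

# Hypothetical 3-dimensional LSP-like vectors (radians), not real codebook data.
codebook = [[0.3, 0.6, 1.0], [0.5, 1.0, 1.5]]
lsp = [0.52, 0.965, 1.56]
idx, sq_indices = encode(lsp, codebook)
reconstructed = decode(idx, sq_indices, codebook)
```

The vector stage captures the coarse shape with few bits, and the scalar stage bounds the remaining per-component error by half a step as long as the residual stays inside the scalar quantizer's range.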

Performance Improvement of Sound Direction of Arrival Estimation by Applying Threshold to CPSP (CPSP 문턱값 설정을 통한 음원도달 방향 추정 성능 개선)

  • Quan, Xingri;Bae, Keun-Sung
    • Phonetics and Speech Sciences
    • /
    • v.3 no.3
    • /
    • pp.109-114
    • /
    • 2011
  • To estimate the sound direction of arrival with a pair of microphones, a method based on Time Difference of Arrival (TDOA) estimation using the Cross Power Spectrum Phase (CPSP) function is widely used because of its simplicity and good performance. In this paper, we investigate CPSP maximum values for various SNRs and adverse environments, and propose a novel method to improve the estimation of the sound direction of arrival. The proposed method applies a threshold to the CPSP values and thereby increases the reliability of the estimated sound direction. Through computer simulation for various SNRs, we validate the effectiveness of the proposed method. When the threshold was set to 0.1, a success rate of more than 90% was achieved for sound direction of arrival estimation at directions of $10^{\circ}$, $40^{\circ}$, and $70^{\circ}$ from the source location, even with a reverberation time of 0.1 s.
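
A thresholded CPSP estimate of the kind described above might look like the sketch below. It is an illustration under assumptions, not the paper's code: the function names are hypothetical, a naive DFT is used to stay dependency-free, the CPSP is normalized so that its peak is at most 1, and the far-field relation between delay and angle (measured from broadside) is assumed.

```python
import cmath
import math
import random

def dft(x, inverse=False):
    """Naive O(n^2) DFT, adequate for short illustrative frames."""
    n = len(x)
    sign = 1.0 if inverse else -1.0
    out = [sum(v * cmath.exp(sign * 2j * math.pi * k * t / n) for t, v in enumerate(x))
           for k in range(n)]
    return [v / n for v in out] if inverse else out

def cpsp_lag(x1, x2, threshold=0.1):
    """Lag (in samples) by which x2 trails x1, or None if the CPSP peak is too weak."""
    n = len(x1)
    cross = [a.conjugate() * b for a, b in zip(dft(x1), dft(x2))]
    # Keep only the phase of the cross-power spectrum.
    phase = [c / abs(c) if abs(c) > 1e-12 else 0j for c in cross]
    cpsp = [v.real for v in dft(phase, inverse=True)]
    peak = max(range(n), key=lambda i: cpsp[i])
    if cpsp[peak] < threshold:
        return None  # estimate judged unreliable, so it is rejected
    return peak if peak <= n // 2 else peak - n

def lag_to_angle(lag, fs, mic_distance, speed_of_sound=343.0):
    """Far-field direction in degrees from broadside for a two-microphone pair."""
    s = max(-1.0, min(1.0, lag * speed_of_sound / (fs * mic_distance)))
    return math.degrees(math.asin(s))

# Hypothetical test signal: white noise and a copy delayed by 5 samples (circularly).
rng = random.Random(0)
x1 = [rng.gauss(0.0, 1.0) for _ in range(64)]
x2 = [x1[(t - 5) % 64] for t in range(64)]
lag = cpsp_lag(x1, x2)
```

Returning `None` below the threshold is what trades a few missing estimates for more reliable ones; a real system would use an FFT and longer frames.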

A Comparative Analysis on English Vowels of Korean Students by Formant Frequencies (포먼트에 의한 영어모음 비교 분석)

  • Hwang, Young-Soon
    • Speech Sciences
    • /
    • v.8 no.4
    • /
    • pp.221-228
    • /
    • 2001
  • The purpose of this study is to analyze, by measuring formant frequencies, the problems that Korean students, accustomed to the acoustic structure of Korean vowels, have when they pronounce English vowels. The experimental results show that the pronunciation of English vowels by Korean students is partially influenced by their Korean vowels. There is little distinction between /i/ and /I/, or between /U/ and /u/, because Korean pronunciation lacks a contrast between short and long vowels. Also, as observed in typical Korean vowel pronunciation, there is little difference between the F1 values of /$\varepsilon$/ and /æ/ for Korean speakers, resulting in inaccurate English pronunciation. In addition, compared to native English speakers, Korean speakers show the biggest difference in the F1 value of /ɔ/. The fact that their pronunciation of /ɔ/ covers the /e/, /$\Lambda$/, and /ɔ/ positions probably accounts for this phenomenon. The results of this experiment show the interference of Korean in some English vowels produced by native Korean speakers.

A Study of the SPR (Singing Power Ratio) on the Singing Voice in Singing Students (성악 전공 학생의 가칭 시 음성의 SPR(Singing Power Ratio)에 관한 연구)

  • Jo, Sung-Mi;Jeong, Ok-Ran;Lee, Sang-Ouk
    • Speech Sciences
    • /
    • v.11 no.4
    • /
    • pp.121-127
    • /
    • 2004
  • This study attempted to provide a spectrum analysis for the quantitative evaluation of the singing voice quality of singing students, rather than relying on the presence or absence of the singer's formant. Regression analysis was used to analyze the relationship between ringing quality, SPR, and SPP in the singing voice of college students majoring in music. The singing power ratio (SPR) was measured in 41 singing students. Digital audio recordings of sung vowels were made for acoustic analysis. Each sample was judged by 1 experienced singing teacher and 4 voice pathologists on a semantic bipolar 7-point scale (ringing-dull). The results showed that the SPR and SPP had significant correlations with ringing quality. The SPR can therefore be an important quantitative measure for evaluating singing voice quality.
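
As a rough illustration of the measure discussed above, the sketch below computes an SPR value from a short frame. The abstract does not state the definition it uses; this assumes the commonly cited one, namely the level difference in dB between the strongest spectral peak in 2-4 kHz and the strongest peak in 0-2 kHz. The function names and the synthetic two-tone signal are hypothetical.

```python
import cmath
import math

def band_peak(samples, fs, f_lo, f_hi):
    """Largest DFT magnitude among bins with f_lo <= f < f_hi (naive DFT)."""
    n = len(samples)
    best = 0.0
    for k in range(n // 2 + 1):
        f = k * fs / n
        if f_lo <= f < f_hi:
            x = sum(s * cmath.exp(-2j * math.pi * k * t / n) for t, s in enumerate(samples))
            best = max(best, abs(x))
    return best

def singing_power_ratio(samples, fs):
    """SPR in dB under the assumed definition; expects a non-silent frame."""
    low = band_peak(samples, fs, 0.0, 2000.0)
    high = band_peak(samples, fs, 2000.0, 4000.0)
    return 20.0 * math.log10(high / low)

# Synthetic frame: a 500 Hz tone plus a 3 kHz tone at one tenth of its amplitude.
fs = 16000
frame = [math.sin(2 * math.pi * 500 * t / fs) + 0.1 * math.sin(2 * math.pi * 3000 * t / fs)
         for t in range(320)]
spr = singing_power_ratio(frame, fs)
```

Under this definition, a value closer to 0 dB means relatively more energy in the 2-4 kHz region; real recordings would need windowing and longer analysis frames.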