• Title/Abstract/Keywords: Speech Spectrogram

90 search results; processing time 0.024 s

Generalization of DUET Using Neighborhood Relationship

  • 우성민;정홍
    • 대한전자공학회:학술대회논문집 / 대한전자공학회 2008년도 하계종합학술대회 / pp.1017-1018 / 2008
  • In this paper, we propose a method that uses the neighborhood relationship in the 2D spectrogram of the separated sources to generalize the binary mask of the Degenerate Unmixing Estimation Technique (DUET). The new generalized mask can consist of five to ten masks. The original power of the spectrogram at each time-frequency point is assigned according to the new mask. The result showed a smoother, gentler waveform, indicating higher speech separation performance than the original method.

  • PDF
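The abstract above does not say how the neighborhood is used, so the following is only a minimal sketch of one way a binary time-frequency mask could be graded by its neighbors. It assumes a 4-connected neighborhood and replaces each 0/1 entry with the fraction of ones among the cell and its existing neighbors; the function names `soften_mask` and `apply_mask` are hypothetical and are not taken from the paper.

```python
def soften_mask(binary_mask):
    """Grade a 0/1 mask (rows = frequency bins, columns = time frames).

    Assumption: each cell becomes the fraction of ones among itself and
    its 4-connected neighbors; border cells use only neighbors that exist.
    """
    rows, cols = len(binary_mask), len(binary_mask[0])
    soft = [[0.0] * cols for _ in range(rows)]
    for f in range(rows):
        for t in range(cols):
            cells = [(f, t), (f - 1, t), (f + 1, t), (f, t - 1), (f, t + 1)]
            vals = [binary_mask[i][j] for i, j in cells
                    if 0 <= i < rows and 0 <= j < cols]
            soft[f][t] = sum(vals) / len(vals)
    return soft


def apply_mask(mask, power):
    """Assign the spectrogram power at each point, weighted by the mask."""
    return [[m * p for m, p in zip(mask_row, power_row)]
            for mask_row, power_row in zip(mask, power)]


# Toy 3x3 example: one source dominates the upper-left corner.
binary = [[1, 1, 0],
          [1, 1, 0],
          [0, 0, 0]]
soft = soften_mask(binary)
```

With a graded mask, points on the boundary between sources receive partial weight instead of a hard 0 or 1, which is one plausible reason a reconstructed waveform would look smoother.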

Target Speaker Speech Restoration via Spectral Bases Learning

  • 박선호;유지호;최승진
    • 한국정보과학회논문지:소프트웨어및응용 / Vol. 36, No. 3 / pp.179-186 / 2009
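  • This paper proposes a target-speaker speech restoration algorithm that uses stereo microphones in real environments with noise and reverberation, for the case where utterances of the target speaker are available for training. To this end, we combine convolutive blind source separation (CBSS), which separates sources in reverberant environments, with a post-processing method, yielding a system that removes noise and reverberation from noisy convolutive mixtures and restores only the target speaker's speech. Specifically, we use non-negative matrix factorization (NMF) to learn, from the target speaker's training speech, basis vectors that preserve its spectral characteristics, and we propose two post-processing steps based on these basis vectors. First, CBSS, the intermediate stage of the system, takes the convolutive mixtures as input and outputs independent sources (two channels), and the channel closer to the target speaker's speech is selected automatically (channel selection step). Next, the noise and interference sources remaining in the selected channel are removed, so that the target speaker's speech is finally restored free of noise and reverberation (restoration step). Because both post-processing steps operate on basis vectors learned from the target speaker's speech, the speaker's characteristic spectral information can be used efficiently for restoration. The paper thus presents a way to combine prior information about the source with CBSS, and proposes a system that improves on the separation results of conventional CBSS while restoring only the target speaker's speech. Experiments confirm that the proposed method successfully restores the target speaker's speech in noisy, reverberant environments.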
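The basis-learning step described above can be illustrated with a small NMF sketch. The abstract does not state which NMF cost or update rule the authors use, so this example assumes the standard multiplicative updates for the squared Euclidean cost; `V` stands in for a magnitude spectrogram (rows = frequency bins, columns = frames), and the toy values and helper names are hypothetical.

```python
import random


def matmul(a, b):
    """Plain-Python matrix product of two lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]


def transpose(a):
    return [list(col) for col in zip(*a)]


def frob_error(v, w, h):
    """Squared Frobenius distance between V and the product W H."""
    wh = matmul(w, h)
    return sum((x - y) ** 2
               for v_row, a_row in zip(v, wh) for x, y in zip(v_row, a_row))


def nmf(v, rank, iters=200, seed=0, eps=1e-9):
    """Factor V ~ W H with non-negative W (spectral bases) and H (activations)."""
    rng = random.Random(seed)
    n, m = len(v), len(v[0])
    w = [[rng.random() + 0.1 for _ in range(rank)] for _ in range(n)]
    h = [[rng.random() + 0.1 for _ in range(m)] for _ in range(rank)]
    for _ in range(iters):
        # H <- H * (W^T V) / (W^T W H)
        wt = transpose(w)
        num, den = matmul(wt, v), matmul(matmul(wt, w), h)
        h = [[h[i][j] * num[i][j] / (den[i][j] + eps) for j in range(m)]
             for i in range(rank)]
        # W <- W * (V H^T) / (W H H^T)
        ht = transpose(h)
        num, den = matmul(v, ht), matmul(w, matmul(h, ht))
        w = [[w[i][j] * num[i][j] / (den[i][j] + eps) for j in range(rank)]
             for i in range(n)]
    return w, h


# Toy "spectrogram": 3 frequency bins x 3 frames.
V = [[1.0, 2.0, 3.0],
     [2.0, 4.0, 6.0],
     [0.0, 1.0, 0.0]]
W, H = nmf(V, rank=2)
```

In a setting like the one the abstract describes, the columns of `W` learned from training speech would be kept fixed and reused in the later channel-selection and restoration steps; how the authors do that is not specified here.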

The Study on the Acoustical Characteristics and Speech Intelligibility of Vowels Produced by the Maxillectomized Patients before and after Obturator-Wearing

  • 최성희;정문규;김호중;표화영;심현섭;최홍식
    • 대한후두음성언어의학회지 / Vol. 10, No. 2 / pp.140-148 / 1999
  • An obturator is a prosthetic rehabilitation device that restores the shape and function of a defective maxilla in patients with a palatal defect. An obturator can change the shape of the vocal tract and the degree of nasality, but few reports have described the effects of this change. The authors therefore conducted an experimental study to compare the sizes of the vowel triangles produced by maxillectomized patients before and after wearing an obturator, and to consider how much improvement in speech intelligibility can be expected from wearing one. Eight patients who had undergone total maxillectomy for palatal cancer participated as subjects. They produced five vowels (/a/, /i/, /u/, /e/, /o/) before and after wearing the obturator. The formants of the vowels were analyzed with the spectrogram of CSL, and speech intelligibility was judged by eight normal listeners. The frequencies of the first and second formants showed no significant difference before and after wearing, but the size of the vowel triangle, which is related to speech intelligibility, differed significantly: the vowel triangle after wearing was larger than before wearing. Before wearing, /i/ had the lowest speech intelligibility score among the vowels. After wearing the obturator, scores increased overall, especially for /a/, but the intelligibility of /u/ decreased.

  • PDF
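The vowel-triangle comparison in the abstract above can be made concrete with a short area calculation. This is a minimal sketch that assumes the triangle corners are /a/, /i/ and /u/ plotted in the F1-F2 plane and uses the shoelace formula; the formant values below are hypothetical placeholders, not data from the study.

```python
def vowel_triangle_area(a, i, u):
    """Area (Hz^2) of the triangle with corners given as (F1, F2) pairs.

    Assumption: the corners are the vowels /a/, /i/ and /u/.
    """
    (x1, y1), (x2, y2), (x3, y3) = a, i, u
    # Shoelace formula for the area of a triangle from its vertices.
    return abs(x1 * (y2 - y3) + x2 * (y3 - y1) + x3 * (y1 - y2)) / 2


# Hypothetical (F1, F2) values in Hz, for illustration only.
area = vowel_triangle_area(a=(800, 1300), i=(300, 2300), u=(350, 800))
```

Computing this area once per recording condition gives a single number per speaker that can be compared before and after obturator wearing.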

The Speech of Cleft Palate Patients using Nasometer, EPG and Computer based Speech Analysis System

  • 신효근;김오환;김현기
    • 음성과학 / Vol. 4, No. 2 / pp.69-89 / 1998
  • The aim of this study is to develop an objective method of speech evaluation for children with cleft palates. To assess velopharyngeal function, Visi-Pitch, Computerized Speech Lab (CSL), Nasometer and Palatometer were used. Acoustic parameters were measured according to the diagnostic instrument: pitch (Hz), sound pressure level (dB), jitter (%) and diadochokinetic rate with Visi-Pitch; VOT and vowel formants ($F_1$ and $F_2$) with a spectrograph; and the degree of hypernasality with the Nasometer. In addition, the Palatometer was used to find the lingual-palatal contact patterns of cleft palate speech. Ten children with cleft palates and fifty normal children participated in the experiment. The results are as follows: (1) The higher nasalance of children with cleft palates indicated a resonance disorder. (2) The cleft palate group showed palatal misarticulation and lateral misarticulation on the palatogram. (3) Children with cleft palates showed phonatory and respiratory problems. The duration of sustained vowels was shorter, and the pitch higher, in children with cleft palates than in the control group. However, intensity, jitter and diadochokinetic rate were lower in children with cleft palates than in the control group. (4) On the spectrogram, the VOT of children with cleft palates was longer than that of the control group, and $F_1$ and $F_2$ were lower than in the control group.

  • PDF

Development of Realtime Phonetic Typewriter

  • 조우연;최두일
    • 대한전기학회:학술대회논문집 / 대한전기학회 1999년도 추계학술대회 논문집 학회본부 B / pp.727-729 / 1999
  • We have developed a real-time phonetic typewriter implemented on an IBM PC with a sound card, running Windows 95. In this system, speech signal analysis, neural network learning, labeling of output neurons and visualization of recognition results are performed in real time. A development environment for speech processing was built by adding functions such as editing, saving and loading speech data and displaying the spectrogram in 3-D or gray level. In recognition experiments with Korean phones, accuracy was 71.42% for 13 basic consonants and 90.01% for 7 basic vowels.

  • PDF

Speech Enhancement Using Level Adapted Wavelet Packet with Adaptive Noise Estimation

  • Chang, Sung-Wook;Kwon, Young-Hun;Jung, Sung-Il;Yang, Sung-Il;Lee, Kun-Sang
    • The Journal of the Acoustical Society of Korea / Vol. 22, No. 2E / pp.87-92 / 2003
  • In this paper, a new speech enhancement method using a level adapted wavelet packet is presented. First, we propose a level adapted wavelet packet to alleviate a drawback of the conventional node adapted one in noisy environments. Next, we suggest an adaptive noise estimation method at each node of the level adapted wavelet packet tree. Then, for more accurate subtraction of the noise component, we propose a new method for estimating the spectral subtraction weight. Finally, we present a modified spectral subtraction method. The proposed method is evaluated under various noise conditions: speech babble noise, F-16 cockpit noise, factory noise, pink noise, and Volvo car interior noise. An SNR test was performed as an objective evaluation, and a spectrogram test and a very simple listening test were performed as a subjective evaluation.
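For readers unfamiliar with the terms in the abstract above, the following is a minimal sketch of plain magnitude spectral subtraction with a subtraction weight, together with an SNR measure in dB. It is not the level adapted wavelet packet method of the paper: the fixed `weight`, the spectral `floor`, and the function names are assumptions chosen for illustration.

```python
import math


def spectral_subtract(noisy_mag, noise_mag, weight=1.0, floor=0.01):
    """Subtract a weighted noise estimate from each noisy magnitude.

    Assumption: a fixed weight and a spectral floor (a small fraction of
    the noisy magnitude) that keeps the result from going negative.
    """
    return [max(y - weight * n, floor * y)
            for y, n in zip(noisy_mag, noise_mag)]


def snr_db(clean, estimate):
    """Signal-to-noise ratio in dB of an estimate against a clean reference."""
    signal = sum(c * c for c in clean)
    noise = sum((c - e) ** 2 for c, e in zip(clean, estimate))
    return 10 * math.log10(signal / noise)


# Toy magnitudes for three frequency bins.
enhanced = spectral_subtract([1.0, 0.5, 0.2], [0.2, 0.2, 0.3])
```

In the third bin the noise estimate exceeds the noisy magnitude, so the floor applies instead of a negative value. An adaptive scheme such as the one the abstract mentions would estimate the noise and the weight per node rather than fixing them.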

Formant Measurements of Complex Waves and Vowels Produced by Students

  • 양병곤
    • 음성과학 / Vol. 15, No. 3 / pp.39-51 / 2008
  • Formant measurements are one of the most important factors in objectively testing cross-linguistic differences among vowels produced by speakers of any given language. However, many speech analysis software packages present erroneous estimates, and some researchers use them without any verification procedure. The purposes of this paper are to examine formant measurements of complex waves, synthesized from the average formant values of five Korean vowels, using three default methods in Praat, and to verify the measured values of the five vowels produced by 20 students using one of the methods. Variances along the time axis are discussed after determining the absolute difference sum from the 1/3 vowel duration point. Results show that the burg method produced smaller measurement errors. Greater errors were observed with the sl and lpc methods, mostly caused by inappropriate formant settings. Formant measurement deviations were greater in vowels produced by the female students than in those of the male students, which was mostly attributed to the settings for the vowels /o, u/. Formant settings can best be corrected by changing the number of formants to the number of visible dark bands on the spectrogram. These results suggest that researchers should check the validity of the estimates from speech analysis software. Further studies are recommended on perception tests comparing the original sound with the sound synthesized from the estimated formant values.

  • PDF

Speech Problems of English Laterals by Korean Learners based on the acoustic Characteristics

  • 김종구;김현기;전병만
    • 음성과학 / Vol. 7, No. 3 / pp.127-138 / 2000
  • The aim of this paper is to identify the speech problems Korean learners have with English laterals and to contribute to effective pronunciation education by visualizing pronunciation. We analyzed 18 words containing lateral sounds, divided into five positions: initial, initial consonant cluster, intervocalic, final consonant cluster, and final. To analyze the words we used the High speed speech analysis system. We examined the acoustic characteristics of English laterals on the spectrogram using voice sustained time (ms), FL1, FL2 and FL3. Beforehand, we had expected the results to show mother tongue interference in the final sounds, because Korean has similar sounds. The experiments showed that, in initial position, voice sustained time differed considerably between Korean and native pronunciation. Korean speakers also applied the syllable structure of their mother tongue. For instance, in the initial consonant cluster CCVC, Koreans often produced CC as one syllable and VC as another, which was due to mother tongue interference. For the same reason, differences between Korean and native speakers appeared in the intervocalic and final positions. We therefore recommend adopting a visualized analysis system in pronunciation instruction.

  • PDF

A Study on the Effects of Speech Training for Adults Focusing on the Analysis of Voices Before and After Speech Training

  • 정은이;이상호
    • 디지털콘텐츠학회 논문지 / Vol. 18, No. 6 / pp.1049-1056 / 2017
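  • This study focuses on changes in the speaker's voice as a measure of the effect of speech training. Among the practical effects gained through speech training, it seeks to evaluate changes in the voice in a more visible and scientific way. The results showed objective changes in every learner's voice compared with before the training. All learners improved gradually in various voice elements, including resonance, timbre, accuracy of pronunciation, and pauses. That is, compared with before the training, their voices became richer, their pronunciation more accurate, and their use of pauses more effective and stable. Based on these results, which examine whether speech training can change the voice, it is expected that speech learners will engage actively in speech training and improve their speech skills.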

The Electropalatographic Evidence of the Korean Flap: An Intervocalic Korean Liquid Sound

  • Ahn, Soo-Woong
    • 음성과학 / Vol. 9, No. 3 / pp.155-168 / 2002
  • The intervocalic Korean liquid sound has been recognized as a flap in studies of the Korean language, but there has been very little experimental data corroborating this. An electropalatographic (EPG) experiment was conducted to test it. The subjects were one Korean speaker and one native English speaker, each of whom had a pseudopalate and performed the EPG experiment at the UCLA phonetics laboratory. Spectrographic evidence of the flaps in both the English t-flap and the Korean liquid flap was also sought. The English and Korean flaps were placed between mid/low back vowels so that the vowels themselves would not affect palatal contacts of the tongue. The results confirmed that the Korean liquid is realized as a flap in intervocalic position, with many properties similar to the English flap in both EPG and spectrographic data. The Korean initial liquid sound in borrowed words such as 'rotary' and 'radio' was also a flap. But the Korean liquid in word-final and geminate positions was a lateral, as in the words 'dol' (stone), 'dollo' (with stone), 'nal' (day) and 'nallara' (carry). The intuitive theory of the Korean liquid flap was supported by the EPG and spectrographic data.

  • PDF