• Title/Summary/Keyword: speech waveform

Real-time Implementation of Variable Transmission Bit Rate Vocoder Integrating G.729A Vocoder and Reduction of the Computational Amount SOLA-B Algorithm Using the TMS320C5416 (TMS320C5416을 이용한 G.729A 보코더와 계산량 감소된 SOLA-B 알고리즘을 통합한 가변 전송율 보코더의 실시간 구현)

  • 함명규;배명진
    • Journal of the Institute of Electronics Engineers of Korea SP / v.40 no.6 / pp.84-89 / 2003
  • In this paper, we implement in real time, on the TMS320C5416, a variable-bit-rate vocoder that combines Hejna's SOLA-B algorithm with the 8 kbps ITU-T G.729A vocoder. The proposed method uses SOLA-B to shorten the duration of the speech before encoding and to restore normal playback speed by lengthening the speech after decoding. To reduce the computational load of SOLA-B, the cross-correlation function is evaluated only at every third sample. The real-time implementation of G.729A with SOLA-B has a maximum complexity of 10.2 MIPS in the encoder and 2.8 MIPS in the decoder at 8 kbps. At 6 kbps the maximum complexity is 18.5 MIPS in the encoder and 13.1 MIPS in the decoder, and at 4 kbps it is likewise 18.5 MIPS in the encoder and 13.1 MIPS in the decoder. Memory usage is about 9.7 kwords of program ROM, 4.5 kwords of table ROM, and 5.1 kwords of RAM. The output waveform is shown to be bit-exact with the result of the C simulator. In a speech-quality evaluation of the real-time variable-bit-rate vocoder, the MOS was estimated at 3.69 at 4 kbps.
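
The lag-skipping idea in this abstract can be sketched as follows. This is a minimal illustration, not the authors' DSP code: it assumes that "every third sample" means the normalized cross-correlation is evaluated only at every third candidate lag, and the names `normalized_xcorr`, `best_overlap_lag`, and `lag_step` are hypothetical.

```python
import math


def normalized_xcorr(a, b):
    """Normalized cross-correlation of two equal-length sequences."""
    num = sum(x * y for x, y in zip(a, b))
    den = math.sqrt(sum(x * x for x in a) * sum(y * y for y in b))
    return num / den if den > 0 else 0.0


def best_overlap_lag(reference, frame, max_lag, overlap, lag_step=3):
    """Find the lag in 0..max_lag where `frame` best matches `reference`.

    Only every `lag_step`-th lag is evaluated (assumed reading of the
    abstract), which cuts the number of correlations by about lag_step.
    Returns (best_lag, best_score).
    """
    best_lag, best_score = 0, -2.0  # scores lie in [-1, 1]
    for lag in range(0, max_lag + 1, lag_step):
        segment = frame[lag:lag + overlap]
        if len(segment) < overlap:
            break  # not enough samples left in the frame
        score = normalized_xcorr(reference[:overlap], segment)
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag, best_score
```

With `lag_step=3` roughly one third of the candidate lags are tested, so the alignment found may be up to one or two samples away from the true optimum; that trade-off between accuracy and MIPS is the point of the sketch.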

A study on the correlation between Sound Characteristic and Sasang Constitution by Laryngograph, EGG (Laryngograph와 EGG를 이용한 음향특성(音響特性)과 사상체질간(四象體質間)의 상관성(相關性) 연구(硏究))

  • Kim, Sun-hyung;Shin, Mi-ran;Kim, Dal-rae;Kwon, Ki-rok
    • Journal of Sasang Constitutional Medicine / v.12 no.1 / pp.144-156 / 2000
  • The purpose of this study is to aid the classification of Sasang Constitution through its correlation with the laryngeal waveform, on the assumption that the two are correlated. Analysis with an EGG program yielded the following results on the correlation between the electroglottograph waveform and Sasang Constitution. 1. Taeumin was lower than Soyangin in Open Std Deviation and Contact Std Deviation for male /a/ (0.5 sec). 2. Soyangin was higher than the others in Pitch Range for male /a/ (2.5 sec). 3. For female /e/ (0.5 sec), Taeumin was higher than Soeumin in Pitch Range, higher than Soyangin in Pitch Maximum, and higher than the others in Pitch Std Deviation. 4. Taeumin was higher than Soeumin in Contact Maximum and lower than Soeumin in Contact Maximum for female /a/ (2.5 sec). 5. There was no significant difference for male /e/ (0.5 sec), male /e/ (2.5 sec), female /a/ (0.5 sec), or female /e/ (2.5 sec). 6. With the CART algorithm, the percentage correctly classified was high for Soeumin and Taeumin, while the risk estimate for Soyangin was relatively high. The study may serve as one method for making objective diagnoses of Sasang Constitution.

A STUDY ON THE INFLUENCE OF THE PALATAL PLATES UPON THE DURATION OF KOREAN SOUNDS (구개상 장착에 따른 한국어 어음의 조음시간 변화에 관한 연구)

  • Koh, Yeo-Joon;Kim, Chang-Whe;Kim, Yong-Soo
    • The Journal of Korean Academy of Prosthodontics / v.32 no.1 / pp.77-102 / 1994
  • Many studies have examined the masticatory and esthetic effects of prosthodontic treatment, but few have examined the restoration of pronunciation, especially in complete denture wearers. The purpose of this study is to provide a basis for helping complete denture wearers adapt their speech, by analyzing the influence of palatal coverage on the duration of consonants and vowels with the methods of experimental phonetics. Metal plates and resin plates were made for 3 male subjects in their twenties who had good occlusion and no speech or hearing disorders. Then 8 Korean consonants and 4 Korean vowels were selected, systematically considering phonetic variables such as the place and manner of articulation, lenis/fortis, and the mutual effect of each phoneme. They were combined into meaningless test words of the form /VCV/ and embedded in carrier sentences. Each informant uttered the sentences 1) without a plate, 2) with the metal plate, and 3) with the resin plate. The recorded data were analyzed through the sound waveform and spectrogram using the programs SoundEdit, Signalize, and Statview 512+ for the Macintosh computer. The duration of each segment was measured by locating the boundaries between the preceding vowels and the consonants, and between the consonants and the following vowels. The study led to the following conclusions. 1. With a palatal plate, the duration of all the test words increased, and it increased more with the resin plate than with the metal plate. 2. With a palatal plate, the duration of all the preceding vowels, consonants, and following vowels increased, but the temporal structure of the test words was maintained. 3. As for the manner of articulation, the fricative /s/(ㅅ) was greatly influenced by both kinds of palatal plate. 4. As for the place of articulation, the alveolar sounds /d/(ㄷ) and /n/(ㄴ) were greatly influenced by the kind of palatal plate, and the velar sounds /ŋ/(ㅇ) and /g/(ㄱ) were influenced by the palatal plates, but the kind of palatal plate made no significant difference. 5. As for lenis/fortis, lenis sounds were influenced more by the kind of palatal plate. 6. As for the influence of vowels on each segment in the test words, the palatal vowel /i/(ㅣ) had a greater influence than the pharyngeal vowel /a/(ㅏ), and following vowels had a greater influence than preceding vowels.

Analysis of Phonatory Aerodynamic & Electroglottography of a Countertenor (Countertenor 1인의 Modal Register와 Falsetto Register에서의 공기역학적 변화 및 전기성문파형의 변화 연구)

  • Nam, Do-Hyun;Choi, Seong-Hee;Choi, Jae-Nam;Choi, Hong-Shik
    • Journal of the Korean Society of Laryngology, Phoniatrics and Logopedics / v.17 no.1 / pp.43-48 / 2006
  • Background and Objectives: Countertenors can produce a high vocal pitch similar to that of a female classical singer and use both the modal and falsetto registers. This study examined the phonatory characteristics of the modal and falsetto registers of a countertenor. Materials and Methods: A male countertenor with 8 years of experience was examined using videostroboscopy. His voice was analyzed with aerodynamic measures, namely fundamental frequency (F0), mean air flow rate (MFR), intensity (SPL), and subglottal air pressure (Psub), using a phonatory function analyzer (Nagashima), and with acoustic measures, namely jitter, shimmer, HNR, and closed quotient (CQ), using the electroglottography (EGG) of Lx Speech Studio (Laryngograph Ltd, UK) and the voice range profile (VRP) of CSL (Kay Elemetrics). Results: On stroboscopy, the longitudinal length of the vocal folds increased in the falsetto register, and the upper margin of the vocal folds vibrated with incomplete closure of the true vocal folds. In the aerodynamic analysis, intensity was the same in the modal and falsetto registers, but MFR, Psub, and MPT were higher in the falsetto register. In the electroglottographic analysis, CQ was high in the modal register and was also much higher in high-pitch falsetto than in loud falsetto. In the VRP, intensity was similar although F0 differed between the modal and falsetto registers. Conclusion: The results imply that the countertenor could produce a powerful voice quality by increasing respiratory pressure and respiratory volume even though glottal closure was incomplete. In addition, the EGG waveform did not change, and the voice range was similar to that of an alto.

A Study about the Users's Preferred Playing Speeds on Categorized Video Content using WSOLA method (WSOLA를 이용한 동영상 미세배속 재생 서비스에 대한 콘텐츠별 배속 선호도 분석 연구)

  • Kim, I-Gil
    • Journal of Digital Contents Society / v.16 no.2 / pp.291-298 / 2015
  • In a fast-paced information technology environment, consumption of video content is changing from one-way television viewing to VOD (Video on Demand) playback anywhere, anytime, on any device. This viewing trend gives added importance to fine speed control of video, in addition to the strengths of the digital video signal. Currently, many video players provide a fine-speed-control function that can speed up the video to skip a boring part or slow it down to focus on an exciting scene. The audio information is just as important as the visual information for understanding speed-controlled video. Thus, a number of audio-processing algorithms have been proposed to avoid pitch distortion during fine-speed-control playback. In this study, WSOLA (Waveform-Similarity-Based Overlap-Add), a well-known technique for prosodic modification of speech signals, was applied to analyze users' needs for fine-speed-control video playback. By surveying users' preferred speeds for categorized video content and analyzing the results, this paper proposes that a range of fine speed adjustments is needed to accommodate users' preferred ways of consuming video.
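
As a rough illustration of how WSOLA changes playback speed without shifting pitch, the sketch below overlap-adds Hann-windowed frames and, for each frame, searches near the ideal read position for the segment most similar to the natural continuation of the previous one. It is a simplified textbook-style sketch, not the implementation used in the paper; the function name `wsola` and the parameters `frame` and `tolerance` are assumptions chosen for the example.

```python
import math


def wsola(x, speed, frame=256, tolerance=64):
    """Time-scale a mono signal `x` (list of floats) by `speed`.

    speed > 1 shortens the signal, speed < 1 lengthens it. `frame` must be
    even; frames overlap by 50% so the periodic Hann windows sum to one.
    """
    if speed <= 0:
        raise ValueError("speed must be positive")
    hop = frame // 2
    window = [0.5 - 0.5 * math.cos(2 * math.pi * i / frame) for i in range(frame)]
    out = []
    prev_start = None  # input position of the previously chosen segment
    k = 0
    while True:
        ideal = int(round(k * hop * speed))  # where a naive reader would be
        if prev_start is None:
            start = 0
        else:
            target = prev_start + hop  # natural continuation in the input
            lo = max(0, ideal - tolerance)
            hi = min(len(x) - frame, ideal + tolerance)
            if target + frame > len(x) or lo > hi:
                break
            ref = x[target:target + frame]
            # Pick the candidate whose waveform is most similar to `ref`.
            start = max(
                range(lo, hi + 1),
                key=lambda c: sum(a * b for a, b in zip(ref, x[c:c + frame])),
            )
        if start + frame > len(x):
            break
        pos = k * hop
        if len(out) < pos + frame:
            out.extend([0.0] * (pos + frame - len(out)))
        for i in range(frame):
            out[pos + i] += window[i] * x[start + i]
        prev_start = start
        k += 1
    return out
```

For example, `wsola(samples, speed=1.2)` would produce audio about 1/1.2 as long as the input while keeping the local waveform period, and hence the perceived pitch, close to the original. The similarity search is brute force here for clarity; a practical player would use a faster correlation.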