• Title/Summary/Keyword: speech quality

On a Reduction of Pitch Searching Time by Separating the Speech Components in the CELP Vocoder (성분분리에 의한 CELP 보코더의 피치 검색시간 단축에 관한 연구)

  • Hyeon, Jin-Il;Byeon, Gyeong-Jin;Han, Gi-Cheon;Kim, Jong-Jae;Yu, Ha-Yeong;Kim, Jae-Seok;Kim, Dae-Sik;Bae, Myeong-Jin
    • The Journal of the Acoustical Society of Korea, v.14 no.1E, pp.22-29, 1995
  • Code Excited Linear Prediction (CELP) vocoders exhibit good performance at data rates below 4.8 kbps. The major drawback of CELP-type coders is their large amount of computation. In this paper, we propose a new pitch search method that preserves the quality of the CELP vocoder while reducing computational complexity. The basic idea is to determine preliminary pitch candidates from the signal in advance and to perform the pitch search only over those candidates. Applying the proposed method to the CELP vocoder reduces the complexity of the pitch search by about 90%.
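
The idea of searching only around preliminary pitch candidates can be sketched as follows. This is a minimal illustration, not the paper's method: the abstract does not say how the preliminary pitches are obtained, so the coarse lag grid (`step`), the number of kept candidates (`keep`), and the function names are all assumptions, and an open-loop normalized correlation stands in for the closed-loop CELP criterion.

```python
import math

def score(x, lag):
    """Pitch score for one lag: (cross-correlation)^2 / energy of the delayed segment."""
    num = sum(x[n] * x[n - lag] for n in range(lag, len(x)))
    den = sum(x[n - lag] ** 2 for n in range(lag, len(x)))
    return num * num / den if num > 0 and den > 0 else 0.0

def full_search(x, lo, hi):
    """Evaluate every lag in [lo, hi]; return (best_lag, lags_evaluated)."""
    lags = range(lo, hi + 1)
    return max(lags, key=lambda lag: score(x, lag)), len(lags)

def preselected_search(x, lo, hi, step=4, keep=2):
    """Hypothetical two-stage search: coarse grid first, then refine near the best few lags."""
    coarse = list(range(lo, hi + 1, step))
    # Stage 1 (assumed): keep the `keep` best lags from the coarse grid as preliminary pitches.
    prelim = sorted(coarse, key=lambda lag: score(x, lag), reverse=True)[:keep]
    # Stage 2: search every lag within one grid step of each preliminary pitch.
    fine = set()
    for c in prelim:
        fine.update(range(max(lo, c - step + 1), min(hi, c + step - 1) + 1))
    best = max(sorted(fine), key=lambda lag: score(x, lag))
    evaluated = len(coarse) + len(fine - set(coarse))
    return best, evaluated

# Synthetic voiced-like frame with a period of 57 samples (illustrative data only).
frame = [math.sin(2 * math.pi * n / 57) + 0.5 * math.sin(4 * math.pi * n / 57)
         for n in range(320)]
print(full_search(frame, 20, 147))
print(preselected_search(frame, 20, 147))
```

The saving in this toy version comes only from skipping lags far from the preliminary candidates; the size of the reduction depends on `step` and `keep` and is not meant to reproduce the figure reported in the abstract.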

Analysis of Phonatory Aerodynamic & Electroglottography of a Countertenor (Countertenor 1인의 Modal Register와 Falsetto Register에서의 공기역학적 변화 및 전기성문파형의 변화 연구)

  • Nam, Do-Hyun;Choi, Seong-Hee;Choi, Jae-Nam;Choi, Hong-Shik
    • Journal of the Korean Society of Laryngology, Phoniatrics and Logopedics, v.17 no.1, pp.43-48, 2006
  • Background and Objectives: Countertenors can produce a high vocal pitch like that of a female classical singer and use both the modal and falsetto registers. This study was conducted to compare the phonatory characteristics of the modal and falsetto registers of a countertenor. Materials and Methods: A male countertenor with 8 years of experience was examined using videostroboscopy. His voice was analyzed using aerodynamic measures, namely fundamental frequency (F0), mean air flow rate (MFR), intensity (SPL), and subglottal air pressure (Psub), obtained with a phonatory function analyzer (Nagashima); acoustic measures, namely jitter, shimmer, HNR, and closed quotient (CQ), obtained by electroglottography (EGG) with Lx Speech Studio (Laryngograph Ltd, UK); and the voice range profile (VRP) of the CSL (Kay Elemetrics). Results: In the stroboscopic findings, the longitudinal length of the vocal folds increased in the falsetto register, and the upper margin of the vocal folds vibrated with incomplete closure of the true vocal folds. In the aerodynamic analysis, intensity was the same in the modal and falsetto registers, whereas MFR, Psub, and MPT were higher in the falsetto register. In the electroglottographic analysis, the closed quotient (CQ) was high in the modal register and was much higher in the high-pitch falsetto than in the loud falsetto. In the VRP, intensity was similar although F0 differed between the modal and falsetto registers. Conclusion: These results imply that the countertenor could produce a powerful voice quality by increasing respiratory pressure and respiratory volume even though glottal closure was incomplete. In addition, no change in the EGG waveform was observed, and the voice range was similar to that of an alto.

Design and Implementation of A Multi-Point Multimedia Conference System Using IP Grouping (IP 그룹화를 이용한 다자간 멀티미디어 회의시스템의 설계 및 구현)

  • Sung Baek-Kyon;Seong Dong-Su;Lee Keon-Bae;Hyun Don-Whan
    • Journal of Korea Multimedia Society, v.8 no.7, pp.1012-1021, 2005
  • This paper describes the design and implementation of an efficient multi-point multimedia conference system using IP grouping. In existing multi-point multimedia conference systems, it is difficult for multiple users to cooperate efficiently because of bandwidth limitations on the transmission of video, audio, and documents. When multiple users share limited bandwidth, smooth cooperation is not achieved because of delay in the real-time transmission of image and speech data. In this paper, a hybrid transfer method that mixes distributed and centralized methods is used for smooth cooperation, and the required network bandwidth is reduced by organizing the multi-user conference system with IP grouping. In addition, adaptive image frame variation is used to relieve the bottleneck effect that depends on the number of users. The resulting multi-user conference system is designed to maintain audio quality.

An Algorithm on Improving a Pitch Searching by Energy Compensation in a Frame for Vocoder (보코더에서 프레임별 에너지 보상에 의한 피치검색 성능 개선에 관한 연구)

  • Baek, Geum-Ran;Min, So-Yeon;Bae, Myung-Jin
    • Journal of the Korea Academia-Industrial cooperation Society, v.13 no.7, pp.3188-3193, 2012
  • Pitch search is an important part of a vocoder. The major drawback of vocoders is the large amount of computation required to search the pitch and the codebook. In this paper, a simple method is proposed to improve the pitch search process in the pitch filter with almost no degradation of quality. The periodicity of the speech signal is emphasized in the pitch search by using the Dual Pulse technique, a type of autocorrelation method. An incorrect pitch is sometimes obtained through halving, doubling, or tripling. To address this, before searching for the pitch, we estimate the energy rate within a frame and compensate the envelope of the signal with it. Using the proposed algorithm in the pitch search reduces the required computation and improves the pitch search.
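
As a rough illustration of compensating frame energy before an autocorrelation-style pitch search, the sketch below equalizes the RMS level of each subframe and then picks the lag with the highest normalized correlation. It is an assumption-laden stand-in: the abstract does not define the energy rate or the Dual Pulse technique, so the subframe count `n_sub`, the RMS equalization rule, and all function names are hypothetical.

```python
import math

def rms(seg):
    """Root-mean-square level of a list of samples."""
    return math.sqrt(sum(v * v for v in seg) / len(seg))

def compensate_energy(frame, n_sub=4):
    """Assumed compensation rule: scale each subframe to the RMS of the whole frame."""
    size = len(frame) // n_sub
    target = rms(frame)
    out = []
    for i in range(n_sub):
        start = i * size
        end = len(frame) if i == n_sub - 1 else start + size
        seg = frame[start:end]
        level = rms(seg)
        gain = target / level if level > 0 else 1.0
        out.extend(v * gain for v in seg)
    return out

def estimate_pitch(frame, lo, hi):
    """Return the lag in [lo, hi] with the highest normalized cross-correlation."""
    def ncc(lag):
        a = frame[lag:]
        b = frame[:len(frame) - lag]
        num = sum(p * q for p, q in zip(a, b))
        den = math.sqrt(sum(p * p for p in a) * sum(q * q for q in b))
        return num / den if den > 0 else 0.0
    return max(range(lo, hi + 1), key=ncc)

# Illustrative frame: period of 50 samples with a rising amplitude envelope.
periodic = [math.sin(2 * math.pi * n / 50) + 0.5 * math.sin(4 * math.pi * n / 50)
            for n in range(200)]
faded = [v * (0.2 + 0.8 * n / 199) for n, v in enumerate(periodic)]
flat = compensate_energy(faded)
print(estimate_pitch(flat, 20, 90))
```

The lag range is kept below twice the true period here, so this toy search cannot show pitch doubling; a wider range would be needed to study the halving and doubling errors the abstract mentions.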

A study on the lip shape recognition algorithm using 3-D Model (3차원 모델을 이용한 입모양 인식 알고리즘에 관한 연구)

  • 김동수;남기환;한준희;배철수;나상동
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference, 1998.11a, pp.181-185, 1998
  • Recently, research and development on communication systems has moved toward using voice data together with the speaker's face image to obtain a higher recognition rate than with voice data alone. We therefore present a method of lipreading in speech image sequences using a 3-D facial shape model. The method uses feature information from the face image such as the degree of lip opening, the movement of the jaw, and the protrusion height of the lips. First, we fit the 3-D face model to the image sequence of the speaking face. Then, to obtain the feature information, we compute the amount of variation in the fitted 3-D shape model over the image sequence and use it as the recognition parameters. We use the intensity gradient values obtained from the variation of the 3-D feature points to separate the recognition units in the image sequence. We then use a discrete HMM algorithm in the recognition process, based on multiple observation sequences that fully reflect the variation of the 3-D feature points. In a recognition experiment with 8 Korean vowels and 2 Korean consonants, we obtained a recognition rate of about 80% for the plosives and vowels.

Design of The Loudness Ratings And Talker Echo For ISDN Telephone (ISDN 전화기의 음량 정격 및 송화자 에코설계)

  • Hong, Jin-Woo;Kang, Kyeong-Ok;Kang, Seong-Hoon
    • The Journal of the Acoustical Society of Korea, v.13 no.2E, pp.32-40, 1994
  • The purpose of this paper is to describe methods for establishing the loudness ratings and talker echo, among the transmission quality factors of an ISDN telephone connected to a fully digital network. To design suitable loudness ratings and talker echo for the ISDN telephone, a model system of digital speech communication was developed for subjective tests. Using this model system, opinion tests were performed to decide the optimal CODEC input level, the range of the overall loudness rating, the sidetone masking rating, and the talker echo. From the test results, we determined that the loudness ratings should be 6 to 8 dB for sending, 0 to 2 dB for receiving, and 8 to 12 dB for the sidetone masking rating. In addition, a weighted terminal coupling loss (TCLw) of at least 40 dB is necessary to provide echo-free telephone communication to telephone users when the overall loudness rating of the ISDN telephone is normalized to 10 dB.

Transcoding Algorithm for SMV and G.729A Vocoders via Direct Parameter Transformation (G.729A와 SMV 음성부호화기를 위한 파라미터 직접 변환 방식의 상호부호화 알고리듬)

  • 장달원;서성호;이선일;유창동
    • Journal of the Institute of Electronics Engineers of Korea SP, v.40 no.6, pp.71-83, 2003
  • In this paper, a novel transcoding algorithm between the G.729A and Selectable Mode Vocoder (SMV) speech coders via direct parameter transformation is proposed. In contrast to the conventional tandem transcoding algorithm, the proposed algorithm converts the parameters of one coder into those of the other without going through the decoding and encoding processes. For the transcoder from SMV to G.729A, an LSP conversion algorithm, a pitch delay conversion algorithm, and a transcoding algorithm for the lower rates are proposed; for the transcoder from G.729A to SMV, an LSP conversion algorithm, a pitch delay conversion algorithm, and a rate selection algorithm are proposed. Evaluation results show that, while exhibiting better computational and delay characteristics, the proposed algorithm produces speech quality equivalent to or better than that of the tandem transcoding algorithm.

Analysis of Phonatory Aerodynamic & E.G.G. during Passaggio of the Trained Male Singers (남성성악가의 Vocal Register Transition(Passaggio)시 공기역학적 변화와 EGG의 변화 연구)

  • Nam, Do-Hyun;Choi, Seong-Hee;Choi, Jae-Nam;Choi, Hong-Shik
    • Journal of the Korean Society of Laryngology, Phoniatrics and Logopedics, v.15 no.1, pp.21-26, 2004
  • Vocal register transition (passaggio) is one of the most important vocal techniques for classically trained male singers (tenors). Passaggio bridges the chest register and the head register without a noticeable voice break, so that the vocalist gets the feeling that the voice is not locked into a particular register. The purpose of this study was to clarify the differences among an easy tone ($B_3$), a high tone without passaggio ($F\#_4$), and a high tone with passaggio ($F\#_4$). We selected 6 trained singers (tenors) who had more than 12.6 years of experience and were well trained in the passaggio technique. Fundamental frequency (F0), mean flow rate (MFR), intensity (I), and subglottal pressure (Psub) were measured simultaneously using a phonatory function analyzer (Nagashima), and closed quotient (CQ), jitter, shimmer, and NHR were measured by electroglottography (EGG) with Lx Speech Studio (Laryngograph Ltd, London, UK); vocal efficiency was calculated by Carroll's method. For each tenor, the target tone /a/ was measured in three conditions: 1) easy phonation: $B_3$, 2) high tone without passaggio: $F\#_4$, 3) high tone with passaggio: $F\#_4$. The results revealed that the F0 of the target tones was not significantly different between the non-passaggio and passaggio conditions, although the higher the F0, the higher the subglottal pressure. CQ, MFR, and Psub were also higher with passaggio than without, but these differences were not statistically significant. This study concluded that passaggio is the vocal technique that produces the same tone quality across the chest and head registers in tenors.

Matching Pursuit Sinusoidal Modeling with Damping Factor (Damping 요소를 첨가한 매칭 퍼슈잇 정현파 모델링)

  • Jeong, Gyu-Hyeok;Kim, Jong-Hark;Lim, Joung-Woo;Joo, Gi-Ho;Lee, In-Sung
    • Journal of the Institute of Electronics Engineers of Korea SP, v.44 no.1, pp.105-113, 2007
  • In this paper, we propose matching pursuit with damping factors, a new sinusoidal model that improves on matching pursuit, for codecs based on the sinusoidal model. The proposed model defines damping factors using the correlation of parameters between the current and adjacent frames, estimates the sinusoidal parameters in the analysis frame more accurately by applying matching pursuit with the damping factors, and synthesizes the final signal. This makes efficient modeling possible without interpolation schemes. The proposed sinusoidal model achieves better speech quality than the conventional sinusoidal model with interpolation methods, without additional delay. We compare the performance of our model with that of matching pursuit using interpolation methods in terms of SNR (signal-to-noise ratio), MOS (Mean Opinion Score), LR (Itakura-Saito likelihood ratio), and CD (cepstral distance).
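
A minimal sketch of greedy matching pursuit over damped sinusoids, together with the SNR measure named in the abstract, might look like the following. It rests on stated assumptions: the frequency and damping grids are fixed by hand here, whereas the paper derives damping factors from the correlation of parameters between adjacent frames, which is not reproduced; the function names are hypothetical.

```python
import math

def best_damped_atom(r, omegas, alphas):
    """Find the damped sinusoid exp(-a*n) * (A*cos(w*n) + B*sin(w*n)) capturing most energy of r."""
    best = None
    for w in omegas:
        for a in alphas:
            c = [math.exp(-a * n) * math.cos(w * n) for n in range(len(r))]
            s = [math.exp(-a * n) * math.sin(w * n) for n in range(len(r))]
            cc = sum(v * v for v in c)
            ss = sum(v * v for v in s)
            cs = sum(p * q for p, q in zip(c, s))
            det = cc * ss - cs * cs
            if det < 1e-9:          # degenerate pair (e.g. w = 0), skip it
                continue
            rc = sum(p * q for p, q in zip(r, c))
            rs = sum(p * q for p, q in zip(r, s))
            # Least-squares amplitudes from the 2x2 normal equations.
            A = (rc * ss - rs * cs) / det
            B = (rs * cc - rc * cs) / det
            captured = A * rc + B * rs   # energy of the projection of r
            if best is None or captured > best[0]:
                best = (captured, w, a, A, B)
    return best

def matching_pursuit(x, omegas, alphas, n_atoms):
    """Greedy loop: pick the best atom, subtract it from the residual, repeat."""
    residual = list(x)
    atoms = []
    for _ in range(n_atoms):
        found = best_damped_atom(residual, omegas, alphas)
        if found is None:
            break
        _, w, a, A, B = found
        atoms.append((w, a, A, B))
        residual = [v - math.exp(-a * n) * (A * math.cos(w * n) + B * math.sin(w * n))
                    for n, v in enumerate(residual)]
    return atoms, residual

def snr_db(x, residual):
    """Signal-to-noise ratio in dB, treating the residual as the modeling error."""
    noise = sum(v * v for v in residual)
    signal = sum(v * v for v in x)
    return math.inf if noise == 0 else 10 * math.log10(signal / noise)

# Illustrative grids (assumed values) and a frame containing one damped sinusoid.
omegas = [0.2 * k for k in range(1, 16)]
alphas = [0.0, 0.01, 0.02, 0.05]
x = [1.5 * math.exp(-alphas[2] * n) * math.cos(omegas[2] * n + 0.7) for n in range(160)]
atoms, residual = matching_pursuit(x, omegas, alphas, n_atoms=1)
print(atoms[0][:2], snr_db(x, residual))
```

With `alphas = [0.0]` the same loop reduces to matching pursuit over undamped sinusoids, which gives a simple baseline for comparing SNR with and without damping.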

Effects of Respiration and Oral Motor Training based on Musical Elements and Singing on Voice of Healthy Elderly (음악요소와 노래 부르기를 활용한 호흡 및 구강훈련이 정상노인의 음성에 미치는 영향)

  • Jun, Hee-Un;Kim, Soo-Ji
    • The Journal of the Korea Contents Association, v.11 no.10, pp.380-387, 2011
  • This study investigated the effects of music-combined respiration and oral motor training on the voice of healthy elderly people. Twenty-seven women attending a senior center in Seoul participated and were randomly assigned to the experimental group (n = 16) or the control group (n = 11). Subjects attended a music program (25 minutes per session) once a week for 4 weeks. For both groups, Fundamental Frequency (F0), Maximum Phonation Time (MPT), and Sequential Motion Rates (SMR) were measured using the Praat speech analysis program before and after the training. The results showed statistically significant changes in intensity, F0, MPT, and SMR in the experimental group, while only the change in intensity was statistically significant in the control group. Given increasing life expectancy and the growing number of older adults, their quality of life has become important. This study therefore suggests that respiration and oral motor training could be effectively incorporated into training and services for this population.