• Title/Summary/Keyword: speech quality

The Performance Improvement of G.729 PLC in Situation of Consecutive Frame Loss (연속적인 프레임 손실 상황에서의 G.729 PLC 성능개선)

  • Hong, Seong-Hoon;Kim, Jin-Woo;Bae, Myung-Jin
    • The Journal of the Acoustical Society of Korea / v.29 no.1 / pp.34-40 / 2010
  • As the internet has spread widely, various internet-based services have become available. One of them is the internet phone, whose usage is increasing because of its cost advantage. However, its speech quality is lower because it uses packet switching, whereas the existing telephone network uses circuit switching. Although vocoders use PLC (Packet Loss Concealment) algorithms, these are weak against consecutive packet loss. In this paper, we propose methods to reduce the degradation of speech quality under consecutive packet loss, using the PLC algorithms of the advanced G.729 and G.711. The proposed methods are LP (Linear Prediction) parameter interpolation, excitation signal reconstruction, and excitation signal gain reconstruction. As a result, the proposed method improves performance by about 11%.

Improvement of Packet Loss Concealment Algorithm by Utilizing Next Good Frame Info. (손실이후 프레임 정보에 의한 패킷손실은닉 알고리즘 개선)

  • Kim Jae-Hyun;Hahn Min-Soo
    • MALSORI / no.43 / pp.101-112 / 2002
  • In real-time packetized voice applications, missing packets are a major source of voice quality degradation. Packet loss concealment (PLC) algorithms are therefore needed to guarantee the QoS of VoIP. In this paper, we describe a packet loss concealment scheme that utilizes the next good frame following the lost packets. When this scheme is combined with other PLC algorithms, such as the G.711 pitch waveform replication recommended by ITU-T or an LP-based PLC algorithm, additional voice quality improvement is obtained for consecutive packet losses longer than 60 ms.
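
One simple way to make use of the next good frame can be sketched as follows. The abstract does not say how the scheme uses that frame, so this is only an illustration under a stated assumption: each lost frame's parameter vector (for example, spectral parameters) is linearly interpolated between the last good frame before the loss and the next good frame after it. The function name and the vector layout are hypothetical.

```python
def interpolate_lost_frames(prev_good, next_good, n_lost):
    """Linearly interpolate a parameter vector across n_lost missing frames.

    Assumption (not taken from the paper): the concealed parameters move in
    equal steps from the last good frame to the next good frame.
    """
    if n_lost < 1:
        raise ValueError("n_lost must be at least 1")
    if len(prev_good) != len(next_good):
        raise ValueError("parameter vectors must have the same length")

    frames = []
    for k in range(1, n_lost + 1):
        # Weight of the next good frame grows from 1/(n+1) to n/(n+1).
        w = k / (n_lost + 1)
        frames.append([(1 - w) * p + w * n for p, n in zip(prev_good, next_good)])
    return frames


if __name__ == "__main__":
    # Three consecutive lost frames between two hypothetical parameter vectors.
    for frame in interpolate_lost_frames([0.0, 1.0], [4.0, 5.0], 3):
        print(frame)
```

A scheme of this kind needs the receiver to hold back playback until the next good frame arrives, so it trades some extra delay for smoother concealment.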

ON IMPROVING THE QUALITY OF RELP VOCODER

  • Oh, S.K.
    • Proceedings of the Acoustical Society of Korea Conference / 1985.10a / pp.79-86 / 1985
  • Residual-excited linear prediction (RELP) vocoding is known to be one of the best approaches to speech coding in the range of 4.8 to 9.6 kbits/s. One problem with the RELP vocoder is that it often produces some roughness and tonal noise as the transmission rate becomes lower. In this paper, we investigate three methods to improve its quality: the multiband spectral folding method, the method of using both the spectrally folded signal and the pulsed excitation signal, and the method of using both the multiband spectrally folded signal and the pulsed excitation signal. Of the three, the last yields the best performance, producing no roughness and little tonal noise.
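
The basic idea behind spectral folding can be sketched as below. This is plain single-band folding (zero insertion between baseband samples, which creates images of the baseband spectrum in the upper band), not the multiband variant investigated in the paper, whose details the abstract does not give. The function names are hypothetical.

```python
import cmath


def spectral_fold(baseband, factor):
    """Upsample by inserting factor-1 zeros after every baseband sample."""
    if factor < 1:
        raise ValueError("factor must be at least 1")
    out = []
    for sample in baseband:
        out.append(sample)
        out.extend([0.0] * (factor - 1))
    return out


def dft_magnitudes(x):
    """Magnitude of the DFT, computed directly (fine for short signals)."""
    n_points = len(x)
    return [
        abs(sum(x[n] * cmath.exp(-2j * cmath.pi * k * n / n_points)
                for n in range(n_points)))
        for k in range(n_points)
    ]


if __name__ == "__main__":
    residual = [1.0, 2.0, 0.5, -1.0]      # toy baseband residual
    folded = spectral_fold(residual, 2)   # twice the sampling rate
    mags = dft_magnitudes(folded)
    # The second half of the spectrum repeats the first half.
    print([round(m, 6) for m in mags])
```

Because the upper band is an exact image of the baseband, the regenerated excitation keeps a regular harmonic-like structure, which is one commonly cited source of the tonal noise the abstract mentions.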

A Study on the Acoustic Characteristics of Sexy Voice (섹시한 음성의 음향학적 특징 연구)

  • Jeong Ok-Ran;Jo Sung-Mi
    • MALSORI / no.57 / pp.73-84 / 2006
  • The purpose of this study was to explore the acoustic characteristics of sexy voice. We measured acoustic parameters (fundamental frequency, jitter, shimmer, and nasalance) of a sustained vowel produced by 40 actors (20 males and 20 females) and 40 non-actors (20 males and 20 females). Digital audio recordings of the sustained vowel /a/ were made for acoustic analysis using Praat (version 4.1.9) and Nasal View (version 4.5). Twenty voice pathologists participated in the listening experiment and judged the degree of sexiness on a 7-point scale. The results showed that fundamental frequency, shimmer, and nasalance differed significantly between actors and non-actors. The acoustic parameters of sexy voice matched the perceptual aspects reported in a previous study: low fundamental frequency corresponded to low pitch, and high shimmer to a husky voice. The nasalance score, on the other hand, did not match that of the previous study: decreased nasalance received a higher score on the sexiness scale judged by the listeners. Future work should study voice quality by analyzing and controlling more acoustic and auditory parameters for practical applications.

An Experimental Phonetic Analysis of Generational Differences in Standard Korean Simple Vowels (표준어 단순 모음의 세대간 차이에 대한 실험음성학적 분석 연구)

  • Jeong Il-Jin
    • MALSORI / no.33_34 / pp.111-125 / 1997
  • This experimental phonetic analysis describes the simple vowels of standard Korean in order to present the change in vowel quality from generation to generation, especially between speakers in their 50s and speakers in their 20s. This change shows that the contemporary vowel system has both stable and unstable aspects: the former can be confirmed in the vowels at extreme positions in the vowel quadrilateral, and the latter in some vowels (e.g., 'ㅔ/ㅐ') that have non-quantal vowel characteristics in the current vowel system. Formant values are measured to show these aspects, and the results of the acoustic analysis are presented graphically in the vowel quadrilateral for convenience. The comparison between the articulatory vowel quadrilateral and the acoustic one reveals a great deal about the current vowel quality change.

Packet Loss Recovery Using the AMR-WB Coder with FEC (FEC 기능을 추가한 AMR-WB 음성 부호화기를 이용한 패킷 손실 복구)

  • Park, In-Su;Hwang, Jeong-Joon;Lee, In-Sung
    • Proceedings of the IEEK Conference / 2006.06a / pp.353-354 / 2006
  • This paper proposes a packet loss recovery method for real-time communication over the Internet. The major cause of speech quality degradation in IP networks is packet loss. To reduce its effects, Forward Error Correction (FEC), which adds redundant information to voice packets, can be used. We therefore recover a single lost packet using FEC and conceal consecutive errors. The proposed scheme is evaluated with the Gilbert Internet channel model. High audio quality was maintained at packet loss rates of up to 30%.
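
For readers unfamiliar with the Gilbert channel model, a minimal two-state sketch is given below. It assumes the simplified form in which every packet sent in the bad state is lost and none is lost in the good state; the abstract does not state the parameters used in the paper, so the transition probabilities and function names here are purely illustrative.

```python
import random


def gilbert_loss_pattern(n_packets, p_gb, p_bg, seed=0):
    """Simulate packet loss with a two-state (good/bad) Markov chain.

    p_gb: probability of moving from the good state to the bad state.
    p_bg: probability of moving from the bad state back to the good state.
    Returns a list of booleans, True meaning the packet was lost.
    """
    rng = random.Random(seed)
    lost = []
    bad = False  # start in the good state
    for _ in range(n_packets):
        if bad:
            bad = rng.random() >= p_bg   # stay bad with probability 1 - p_bg
        else:
            bad = rng.random() < p_gb    # enter bad with probability p_gb
        lost.append(bad)
    return lost


def stationary_loss_rate(p_gb, p_bg):
    """Long-run fraction of lost packets for the chain above."""
    return p_gb / (p_gb + p_bg)


if __name__ == "__main__":
    # Illustrative parameters giving a long-run loss rate of about 30%.
    pattern = gilbert_loss_pattern(10000, p_gb=0.15, p_bg=0.35)
    print("target rate:", stationary_loss_rate(0.15, 0.35))
    print("simulated rate:", sum(pattern) / len(pattern))
```

Unlike independent random loss, this model produces bursts: the mean burst length is 1 / p_bg packets, which is why it is useful for testing a scheme that recovers single losses with FEC and conceals longer runs.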

Internet Information Service using Telephony and Fax, ITC-CSCC’2000

  • Jang, Young-Gun;Cho, Kyoung-Hwan
    • Proceedings of the IEEK Conference / 2000.07b / pp.691-694 / 2000
  • This paper addresses the implementation of an Internet telephony based service. It describes an implementation method that uses ARS as a gateway combining the Internet and the traditional public switched telephone network to provide an Internet information service over telephony and fax. This differs from traditional Internet telephony, which provides enhanced speech quality and low-cost functionality. The method allows telephone and/or fax users to get Internet information without an additional Internet bill, without Internet infrastructure, and without the low connection quality caused by a low-bandwidth connection. The implemented system is useful for special kinds of information service, such as climate information for Korea, and is simpler than a WAP based service for wireless mobile telephone users. To demonstrate the system's capability, we implement a job opportunity information and advertisement service supported by the home page of the Choong Book Small & Medium Business Administration, and an e-mail service supported by the Korean Society for Rehabilitation of Persons with Disabilities. In the test implementation, the service works well for blind persons and unemployed graduates, and it is expected to be applicable to special Internet information providers via voice and fax.

Improved MELP Coder Using Fourier Post Processing Compensation Method (퓨리에 후처리 보상 기법을 이용한 향상된 MELP 음성부호화기)

  • Ko Bong-Ok;Kim Chong-Kyo
    • Proceedings of the Acoustical Society of Korea Conference / spring / pp.195-198 / 2002
  • This paper presents an improved version of the MELP coder, which was chosen as the new 2.4 kbit/s U.S. Federal Standard, using a Fourier magnitude compensation method. Although MELP is quite good, it has some distortion for low-pitch male speakers. The improved MELP coder includes post-processing for the Fourier magnitude model that allows MELP to reconstruct the lower frequency spectrum more accurately and improve the speech quality. In this new compensation algorithm, the harmonic magnitudes in the low frequencies are adaptively modified by removing the effect of the two filters. The bit rate of the improved MELP coder is the same as that of the Federal Standard MELP coder. Formal quality tests show that the improved MELP coder was preferred over the Federal Standard MELP coder by 80.8%.

A comparison of acoustic measures among the microphone types for smartphone recordings in normal adults (정상 성인에서 스마트폰 녹음을 위한 마이크 유형 간 음향학적 측정치 비교)

  • Jeong In Park;Seung Jin Lee
    • Phonetics and Speech Sciences / v.16 no.2 / pp.49-58 / 2024
  • This study aimed to compare the acoustic measurements of speech samples recorded from individuals with normal voices using various devices: the Computerized Speech Lab (CSL), a unidirectional wired pin-microphone (WIRED) suitable for smartphones, the built-in omnidirectional microphone (SMART) of smartphones, and Bluetooth-connected wireless earphones, specifically the Galaxy Buds2 Pro (WIRELESS). This study included 40 normal adults (12 males and 28 females) who had not visited an otolaryngologist for respiratory diseases within the past three months. Participants performed sustained vowel /a/ phonation for four seconds and reading tasks with sentences ("Walk") and paragraphs ("Autumn") in a sound-treated booth. Recordings were simultaneously conducted using the four different devices and synchronized based on the CSL-recorded samples for analysis using the MDVP, ADSV, and VOXplot programs. Compared with CSL, the Cepstral Spectral Index of Dysphonia (CSIDV, CSIDS) and Acoustic Voice Quality Index (AVQI) values were lower in the WIRED and higher in the SMART. The opposite trend was observed for the L/H spectral ratios (SRV and SRS), and the WIRELESS demonstrated task-specific discrepancies. Furthermore, both the fundamental frequency (F0) and the cepstral peak prominence of the vowel samples (CPPV) had intraclass correlation coefficient (ICC) values above 0.9, indicating high reliability. F0 and CPPV were therefore considered highly reliable for voice recordings across different microphone types. However, caution should be exercised when analyzing and interpreting variables such as the SR, CSID, and AVQI, which may be influenced by the type of microphone used.

Comparisons of voice quality parameter values measured with MDVP, Praat, and TF32 (MDVP, Praat, TF32에 따른 음향학적 측정치에 대한 비교)

  • Ko, Hye-Ju;Woo, Mee-Ryung;Choi, Yaelin
    • Phonetics and Speech Sciences / v.12 no.3 / pp.73-83 / 2020
  • Measured values may differ between Multi-Dimensional Voice Program (MDVP), Praat, and Time-Frequency Analysis software (TF32), all of which are widely used in voice quality analysis, due to differences in the algorithms used in each analyzer. Therefore, this study aimed to compare the values of parameters of normal voice measured with each analyzer. After tokens of the vowel sound /a/ were collected from 35 normal adult subjects (19 male and 16 female), they were analyzed with MDVP, Praat, and TF32. The mean values obtained from Praat for jitter variables (J local, J abs, J rap, and J ppq), shimmer variables (S local, S dB, and S apq), and noise-to-harmonics ratio (NHR) were significantly lower than those from MDVP in both males and females (p<.01). The mean values of J local, J abs, and S local were significantly lower in the order MDVP, Praat, and TF32 in both genders. In conclusion, the measured values differed across voice analyzers due to the differences in the algorithms each analyzer uses. Therefore, it is important for clinicians to analyze pathologic voice after understanding the normal criteria used by each analyzer when they use a voice analyzer in clinical practice.
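
To make the jitter and shimmer variables concrete, the sketch below computes the "local" perturbation measure as it is commonly defined (and as documented in the Praat manual): the mean absolute difference between consecutive cycle values divided by the mean value, in percent. Applied to cycle periods it gives local jitter; applied to cycle peak amplitudes it gives local shimmer. The input lists are assumed to come from a separate period-detection step that is not shown, and the function name is hypothetical.

```python
def local_perturbation(values):
    """Mean absolute consecutive difference over the mean value, in percent.

    Pass glottal cycle periods for local jitter, or per-cycle peak
    amplitudes for local shimmer.
    """
    if len(values) < 2:
        raise ValueError("need at least two cycles")
    diffs = [abs(b - a) for a, b in zip(values, values[1:])]
    mean_abs_diff = sum(diffs) / len(diffs)
    mean_value = sum(values) / len(values)
    return 100.0 * mean_abs_diff / mean_value


if __name__ == "__main__":
    periods_ms = [10.0, 10.1, 9.9, 10.0, 10.2]    # toy cycle periods
    amplitudes = [0.80, 0.82, 0.79, 0.81, 0.80]   # toy peak amplitudes
    print("local jitter  (%):", local_perturbation(periods_ms))
    print("local shimmer (%):", local_perturbation(amplitudes))
```

The formula itself is simple, so differences between analyzers of the kind the abstract reports would mostly arise earlier in the pipeline, in how each program locates the individual cycles.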