• Title/Summary/Keyword: Voice problem

Signal Enhancement of a Variable Rate Vocoder with a Hybrid domain SNR Estimator

  • Park, Hyung Woo
    • KSII Transactions on Internet and Information Systems (TIIS) / v.13 no.2 / pp.962-977 / 2019
  • The human voice is a convenient means of transferring information between people, between people and machines, and between machines. With the development of information and communication technology, the voice can now be carried farther than before. To communicate, the voice is converted to another form, transmitted, and then converted back to sound. In this process, a vocoder is the method for converting and reconverting voice and sound. The CELP (Code-Excited Linear Prediction) type vocoder, one of the voice codecs, has been adopted as a standard codec because it provides high-quality sound even at a relatively low transmission rate. EVRC (Enhanced Variable Rate CODEC) and QCELP (Qualcomm Code-Excited Linear Prediction), both variable bit rate vocoders, are used for mobile phones in 3G environments. In real-time vocoder implementations, reduced sound quality is a typical problem. To improve sound quality, it is important to know the size and shape of the noise. In existing sound quality improvement methods, either voice activity is detected and used, or statistical methods are applied to a large amount of data. However, these methods have the disadvantage that noise cannot be detected when the signal is continuous or when the noise changes greatly. This paper focuses on finding a better way to limit the loss of sound quality in low bit rate transmission environments. Based on simulation results, this study proposes a preprocessor application that estimates the SNR (Signal to Noise Ratio) using the spectral SNR estimation method. The SNR estimation method adopted the IMBE (Improved Multi-Band Excitation) instead of using the SNR, which is a continuous speech signal. Finally, this application improves the quality of the vocoder by enhancing sound quality adaptively.
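
The abstract above mentions spectral SNR estimation without giving details. As a rough illustration only, the following Python sketch estimates a per-frame SNR from short-time power spectra. It is not the paper's IMBE-based estimator: the function name `spectral_snr_db`, the frame sizes, and the assumption that the first few frames contain noise only are all hypothetical choices made for this example.

```python
import numpy as np

def spectral_snr_db(signal, noise_frames=5, frame_len=256, hop=128):
    """Estimate a per-frame SNR in dB from short-time power spectra.

    Assumption (not from the paper): the first `noise_frames` frames
    contain noise only and are used to estimate the noise spectrum.
    """
    signal = np.asarray(signal, dtype=float)
    window = np.hanning(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop
    frames = np.stack([signal[i * hop:i * hop + frame_len] * window
                       for i in range(n_frames)])
    power = np.abs(np.fft.rfft(frames, axis=1)) ** 2  # shape: (frames, bins)
    noise_psd = power[:noise_frames].mean(axis=0)     # average noise spectrum
    # Subtract the noise estimate per bin and floor at zero to get speech power.
    speech_power = np.maximum(power - noise_psd, 0.0).sum(axis=1)
    noise_power = noise_psd.sum()
    eps = 1e-12                                       # avoids log of zero
    return 10.0 * np.log10((speech_power + eps) / (noise_power + eps))

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    t = np.arange(8000) / 8000.0
    tone = np.sin(2 * np.pi * 440 * t)
    tone[:2000] = 0.0                                 # noise-only lead-in
    noisy = tone + 0.1 * rng.standard_normal(t.size)
    print(spectral_snr_db(noisy)[-5:])
```

A fixed noise estimate like this one would fail in exactly the cases the abstract names (continuous signals, rapidly changing noise), which is the motivation the abstract gives for a different estimator.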

Analysis of key technologies related to VoIP security (VoIP 보안관련 주요기술에 대한 분석)

  • Rha, Sung-Hun;Shin, Hyun-Sik
    • The Journal of the Korea institute of electronic communication sciences / v.5 no.4 / pp.385-390 / 2010
  • Unlike traditional telephone calls, VoIP service provides voice and video calls over an internetwork. As VoIP usage spreads and the technology develops, security threats are steadily increasing. In light of this situation, we investigate the security problems of VoIP from various aspects. We also investigate the main technologies behind the security solutions that address these problems.

Analysis of Erlang Capacity for Multi-FA CDMA Systems Supporting Voice and Data Services (음성 및 데이터 서비스를 지원하는 다중 반송파 코드 분할 다중 접속방식 시스템의 얼랑 용량 분석)

  • 구인수;양정록;김태엽;김기선
    • Proceedings of the IEEK Conference / 2000.06a / pp.37-40 / 2000
  • As the number of CDMA subscribers increases, CDMA systems use more than one CDMA carrier in order to accommodate the increasing capacity requirement. In this paper, we present a new analytical method for evaluating the Erlang capacity of CDMA systems with multiple CDMA carriers. With the algorithm proposed in [5], the computational complexity of evaluating the call blocking probability grows in proportion to the sixth power of the number of CDMA carriers used when the CDMA system supports voice and data services. Consequently, it is impractical to calculate the Erlang capacity with the algorithm of [5], especially when more than 3 CDMA carriers are used. To resolve this problem, we propose a new analytical method for evaluating the Erlang capacity. The computational complexity of the proposed method for evaluating the call blocking probability grows only in proportion to the second power of the number of CDMA carriers used when the CDMA system supports voice and data services.
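
For readers unfamiliar with the two quantities the abstract above refers to, here is a minimal Python sketch of call blocking probability and Erlang capacity. It uses the classical single-class Erlang B formula for one pool of channels, which is an assumption made for illustration; it is not the multi-carrier, voice-and-data analysis proposed in the paper or in [5]. The function names and the 2% blocking target are hypothetical.

```python
def erlang_b(offered_load, channels):
    """Blocking probability of the classical Erlang B model.

    Uses the numerically stable recursion
    B(A, n) = A * B(A, n - 1) / (n + A * B(A, n - 1)), with B(A, 0) = 1.
    """
    blocking = 1.0
    for n in range(1, channels + 1):
        blocking = offered_load * blocking / (n + offered_load * blocking)
    return blocking


def erlang_capacity(channels, target_blocking=0.02, tol=1e-6):
    """Largest offered load (in Erlangs) whose blocking stays at or below the target.

    Bisection works because blocking increases monotonically with load.
    """
    low, high = 0.0, 2.0 * channels + 10.0
    while high - low > tol:
        mid = (low + high) / 2.0
        if erlang_b(mid, channels) > target_blocking:
            high = mid
        else:
            low = mid
    return low


if __name__ == "__main__":
    for channels in (10, 20, 30):
        print(channels, round(erlang_capacity(channels), 3))
```

In this simplified model, one shared pool of 20 channels carries more than twice the load of a pool of 10 at the same blocking target, which gives some intuition for why capacity has to be evaluated jointly across carriers.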

Applying the Bi-level HMM for Robust Voice-activity Detection

  • Hwang, Yongwon;Jeong, Mun-Ho;Oh, Sang-Rok;Kim, Il-Hwan
    • Journal of Electrical Engineering and Technology / v.12 no.1 / pp.373-377 / 2017
  • This paper presents a voice-activity detection (VAD) method for sound sequences with various SNRs. For real-time VAD applications, it is inadequate to rely on post-processing to remove burst clippings from the VAD output decision. To tackle this problem, building on the bi-level hidden Markov model, in which a state layer is inserted into a typical hidden Markov model (HMM), we formulated a robust VAD method that does not require any additional post-processing. In the method, a forward-inference-ratio test was devised to detect the speech endpoints, and Mel-frequency cepstral coefficients (MFCC) were used as the features. Our experimental results show that the proposed approach outperforms the conventional methods across different SNRs.
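
To illustrate the general idea of forward inference for VAD, the Python sketch below runs the standard forward (filtering) recursion on a plain two-state HMM with states silence and speech. This is a simplification assumed for illustration: it is not the bi-level HMM or the forward-inference-ratio test from the paper, and it uses a single scalar feature with Gaussian emissions rather than MFCCs. The function name and all parameter values are hypothetical.

```python
import numpy as np

def forward_vad(features, means=(0.0, 3.0), stds=(1.0, 1.0),
                stay=0.95, threshold=0.5):
    """Frame-wise speech decisions from a two-state HMM forward recursion.

    State 0 is silence and state 1 is speech. Each decision uses only
    past and current frames, so no post-processing pass is needed.
    """
    means = np.asarray(means, dtype=float)
    stds = np.asarray(stds, dtype=float)
    # trans[i, j] = probability of moving from state i to state j.
    trans = np.array([[stay, 1.0 - stay],
                      [1.0 - stay, stay]])
    belief = np.array([0.5, 0.5])          # uniform initial state belief
    decisions = []
    for x in features:
        # Gaussian emission likelihood of this frame under each state.
        lik = np.exp(-0.5 * ((x - means) / stds) ** 2) / stds + 1e-300
        belief = lik * (trans.T @ belief)  # predict, then update
        belief /= belief.sum()             # normalize to a posterior
        decisions.append(belief[1] > threshold)
    return np.array(decisions)

if __name__ == "__main__":
    feats = np.concatenate([np.zeros(20), 3.0 * np.ones(20), np.zeros(20)])
    print(forward_vad(feats).astype(int))
```

The high self-transition probability (`stay`) is what smooths the output: a single outlier frame has to overcome the prior before the decision flips.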

Adaptive Post Processing of Nonlinear Amplified Sound Signal

  • Lee, Jae-Kyu;Choi, Jong-Suk;Seok, Cheong-Gyu;Kim, Mun-Sang
    • Institute of Control, Robotics and Systems: Conference Proceedings (제어로봇시스템학회:학술대회논문집) / 2005.06a / pp.872-876 / 2005
  • We propose real-time post-processing of a nonlinearly amplified signal to improve voice recognition in remote talk. In previous research, we found that nonlinear amplification has a unique advantage for both voice activity detection and sound localization in remote talk. However, the original signal becomes distorted by the nonlinear amplification and, as a result, the rest of the processing sequence, such as speech recognition, shows less satisfactory results. To remedy this problem, we implement a linearization algorithm that recovers the voice signal's linear characteristics after the localization has been done.

Voice Dialing system using Stochastic Matching (확률적 매칭을 사용한 음성 다이얼링 시스템)

  • 김원구
    • Proceedings of the Korean Institute of Intelligent Systems Conference / 2004.04a / pp.515-518 / 2004
  • This paper presents a method that improves the performance of a personal voice dialing system in which speaker-independent phoneme HMMs are used. Since the speaker-independent phoneme HMM based voice dialing system uses only the phone transcription of the input sentence, the storage space can be reduced greatly. However, the performance of the system is worse than that of a system that uses speaker-dependent models, because of the phone recognition errors generated when speaker-independent models are used. To solve this problem, a new method is presented that jointly estimates transformation vectors for speaker adaptation and transcriptions from training utterances. The biases and transcriptions are estimated iteratively from the training data of each user with a maximum likelihood approach to stochastic matching, using speaker-independent phone models. Experimental results show that the proposed method is superior to the conventional method, which uses transcriptions only.

Comparison of Maximum Phonation Time Associated with the Changes in Vocal Intensity in Patients with Unilateral Vocal Fold Palsy and Sulcus Vocalis (성대마비와 성대구증의 강도 변화에 따른 최대발성지속시간 비교)

  • Choi, Se-Jin;Choi, Hong-Shik;Kim, Jae-Ock;Choi, Yae-Lin
    • Phonetics and Speech Sciences / v.4 no.1 / pp.125-131 / 2012
  • Patients with incomplete glottic closure characteristically have a decreased maximum phonation time (MPT) because their airflow rate or air leakage is greater than that of people without voice disorders. They can also have problems regulating intensity. This study analyzed the difference in MPT between comfortable and louder intensity, and the correlation between MPT and respiration volume, in unilateral vocal fold palsy (UVFP) and sulcus vocalis (SV) groups. Twenty subjects with UVFP, 21 with SV, and 21 normal subjects performed an /a/ vowel prolongation task at comfortable and louder intensity to measure MPT, and FVC, $FEV_1$, and $FEV_1/FVC$ were measured to analyze the correlation between MPT and respiration volume. First, in the comparison of MPT by intensity between groups, the MPT of the normal group was significantly longer than that of the patient groups at comfortable intensity, but there was no statistically significant difference in MPT between groups at louder intensity. Second, in the correlation analysis, there was a statistically significant correlation between MPT at comfortable intensity and MPT at louder intensity, but no statistically significant correlation between intensity and respiration volume. This study supports previous findings that the shorter MPT of the patient group compared to the normal group originates in a problem of the laryngeal valving mechanism at the level of the vocal folds rather than in a problem of respiratory function. In addition, for phonation at varied intensity, the results suggest that in the patient group MPT improved because the glottal closure ratio increased at louder intensity. These results provide a theoretical basis for clinicians to vary intensity in voice evaluation and voice therapy for patients with glottal incompetence.

A Study on the Voice/Data Integrated PRMA Protocol With the Minimum Reservation Slot Assured (최소 예약슬롯 보증 음성/데이타 집적 PRMA 프로토콜에 관한 연구)

  • 김태규;조동호;윤용중
    • The Journal of Korean Institute of Communications and Information Sciences / v.18 no.2 / pp.250-260 / 1993
  • The Packet Reservation Multiple Access (PRMA) protocol provides a very efficient mechanism for a large number of voice and data terminals with bursty traffic characteristics to share a common transmission channel. This protocol, however, cannot operate under high load conditions: an instability problem occurs because the reservation channel is allowed to shrink to zero. In this paper, we propose a more stable PRMA protocol that avoids such problems and integrates voice and data traffic efficiently. The performance of the proposed protocol is analyzed by computer simulation. The simulation results show that the proposed protocol provides a more efficient mechanism for voice/data integration and ensures more stable operation than the conventional PRMA protocol under high load conditions.

Differences in GRBAS scales and shimmer according to vocal sample types in people with vocal disorders (음성장애와 샘플유형에 따른 GRBAS 측정치 및 shimmer 비교)

  • Shin, Yu-Jeong;Hong, Ki-Hwan;Sim, Hyun-Sub
    • Phonetics and Speech Sciences / v.3 no.3 / pp.149-155 / 2011
  • The purpose of the present study was to identify the differences in GRBAS scales between vocal sample types (sustained vowels and connected speech) for specific laryngeal conditions (vocal nodules, vocal polyps, and vocal paralysis) and the relation between the GRBAS scale and the Shimmer value in each vocal sample type. In this study, a total of 60 voice samples from 30 patients (10 with vocal nodules, 10 with vocal polyps, 10 with vocal paralysis) were examined, and MDVP (Multi-dimensional Voice Program) was used to analyze the Shimmer value. Three listeners rated the two types of samples, presented in random order, on the GRBAS scale. Three-way ANOVA, one-way ANOVA, and paired t-tests were used. The results of this study were as follows. 1) GRBAS scales varied with vocal sample type. Listeners tended to rate voices as better quality when they listened to connected speech rather than sustained vowels. 2) The G score of GRBAS and Shimmer were positively correlated, with statistical significance. These results show that 1) vocal specialists should consider the sample type when evaluating the severity of a voice problem and 2) the G score could be a simple and clear method.

A New Noise Reduction Method Based on Linear Prediction

  • Kawamura, Arata;Fujii, Kensaku;Itho, Yoshio;Fukui, Yutaka
    • Proceedings of the IEEK Conference / 2000.07a / pp.260-263 / 2000
  • A technique is proposed that uses linear prediction to reduce noise in a voice signal mixed with ambient noise (signal-to-noise (S-N) ratio of about 0 dB). This noise reduction method, which is based on linear prediction, estimates the voice spectrum while ignoring the spectrum of the noise. The performance of the noise reduction method is first examined using the transversal linear predictor filter. However, with this method the tone quality of the predicted voice deteriorates because of the low S-N ratio. An additional processing circuit is then proposed to adjust the noise reduction circuit, with the aim of reducing the tone deterioration. Next, we consider a practical application where the effects of round-off errors arising from fixed-point computation have to be minimized. This minimization is achieved by using the lattice predictor filter, which, in comparison to the transversal type, is known to be less sensitive to the round-off error associated with finite word length operations. Finally, we consider a practical application where noise reduction is necessary. In this noise reduction method, both the voice spectrum and the actual noise spectrum are estimated. Noise reduction is achieved by using the linear predictor filter, which includes control of the update of the predictor filter coefficients.
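
The idea of using a transversal linear predictor for noise reduction can be sketched in Python as follows. The predictor output follows the predictable (correlated) part of the input, while white noise cannot be predicted from past samples and is largely rejected. The abstract does not name the adaptation rule, so normalized LMS (NLMS) is an assumption here, as are the function name, filter order, and step size. The additional processing circuit, the lattice structure, and the coefficient-update control described above are not modelled.

```python
import numpy as np

def nlms_line_enhancer(noisy, order=16, mu=0.05, eps=1e-8):
    """One-step transversal linear predictor adapted with NLMS.

    Returns the prediction of each sample from the `order` samples
    before it. NLMS is an assumed adaptation rule, not taken from
    the paper.
    """
    noisy = np.asarray(noisy, dtype=float)
    weights = np.zeros(order)
    enhanced = np.zeros_like(noisy)
    for n in range(order, len(noisy)):
        past = noisy[n - order:n][::-1]   # most recent sample first
        prediction = weights @ past
        error = noisy[n] - prediction     # prediction error drives adaptation
        weights += mu * error * past / (past @ past + eps)
        enhanced[n] = prediction
    return enhanced

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    n = np.arange(8000)
    clean = np.sin(2 * np.pi * 0.05 * n)  # stand-in for a voiced sound
    noisy = clean + np.sqrt(0.5) * rng.standard_normal(n.size)  # about 0 dB
    enhanced = nlms_line_enhancer(noisy)
    print(np.mean((noisy[-2000:] - clean[-2000:]) ** 2),
          np.mean((enhanced[-2000:] - clean[-2000:]) ** 2))
```

A pure sinusoid is far easier to predict than real speech, so this demo overstates what such a predictor achieves on voice, which is consistent with the tone deterioration the abstract reports at low S-N ratios.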
