• Title/Summary/Keyword: Voice Problem


Signal Enhancement of a Variable Rate Vocoder with a Hybrid domain SNR Estimator

  • Park, Hyung Woo
    • KSII Transactions on Internet and Information Systems (TIIS) / Vol. 13, No. 2 / pp.962-977 / 2019
  • The human voice is a convenient means of transferring information between people, between people and machines, and between machines. With the development of information and communication technology, the voice can now be carried farther than before. To communicate over a distance, the voice is converted to another form, transmitted, and then converted back to sound. In this process, a vocoder is the method used to convert and reconvert voice and sound. The CELP (Code-Excited Linear Prediction) type vocoder, one of the voice codecs, has been adopted as a standard codec because it provides high-quality sound even though its transmission rate is relatively low. The EVRC (Enhanced Variable Rate CODEC) and QCELP (Qualcomm Code-Excited Linear Prediction), both variable bit rate vocoders, are used for mobile phones in 3G environments. In real-time implementations of a vocoder, degraded sound quality is a typical problem. To improve sound quality, it is important to know the size and shape of the noise. Existing sound quality improvement methods either rely on voice activity detection or apply statistical methods to a large amount of data. Their disadvantage is that noise cannot be detected when the signal is continuous or when the noise changes greatly. This paper focuses on finding a better way to limit the loss of sound quality in low bit rate transmission environments. Based on simulation results, the study proposes a preprocessor that estimates the SNR (Signal to Noise Ratio) using a spectral SNR estimation method. For continuous speech signals, the SNR estimation adopts the IMBE (Improved Multi-Band Excitation) method instead of a direct SNR measurement. Finally, the preprocessor improves the quality of the vocoder by enhancing sound quality adaptively.
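
As a rough illustration of what an SNR estimate is, the sketch below computes the SNR in decibels from signal and noise power, and a per-frame estimate obtained by subtracting a noise power from each frame's power. It is not the hybrid-domain or IMBE-based estimator of the paper; the function names, the fixed frame length, and the assumption that the noise power is already known are all hypothetical.

```python
import math


def power(samples):
    """Mean power (average squared amplitude) of a sequence of samples."""
    return sum(s * s for s in samples) / len(samples)


def snr_db(signal, noise):
    """SNR in decibels: 10 * log10(P_signal / P_noise)."""
    return 10.0 * math.log10(power(signal) / power(noise))


def frame_snr_db(noisy, noise_power, frame_len):
    """Per-frame SNR estimate for a noisy signal.

    Assumes (hypothetically) that noise_power is known in advance.
    The signal power of each frame is approximated as the frame power
    minus the noise power, floored to avoid log10 of zero or a negative.
    """
    estimates = []
    for start in range(0, len(noisy) - frame_len + 1, frame_len):
        frame = noisy[start:start + frame_len]
        signal_power = max(power(frame) - noise_power, 1e-12)
        estimates.append(10.0 * math.log10(signal_power / noise_power))
    return estimates
```

In practice the noise power is the hard part to obtain, which is the problem the abstract describes for continuous or rapidly changing signals.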

Analysis of Key Technologies Related to VoIP Security

  • 나성훈;신현식
    • 한국전자통신학회논문지 / Vol. 5, No. 4 / pp.385-390 / 2010
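  • VoIP (Voice over IP) is a service that, unlike conventional telephony, provides voice and video calls over the Internet. As the use of VoIP has become widespread and matured, the security threats against it have continued to evolve as well. This paper examines the security problems and vulnerabilities of VoIP from several perspectives and reviews the key technologies of the security solutions that address these vulnerabilities.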

Analysis of Erlang Capacity for Multi-FA CDMA Systems Supporting Voice and Data Services

  • 구인수;양정록;김태엽;김기선
    • 대한전자공학회:학술대회논문집 / 대한전자공학회 2000년도 하계종합학술대회 논문집(1) / pp.37-40 / 2000
  • As the number of CDMA subscribers increases, CDMA systems use more than one CDMA carrier to accommodate the growing capacity requirement. In this paper, we present a new analytical method for evaluating the Erlang capacity of CDMA systems with multiple CDMA carriers. With the algorithm proposed in [5], the computational complexity of evaluating the call blocking probability grows in proportion to the sixth power of the number of CDMA carriers in use when the CDMA system supports voice and data services. Consequently, it is impractical to calculate the Erlang capacity with the algorithm of [5], especially when the number of CDMA carriers in use is larger than 3. To resolve this problem, we propose a new analytical method for evaluating the Erlang capacity. With the proposed method, the computational complexity of evaluating the call blocking probability grows only in proportion to the second power of the number of CDMA carriers in use when the CDMA system supports voice and data services.

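
For readers unfamiliar with the terms, the sketch below shows call blocking probability and Erlang capacity in the simplest textbook setting: a single traffic class on a fixed pool of channels, using the classical Erlang B recursion. This is not the multi-carrier, voice-plus-data method of the paper or of its reference [5]; the function names and the bisection search are assumptions made for illustration.

```python
def erlang_b(offered_load, channels):
    """Blocking probability for `offered_load` Erlangs on `channels` servers.

    Uses the numerically stable recursion
    B(0) = 1,  B(n) = a * B(n-1) / (n + a * B(n-1)).
    """
    blocking = 1.0
    for n in range(1, channels + 1):
        blocking = offered_load * blocking / (n + offered_load * blocking)
    return blocking


def erlang_capacity(channels, target_blocking, tol=1e-6):
    """Largest offered load (in Erlangs) whose blocking stays at or below target.

    Assumes channels >= 1 and 0 < target_blocking < 1. Found by bisection,
    relying on blocking increasing monotonically with the offered load.
    """
    low, high = 0.0, float(channels)
    while erlang_b(high, channels) < target_blocking:
        high *= 2.0  # widen the bracket until the target is exceeded
    while high - low > tol:
        mid = (low + high) / 2.0
        if erlang_b(mid, channels) <= target_blocking:
            low = mid
        else:
            high = mid
    return low
```

For example, `erlang_capacity(10, 0.02)` returns the offered load that ten channels can carry at 2% blocking in this single-class model.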

Applying the Bi-level HMM for Robust Voice-activity Detection

  • Hwang, Yongwon;Jeong, Mun-Ho;Oh, Sang-Rok;Kim, Il-Hwan
    • Journal of Electrical Engineering and Technology / Vol. 12, No. 1 / pp.373-377 / 2017
  • This paper presents a voice-activity detection (VAD) method for sound sequences with various SNRs. For real-time VAD applications, it is inadequate to rely on post-processing to remove burst clippings from the VAD output decision. To tackle this problem, we build on the bi-level hidden Markov model, in which a state layer is inserted into a typical hidden Markov model (HMM), and formulate a robust VAD method that requires no additional post-processing. In the method, a forward-inference-ratio test is devised to detect the speech endpoints, and Mel-frequency cepstral coefficients (MFCC) are used as the features. Our experimental results show that, across different SNRs, the proposed approach outperforms the conventional methods.
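
The idea of deciding speech versus non-speech from a ratio of forward probabilities can be sketched with an ordinary two-state HMM. The code below is only that simplified stand-in, not the bi-level HMM of the paper: the function name, the symmetric `stay` transition probability, and the per-frame likelihood pairs (which in the paper's setting would be derived from MFCC features) are all assumptions.

```python
def forward_ratio_vad(likelihoods, stay=0.9, threshold=1.0):
    """Frame-by-frame speech decisions from a two-state HMM forward filter.

    likelihoods: list of (p_obs_given_silence, p_obs_given_speech) per frame;
                 hypothetical inputs, assumed not both zero for any frame.
    stay:        probability of remaining in the current state.
    threshold:   a frame is marked as speech when
                 P(speech | obs so far) / P(silence | obs so far) > threshold.
    """
    alpha_sil, alpha_sp = 0.5, 0.5  # uniform prior over the two states
    decisions = []
    for like_sil, like_sp in likelihoods:
        # Predict step: propagate the previous posterior through the transitions.
        pred_sil = stay * alpha_sil + (1.0 - stay) * alpha_sp
        pred_sp = (1.0 - stay) * alpha_sil + stay * alpha_sp
        # Update step: weight by the observation likelihoods and normalize.
        new_sil = pred_sil * like_sil
        new_sp = pred_sp * like_sp
        total = new_sil + new_sp
        alpha_sil, alpha_sp = new_sil / total, new_sp / total
        decisions.append(alpha_sp > threshold * alpha_sil)
    return decisions
```

Because each decision uses only past and current frames, a filter of this kind runs causally, which is the property that matters for real-time use.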

Adaptive Post Processing of Nonlinear Amplified Sound Signal

  • Lee, Jae-Kyu;Choi, Jong-Suk;Seok, Cheong-Gyu;Kim, Mun-Sang
    • 제어로봇시스템학회:학술대회논문집 / 제어로봇시스템학회 2005년도 ICCAS / pp.872-876 / 2005
  • We propose real-time post-processing of a nonlinearly amplified signal to improve voice recognition in remote talk. In previous research, we found that nonlinear amplification has a unique advantage for both voice activity detection and sound localization in remote talk. However, the original signal is distorted by the nonlinear amplification and, as a result, the subsequent stages, such as speech recognition, show less satisfactory results. To remedy this problem, we implement a linearization algorithm that recovers the voice signal's linear characteristics after the localization has been done.

Voice Dialing System Using Stochastic Matching

  • 김원구
    • 한국지능시스템학회:학술대회논문집 / 한국퍼지및지능시스템학회 2004년도 춘계학술대회 학술발표 논문집 제14권 제1호 / pp.515-518 / 2004
  • This paper presents a method that improves the performance of a personal voice dialing system in which speaker-independent phoneme HMMs are used. Since a voice dialing system based on speaker-independent phoneme HMMs uses only the phone transcription of the input sentence, the required storage space can be reduced greatly. However, its performance is worse than that of a system using speaker-dependent models because of the phone recognition errors generated when the speaker-independent models are used. To solve this problem, a new method is presented that jointly estimates the transformation vectors for speaker adaptation and the transcriptions from training utterances. The biases and transcriptions are estimated iteratively from each user's training data with a maximum likelihood approach to stochastic matching using speaker-independent phone models. Experimental results show that the proposed method is superior to the conventional method, which uses transcriptions only.

Comparison of Maximum Phonation Time Associated with the Changes in Vocal Intensity in Patients with Unilateral Vocal Fold Palsy and Sulcus Vocalis

  • 최세진;최홍식;김재옥;최예린
    • 말소리와 음성과학 / Vol. 4, No. 1 / pp.125-131 / 2012
  • Patients with incomplete glottic closure characteristically show a reduced maximum phonation time (MPT), because their airflow rate or air leakage is greater than that of people without voice disorders. They may also have difficulty regulating intensity. This study analyzed the difference in MPT between comfortable and louder intensity, and the correlation between MPT and respiration volume, in unilateral vocal fold palsy (UVFP) and sulcus vocalis (SV) groups. Twenty patients with UVFP, 21 with SV, and 21 normal subjects performed an /a/ vowel prolongation task at comfortable and louder intensity to measure MPT, and FVC, $FEV_1$, and $FEV_1/FVC$ were measured to analyze the correlation between MPT and respiration volume. First, in the between-group comparison of MPT by intensity, the MPT of the normal group was significantly longer than that of the patient groups at comfortable intensity, but the difference between groups was not statistically significant at louder intensity. Second, in the correlation analysis, MPT at comfortable intensity and MPT at louder intensity were significantly correlated, but there was no statistically significant correlation between intensity and respiration volume. These findings support the conclusion of earlier studies that the shorter MPT of the patient groups, compared with the normal group, originates in a problem of the laryngeal valving mechanism at the level of the vocal folds rather than in a problem of respiratory function. They also suggest that, when phonating at varied intensity, the MPT of the patient groups improved at louder intensity because the glottal closure ratio increased. These results provide a theoretical basis for clinicians to vary the intensity during voice evaluation and voice therapy for patients with glottic incompetence.

A Study on the Voice/Data Integrated PRMA Protocol With the Minimum Reservation Slot Assured

  • 김태규;조동호;윤용중
    • 한국통신학회논문지 / Vol. 18, No. 2 / pp.250-260 / 1993
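  • The PRMA (Packet Reservation Multiple Access) protocol is well known to be highly efficient at integrating voice and data traffic in an environment where an unspecified number of terminals with bursty traffic contend for access to a shared channel. However, under PRMA the capacity of the reservation channel can shrink to zero as the load grows, so the system becomes unstable and cannot operate properly under heavy load. This paper proposes a voice/data integrated PRMA protocol that compensates for this weakness and operates more stably, presents its frame and slot structure, and analyzes the performance of the proposed protocol through computer simulation. The simulation results show that, compared with the conventional scheme, the proposed protocol integrates voice and data more efficiently and operates more stably even under heavy load.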

Differences in GRBAS Scales and Shimmer According to Vocal Sample Types in People with Vocal Disorders

  • 신유정;홍기환;심현섭
    • 말소리와 음성과학 / Vol. 3, No. 3 / pp.149-155 / 2011
  • The purpose of the present study was to identify the differences in GRBAS scales between vocal sample types (sustained vowels and connected speech) for specific laryngeal conditions (vocal nodules, vocal polyps, and vocal paralysis) and the relation between the GRBAS scale and the Shimmer value in each vocal sample type. A total of 60 voice samples from 30 patients (10 with vocal nodules, 10 with vocal polyps, 10 with vocal paralysis) were examined, and MDVP (Multi-dimensional Voice Program) was used to analyze the Shimmer value. Three listeners rated the two types of samples, presented in random order, on the GRBAS scale. Three-way ANOVA, one-way ANOVA, and paired t-tests were used. The outcomes of this study were as follows. 1) GRBAS scales varied with vocal sample type: listeners tended to rate voices as better in quality when they listened to connected speech rather than sustained vowels. 2) The G score of GRBAS and Shimmer were positively correlated with statistical significance. These results show that 1) vocal specialists should consider the sample type when evaluating the severity of a voice problem and 2) the G score could be a simple and clear measure.

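
Shimmer quantifies cycle-to-cycle variation in amplitude. The sketch below uses one common definition, local shimmer: the mean absolute difference between the peak amplitudes of consecutive cycles, divided by the mean peak amplitude, expressed as a percentage. It is only an illustration; the exact procedure MDVP uses may differ, and the function name and the assumption that per-cycle peak amplitudes have already been extracted are hypothetical.

```python
def local_shimmer_percent(peak_amplitudes):
    """Local shimmer (%) from per-cycle peak amplitudes.

    Assumes the glottal cycles were already segmented and that each entry
    is the positive peak amplitude of one cycle (hypothetical input).
    """
    if len(peak_amplitudes) < 2:
        raise ValueError("need at least two cycles to compute shimmer")
    diffs = [abs(b - a) for a, b in zip(peak_amplitudes, peak_amplitudes[1:])]
    mean_diff = sum(diffs) / len(diffs)
    mean_amp = sum(peak_amplitudes) / len(peak_amplitudes)
    return 100.0 * mean_diff / mean_amp
```

A perfectly steady voice gives 0%; larger values indicate more amplitude perturbation from cycle to cycle.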

A New Noise Reduction Method Based on Linear Prediction

  • Kawamura, Arata;Fujii, Kensaku;Itho, Yoshio;Fukui, Yutaka
    • 대한전자공학회:학술대회논문집 / 대한전자공학회 2000년도 ITC-CSCC -1 / pp.260-263 / 2000
  • A technique is proposed that uses linear prediction to reduce noise in a voice signal mixed with ambient noise (signal-to-noise (S-N) ratio of about 0 dB). This noise reduction method, based on linear prediction, estimates the voice spectrum while ignoring the spectrum of the noise. The performance of the method is first examined using the transversal linear predictor filter. With this method, however, the tone quality of the predicted voice deteriorates because of the low S-N ratio. An additional processing circuit is then proposed that adjusts the noise reduction circuit in order to reduce the tone deterioration. Next, we consider a practical application in which the effects of round-off errors arising from fixed-point computation have to be minimized. This is achieved by using the lattice predictor filter, which, in comparison with the transversal type, is known to be less sensitive to the round-off error associated with finite word length operations. Finally, we consider a practical application where noise reduction is necessary. In this noise reduction method, both the voice spectrum and the actual noise spectrum are estimated. Noise reduction is achieved by using the linear predictor filter, which includes control of the predictor filter coefficients' update.

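
A transversal linear predictor estimates each sample as a weighted sum of the preceding samples; the predictable (correlated) part of the input appears at the predictor output, while unpredictable noise is left in the prediction error. The sketch below adapts the weights with normalized LMS, which is an assumption: the abstract does not say which update rule or coefficient control the authors use, and the function name and default parameters are hypothetical.

```python
def nlms_linear_predictor(x, order=4, mu=0.5, eps=1e-8):
    """Adaptive transversal linear predictor using normalized LMS (assumed).

    x:     input samples.
    order: number of past samples used for each prediction.
    mu:    step size; 0 < mu < 2 for stability of normalized LMS.
    Returns (predicted, errors) for samples x[order:], where
    errors[i] = x[order + i] - predicted[i].
    """
    weights = [0.0] * order
    predicted, errors = [], []
    for n in range(order, len(x)):
        past = x[n - order:n][::-1]  # most recent sample first
        y = sum(w * p for w, p in zip(weights, past))
        e = x[n] - y
        # Normalize the step by the input energy so the update is scale-free.
        energy = sum(p * p for p in past) + eps
        weights = [w + mu * e * p / energy for w, p in zip(weights, past)]
        predicted.append(y)
        errors.append(e)
    return predicted, errors
```

On a strongly periodic input such as a sinusoid, the prediction error shrinks as the weights adapt; on white noise it does not, which is what lets the predictor output favor the voice-like component.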