Search | Korea Science

An Enhanced Clarity of Husky Voice by Dissonant Frequency Filtering

Kang, Sang-Ki;Baek, Seong-Joon
- Speech Sciences, v.12 no.4, pp.71-76, 2005
There have been numerous studies on the enhancement of noisy speech signals. In this paper, we propose a new speech enhancement method that combines filtering of a dissonant frequency with a noise suppression algorithm. The simulation results indicate that the proposed method provides a significant gain in voice clarity. Therefore, if the proposed enhancement scheme is used as a pre-filter, the perceptual clarity of husky voice is greatly enhanced.

Robust speech quality enhancement method against background noise and packet loss at voice-over-IP receiver (배경잡음 및 패킷손실에 강인한 voice-over-IP 수신단 기반 음질향상 기법)

Kim, Gee Yeun;Kim, Hyoung-Gook
- The Journal of the Acoustical Society of Korea, v.37 no.6, pp.512-517, 2018
Improving voice quality is a major concern in telecommunications. In this paper, we propose a speech quality enhancement method for the VoIP (Voice-over-IP) receiver that is robust against background noise and packet loss. The proposed method combines network jitter estimation based on a hybrid Markov chain, adaptive playout scheduling using the estimated jitter, and speech enhancement based on restoration of amplitude and phase to enhance the quality of the speech signal arriving at the VoIP receiver over an IP network. The experimental results show that the proposed method removes the background noise added to the speech signal before encoding at the sender side and provides enhanced speech quality in an unstable network environment.
https://doi.org/10.7776/ASK.2018.37.6.512

Speech Enhancement Based on Voice/Unvoice Classification (유성음/무성음 분리를 이용한 잡음처리)

유창동
- The Journal of the Acoustical Society of Korea, v.21 no.4, pp.374-379, 2002
In this paper, a novel method to reduce noise using voice/unvoice classification is proposed. Voiced and unvoiced segments are an important feature of speech, and the proposed method processes noisy speech differently for each voice/unvoice part. Speech is classified into voice/unvoice using zero-crossing rate and energy, and a modified speech/noise dominant-decision is proposed based on voice/unvoice classification. The proposed method was tested under white noise and airplane noise conditions; based on a segmental SNR comparison with the existing method and on listening to the enhanced speech, the proposed method performed better than the existing method.
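The zero-crossing-rate and energy classification mentioned in this abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the thresholds (`zcr_threshold`, `energy_threshold`), and the synthetic test frames are all assumptions chosen for illustration, and a real system would tune the thresholds to the signal level and noise conditions.

```python
import math

def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))
    return crossings / (len(frame) - 1)

def short_time_energy(frame):
    """Mean squared amplitude of the frame."""
    return sum(x * x for x in frame) / len(frame)

def classify_frame(frame, zcr_threshold=0.25, energy_threshold=0.01):
    """Label a frame 'voiced' when energy is high and ZCR is low.

    Both thresholds are hypothetical values, not taken from the paper.
    """
    zcr = zero_crossing_rate(frame)
    energy = short_time_energy(frame)
    if energy >= energy_threshold and zcr <= zcr_threshold:
        return "voiced"
    return "unvoiced"

# Synthetic 30 ms frames at an assumed 8 kHz sampling rate.
sample_rate = 8000
frame_len = 240
# Strong low-frequency tone: stands in for a voiced segment.
voiced_like = [0.5 * math.sin(2 * math.pi * 150 * n / sample_rate)
               for n in range(frame_len)]
# Weak, rapidly alternating samples: stands in for an unvoiced segment.
unvoiced_like = [0.05 * (1 if n % 2 == 0 else -1) for n in range(frame_len)]

print(classify_frame(voiced_like))
print(classify_frame(unvoiced_like))
```

Voiced speech tends to have high energy and few zero crossings, while unvoiced speech tends to be weaker and noise-like, which is why the two features are combined in a single decision here.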

Speech Enhancement for Voice commander in Car environment (차량환경에서 음성명령어기 사용을 위한 음성개선방법)

백승권;한민수;남승현;이봉호;함영권
- Journal of Broadcast Engineering, v.9 no.1, pp.9-16, 2004
In this paper, we present a speech enhancement method as a pre-processor for a voice commander in a car environment. For friendly and safe use of a voice commander in a running car, non-stationary audio signals such as music and non-candidate speech should be reduced. Our technique is based on two microphones and consists of two parts: Blind Source Separation (BSS) and Kalman filtering. First, BSS operates as a spatial filter to deal with non-stationary signals, and then car noise is reduced by Kalman filtering as a temporal filter. Algorithm performance is tested for speech recognition, and the results show that our two-microphone technique can be a good candidate for a voice commander.

Signal Enhancement of a Variable Rate Vocoder with a Hybrid domain SNR Estimator

Park, Hyung Woo
- KSII Transactions on Internet and Information Systems (TIIS), v.13 no.2, pp.962-977, 2019
The human voice is a convenient means of transferring information between people, between people and machines, and between machines. With the development of information and communication technology, the voice can now be transmitted farther than before. To communicate, the voice is converted to another form, transmitted, and then converted back to sound. In this process, a vocoder is the method of converting and reconverting voice and sound. The CELP (Code-Excited Linear Prediction) type vocoder, one of the voice codecs, has been adopted as a standard codec because it provides high-quality sound even though its transmission rate is relatively low. The EVRC (Enhanced Variable Rate CODEC) and QCELP (Qualcomm Code-Excited Linear Prediction), both variable bit rate vocoders, are used for mobile phones in the 3G environment. In real-time implementations of a vocoder, reduced sound quality is a typical problem. To improve the sound quality, it is important to know the size and shape of the noise. Existing sound quality improvement methods rely on voice activity detection or on statistical methods that use a large amount of data. However, these have the disadvantage that noise cannot be detected when the signal is continuous or when the noise changes greatly. This paper focuses on finding a better way to limit the loss of sound quality in low bit rate transmission environments. Based on simulation results, this study proposes a preprocessor application that estimates the SNR (Signal to Noise Ratio) using a spectral SNR estimation method. The SNR estimation method adopted the IMBE (Improved Multi-Band Excitation) instead of using the SNR, which is a continuous speech signal. Finally, this application improves the quality of the vocoder by enhancing sound quality adaptively.
https://doi.org/10.3837/tiis.2019.02.026

Feedback Active Noise Control Based Voice Enhancing Ear-Protection System

Moon, Seong-Pil;Chang, Tae-Gyu
- Journal of Electrical Engineering and Technology, v.12 no.4, pp.1627-1633, 2017
This paper proposes a voice enhancing ear-protection system based on feedback active noise control (FBANC). The proposed system selectively suppresses the background noise and preserves the talking voice by controlling the adaptive algorithm with a voice activity period detection module. The noise reduction performance of the proposed noise canceling algorithm is analytically derived for the two key performance-affecting parameters, i.e., electro-acoustic coupling distance and noise bandwidth. The proposed system is also implemented on a floating-point DSP system, and its performance is experimentally tested for comparison with the analytically derived results. The achieved levels of noise reduction for the three noise bandwidth cases, i.e., 10 Hz, 50 Hz, and 90 Hz, are 17.05 dB, 10.54 dB, and 8.99 dB, respectively. The feasibility of the proposed system is also shown by a peak noise reduction of more than 25 dB while preserving the voice component in the frequency range of 200-800 Hz.
https://doi.org/10.5370/JEET.2017.12.4.1627

An Improved VAD Algorithm Employing Speech Enhancement Preprocessing and Threshold Updating (음성 향상 전처리와 문턱값 갱신을 적용한 향상된 음성검출 방법)

이윤창;안상식
- The Journal of Korean Institute of Communications and Information Sciences, v.28 no.11C, pp.1161-1168, 2003
In this paper, we propose an improved statistical model-based voice activity detection algorithm and a threshold update method. We first improve the signal-to-noise ratio using a speech enhancement preprocessing algorithm that combines the power subtraction method with a matched filter, then apply the result to the LLR test optimum decision rule to improve performance even in low SNR conditions. We also propose an adaptive threshold update method, which has not been addressed in previous papers. We perform extensive computer simulations to demonstrate the performance improvement of the proposed VAD algorithm, which employs the proposed speech enhancement preprocessing algorithm and adaptive threshold update method, under various background noise environments. Finally, we verify our results by comparison with ITU-T G.729 Annex B.
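The power subtraction step named in this abstract can be illustrated with a short Python sketch. This shows only the generic textbook form (subtract an estimated noise power from the noisy power in each frequency bin and clamp the result to a small spectral floor); the paper's actual preprocessing, including the matched filter and the LLR test, is not reproduced. The function name, the `floor` parameter, and the example values are assumptions.

```python
def power_subtraction(noisy_power, noise_power, floor=0.01):
    """Subtract estimated noise power from noisy power, bin by bin.

    noisy_power: per-bin power spectrum of the noisy frame.
    noise_power: per-bin noise power estimate (for example, averaged
                 over frames assumed to contain no speech).
    floor:       hypothetical spectral floor, as a fraction of the noisy
                 power, that keeps the result from going negative.
    """
    if len(noisy_power) != len(noise_power):
        raise ValueError("spectra must have the same number of bins")
    return [max(p - n, floor * p) for p, n in zip(noisy_power, noise_power)]

# Illustrative values only: bin 0 is speech-dominated, bin 1 is noise-dominated.
noisy = [4.0, 1.0]
noise = [1.0, 2.0]
print(power_subtraction(noisy, noise))
```

In the second bin the noise estimate exceeds the observed power, so the output falls back to the floor instead of becoming negative.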

Voice quality distinctions of the three-way stop contrast under prosodic strengthening in Korean

Jiyoung Jang;Sahyang Kim;Taehong Cho
- Phonetics and Speech Sciences, v.16 no.1, pp.17-24, 2024
The Korean three-way stop contrast (lenis, aspirated, fortis) is currently undergoing a sound change, such that the primary cue distinguishing lenis and aspirated stops is shifting from voice onset time (VOT) to F0. Despite recent discussions of this shift, research on voice quality, traditionally considered an additional cue signaling the contrast, remains sparse. This study investigated the extent to which the associated voice quality [as reflected in the acoustic measurements of H1*-H2*, H1*-A1*, and cepstral peak prominence (CPP)] contributes to the three-way stop contrast, and how the realization is conditioned by prominence- vs. boundary-induced prosodic strengthening amid the ongoing sound change. Results for 12 native Korean speakers indicate that there was a substantial distinction in voice quality among the three stop categories with the breathiness of the vowel being the greatest after the lenis, intermediate after the aspirated, and least after the fortis stops, indicating the role of voice quality in the maintenance of the three-way stop contrast. Furthermore, prosodic strengthening has different effects on the contrast and contributes to the enhancement of the phonological contrast contingent on whether it is induced by prominence or boundary.
https://doi.org/10.13064/KSSS.2024.16.1.017

Complex nested U-Net-based speech enhancement model using a dual-branch decoder (이중 분기 디코더를 사용하는 복소 중첩 U-Net 기반 음성 향상 모델)

Seorim Hwang;Sung Wook Park;Youngcheol Park
- The Journal of the Acoustical Society of Korea, v.43 no.2, pp.253-259, 2024
This paper proposes a new speech enhancement model based on a complex nested U-Net with a dual-branch decoder. The proposed model consists of a complex nested U-Net that simultaneously estimates the magnitude and phase components of the speech signal, and a dual-branch decoder that performs spectral mapping and time-frequency masking in separate branches. Compared to a single-branch decoder structure, the dual-branch decoder structure allows noise to be removed effectively while minimizing the loss of speech information. The experiment was conducted on the VoiceBank + DEMAND database, commonly used for speech enhancement model training, and the model was evaluated using various objective evaluation metrics. In the experiment, the complex nested U-Net-based speech enhancement model using a dual-branch decoder increased the Perceptual Evaluation of Speech Quality (PESQ) score by about 0.13 compared to the baseline and showed a higher objective evaluation score than recently proposed speech enhancement models.
https://doi.org/10.7776/ASK.2024.43.2.253
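
The time-frequency masking named in the abstract above can be illustrated in a few lines of Python. The sketch shows only the generic idea of a complex mask: each time-frequency bin of the noisy spectrogram is multiplied by a complex value, which changes magnitude and phase together. It does not represent the paper's network or how its two decoder branches are combined. The function name and the toy one-bin values are assumptions.

```python
def apply_complex_mask(noisy_spec, mask):
    """Multiply each time-frequency bin by its complex mask value.

    noisy_spec and mask are lists of frames, each frame a list of
    complex numbers (one per frequency bin). In a real model the mask
    would be predicted by a network; here it is supplied by hand.
    """
    return [
        [m * x for m, x in zip(mask_frame, spec_frame)]
        for mask_frame, spec_frame in zip(mask, noisy_spec)
    ]

# Toy example: one frame, one bin. The mask halves the magnitude
# and rotates the phase by 90 degrees.
noisy_spec = [[1 + 1j]]
mask = [[0.5j]]
enhanced = apply_complex_mask(noisy_spec, mask)
print(enhanced)
```

A spectral mapping branch, by contrast, would output the enhanced spectrum values directly instead of a multiplier applied to the noisy input.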

A study on Voice Recognition using Model Adaptation HMM for Mobile Environment (모델적응 HMM을 이용한 모바일환경에서의 음성인식에 관한 연구)

Ahn, Jong-Young;Kim, Sang-Bum;Kim, Su-Hoon;Hur, Kang-In
- The Journal of the Institute of Internet, Broadcasting and Communication, v.11 no.3, pp.175-179, 2011
In this paper, we propose the MA (Model Adaptation) HMM, which uses speech enhancement and feature compensation. Normally, voice reference data does not take real noise data into account. Our method uses real-life environmental noise data rather than estimated noise, and we apply this contaminated data to build a recognition reference model suited to noisy environments. MAHMM combines surrounding noise into the data when generating the reference pattern. Using MAHMM, we improved the voice recognition rate in a mobile environment.
https://doi.org/10.7236/JIWIT.2011.11.3.175
