• Title/Summary/Keyword: Speech signal


Speech Enhancement Using Lip Information and SFM (입술정보 및 SFM을 이용한 음성의 음질향상알고리듬)

  • Baek, Seong-Joon;Kim, Jin-Young
    • Speech Sciences / v.10 no.2 / pp.77-84 / 2003
  • In this research, we locate the beginning of speech and detect the stationary speech region using lip information. By performing a running average of the estimated speech signal in the stationary region, we reduce the musical noise inherent to the conventional MMSE (Minimum Mean Square Error) speech enhancement algorithm. In addition, the SFM (Spectral Flatness Measure) is incorporated to reduce the speech signal estimation error caused by speaking habits and missing lip information. According to an MOS (Mean Opinion Score) test, the proposed algorithm with Wiener filtering shows superior performance to the conventional methods.
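For reference, the SFM used above is a standard measure of how noise-like a spectrum is: the ratio of the geometric to the arithmetic mean of the power spectrum. A minimal sketch, with illustrative function name and sample values (not taken from the paper); it assumes strictly positive power values:

```python
import math

def spectral_flatness(power_spectrum):
    """Spectral Flatness Measure: geometric mean over arithmetic mean
    of the power spectrum. Near 1 for noise-like (flat) spectra,
    near 0 for tonal/voiced spectra. Assumes all values are positive."""
    n = len(power_spectrum)
    # Geometric mean computed in the log domain for numerical stability.
    log_gm = sum(math.log(p) for p in power_spectrum) / n
    am = sum(power_spectrum) / n
    return math.exp(log_gm) / am

flat = spectral_flatness([1.0, 1.0, 1.0, 1.0])        # flat spectrum -> 1.0
peaky = spectral_flatness([100.0, 0.01, 0.01, 0.01])  # peaky spectrum -> near 0
```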


Adaptive echo canceller combined with speech coder for mobile communication systems (이동통신 시스템을 위한 음성 부호화기와 결합된 적응 반향제거기에 관한 연구)

  • 이인성;박영남
    • The Journal of Korean Institute of Communications and Information Sciences / v.23 no.7 / pp.1650-1658 / 1998
  • This paper describes how to remove echoes effectively using speech parameter information provided from the speech coder. More specifically, the proposed adaptive echo canceller uses the excitation signal or the linear prediction error signal, instead of the output speech signal of the vocoder, as the input to the adaptation algorithm. The normalized least mean square (NLMS) algorithm is used for the adaptive echo canceller. In simulations, the proposed algorithm showed fast convergence characteristics compared to the conventional method. In particular, the echo canceller using the excitation signal of the speech coder converged about four times faster than the echo canceller using the output speech signal of the speech coder as the adaptation input.
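The NLMS update referred to above is standard; a minimal single-step sketch, assuming a direct-form FIR echo-path model (the function name and the step size mu=0.5 are illustrative):

```python
def nlms_step(w, x, d, mu=0.5, eps=1e-8):
    """One update of the normalized LMS algorithm.
    w: filter taps; x: most recent input samples (same length as w,
    newest first); d: desired sample (near-end microphone signal).
    Returns (updated taps, error e = d - y)."""
    y = sum(wi * xi for wi, xi in zip(w, x))   # filter output (echo estimate)
    e = d - y                                  # error = echo-cancelled signal
    norm = sum(xi * xi for xi in x) + eps      # input energy, regularized
    w = [wi + mu * e * xi / norm for wi, xi in zip(w, x)]
    return w, e

# Identifying a one-tap echo path of gain 0.5: the tap converges toward 0.5.
w = [0.0]
for _ in range(50):
    w, e = nlms_step(w, [1.0], 0.5)
```

The paper's point is the choice of `x`: the coder's excitation signal is spectrally much whiter than the output speech, which is what speeds up convergence.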


Performance Comparison on Speech Codecs for Digital Watermarking Applications

  • Mamongkol, Y.;Amornraksa, T.
    • Proceedings of the IEEK Conference / 2002.07a / pp.466-469 / 2002
  • Using intelligible information contained within speech to identify specific hidden data in watermarked multimedia is considered an efficient approach to speech digital watermarking. This paper presents a performance comparison between various types of speech codec in order to determine an appropriate one for digital watermarking applications. In the experiments, the speech signal encoded by four different speech codecs, namely the CELP, GSM, SBC, and G.723.1 codecs, is embedded into a grayscale image, and their performance in terms of speech recognition is compared. The method for embedding the speech signal into the host data is borrowed from a watermarking method based on zerotrees of wavelet packet coefficients. To evaluate the efficiency of each speech codec in watermarking applications, the speech signal extracted from the attacked watermarked image is played back to listeners, who then judge whether its content is intelligible.


Analysis of Transient Features in Speech Signal by Estimating the Short-term Energy and Inflection points (변곡점 및 단구간 에너지평가에 의한 음성의 천이구간 특징분석)

  • Choi, I.H.;Jang, S.K.;Cha, T.H.;Choi, U.S.;Kim, C.S.
    • Speech Sciences / v.3 / pp.156-166 / 1998
  • In this paper, we propose a segmentation method that estimates the inflection points and the average magnitude energy of speech signals. The proposed method not only gives a satisfactory solution to the problems of segmentation based on the zero-crossing rate, but can also estimate the features of the transient period after separating the starting point and the transient period that precede the steady state. In experiments with monosyllabic speech, the starting and ending points of the speech signals were divided exactly by this method, even when the speech samples contained a DC offset. In addition, features such as the length of the transient period, the short-term energy, and the frequency characteristics of each speech signal could be compared.
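The frame-level measures involved here are classical; a pure-Python sketch of short-term energy, zero-crossing rate, and a simple inflection-point detector (frame lengths and decision thresholds are the paper's and are not reproduced):

```python
def short_term_energy(frame):
    """Mean squared amplitude of one analysis frame."""
    return sum(s * s for s in frame) / len(frame)

def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose sign differs."""
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))
    return crossings / (len(frame) - 1)

def inflection_points(x):
    """Sample indices where the second difference changes sign
    (i.e. where the curvature of the waveform flips)."""
    d2 = [x[i - 1] - 2 * x[i] + x[i + 1] for i in range(1, len(x) - 1)]
    return [i + 2 for i in range(len(d2) - 1) if (d2[i] >= 0) != (d2[i + 1] >= 0)]
```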


Effects of the Types of Noise and Signal-to-Noise Ratios on Speech Intelligibility in Dysarthria (소음 유형과 신호대잡음비가 마비말장애인의 말명료도에 미치는 영향)

  • Lee, Young-Mee;Sim, Hyun-Sub;Sung, Jee-Eun
    • Phonetics and Speech Sciences / v.3 no.4 / pp.117-124 / 2011
  • This study investigated the effects of the type of noise and the signal-to-noise ratio (SNR) on the speech intelligibility of an adult with dysarthria. Speech intelligibility was judged by 48 naive listeners using a word transcription task. A repeated-measures design was used with the type of noise (multi-talker babble/environmental noise) and SNR (0 dB, +10 dB, +20 dB) as within-subject factors. The dependent measure was the percentage of correctly transcribed words. Results revealed that both main effects were statistically significant: listeners performed significantly worse in the multi-talker babble condition than in the environmental noise condition, and significantly better at higher SNRs. These results suggest that multi-talker babble and lower SNRs decrease the speech intelligibility of adults with dysarthria, and that speech-language pathologists should consider environmental factors such as noise type and SNR when evaluating the speech intelligibility of adults with dysarthria.
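SNR conditions like those above are typically created by scaling the noise before mixing so that the power ratio matches the target. A minimal sketch using the standard power-ratio definition (function name and sample values are illustrative, not from the study):

```python
import math

def mix_at_snr(speech, noise, snr_db):
    """Scale noise so the mixture speech + g*noise has the requested SNR (dB)."""
    ps = sum(s * s for s in speech) / len(speech)   # speech power
    pn = sum(n * n for n in noise) / len(noise)     # noise power
    g = math.sqrt(ps / (pn * 10 ** (snr_db / 10)))  # gain for target ratio
    return [s + g * n for s, n in zip(speech, noise)]
```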


Classification of Speech and Car Noise Signals using the Slope of Autocovariances in Frequency Domain (주파수 영역 자기 공분산 기울기를 이용한 음성과 자동차 소음 신호의 구분)

  • Kim, Seon-Il
    • Journal of the Korea Institute of Information and Communication Engineering / v.15 no.10 / pp.2093-2099 / 2011
  • A speech signal and a car noise signal, such as muffler noise, are segregated from a mixture of the two using a statistical method. To classify which of the segregated signals is speech, FFT coefficients are obtained for all segments of each signal, where each segment consists of 128 samples. For several FFT coefficients corresponding to the low frequencies of the signal, autocovariances are calculated between same-order coefficients across all segments and then averaged. A linear equation is fitted to these averaged autocovariances using linear regression for each signal; the slope of this line provides the reference for deciding which signal is speech. This is what the paper proposes, and the results show it to be very useful.
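The line fit at the heart of the classifier is ordinary least squares against the coefficient index; a closed-form sketch (the decision threshold applied to the slope is empirical in the paper and is not reproduced here):

```python
def slope(y):
    """Least-squares slope of y against the indices 0..n-1 (closed form)."""
    n = len(y)
    mx = (n - 1) / 2              # mean of 0..n-1
    my = sum(y) / n
    num = sum((x - mx) * (yi - my) for x, yi in enumerate(y))
    den = sum((x - mx) ** 2 for x in range(n))
    return num / den
```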

Wavelet Packet Adaptive Noise Canceller with NLMS-SUM Method Combined Algorithm (NLMS-SUM Method LMS 결합 알고리듬을 적용한 웨이브렛 패킷 적응잡음제거기)

  • 정의정;홍재근
    • Proceedings of the IEEK Conference / 1998.10a / pp.1183-1186 / 1998
  • An adaptive noise canceller can extract noise-removed speech from a noisy speech signal by adapting its filter coefficients to the background noise environment. The LMS family is among the most popular adaptive algorithms for noise cancellation because of its low complexity, good numerical properties, and ease of implementation. However, misadjustment increases during voiced speech, so the desired speech signal may not be extracted accurately. In this paper, we propose a fast and noise-robust wavelet packet adaptive noise canceller with a combined NLMS-SUM method LMS algorithm. That is, we decompose the noisy speech signal in frequency according to the proposed analysis tree structure: the NLMS algorithm in the low-frequency band efficiently eliminates the effect of low-frequency noise, and the SUM method LMS algorithm in each high-frequency band removes the high-frequency noise. The proposed wavelet packet adaptive noise canceller achieves a greater SNR improvement and, according to the Itakura-Saito (IS) distance, is closer to the clean speech signal than previous adaptive noise cancellers.


A Study on a New Pre-emphasis Method Using the Short-Term Energy Difference of Speech Signal (음성 신호의 다구간 에너지 차를 이용한 새로운 프리엠퍼시스 방법에 관한 연구)

  • Kim, Dong-Jun;Kim, Ju-Lee
    • The Transactions of the Korean Institute of Electrical Engineers D / v.50 no.12 / pp.590-596 / 2001
  • Pre-emphasis is an essential process in speech signal processing. Two widely used methods are the typical method, which uses a fixed value near unity, and the optimal method, which uses the autocorrelation ratio of the signal. This study proposes a new pre-emphasis method using the short-term energy difference of the speech signal, which can effectively compensate for the glottal source and lip radiation characteristics. Using the proposed pre-emphasis, speech analyses such as spectrum estimation and formant detection are performed, and the results are compared with those of the two conventional pre-emphasis methods. Analysis of five single vowels showed that the proposed method enhanced the spectral shapes, gave nearly constant formant frequencies, and avoided the overlapping of two adjacent formants. Comparison with FFT spectra verified these results and showed the accuracy of the proposed method. The computational complexity of the proposed method is about 50% of that of the optimal method.
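The two conventional baselines compared above are easy to state: a fixed first-order filter y[n] = x[n] - a·x[n-1] with a near unity, and the optimal coefficient a = R(1)/R(0) from the signal's autocorrelation. A minimal sketch of both (the paper's own energy-difference rule is not reproduced here):

```python
def pre_emphasis(x, alpha=0.97):
    """First-order pre-emphasis: y[n] = x[n] - alpha * x[n-1]."""
    return [x[0]] + [x[n] - alpha * x[n - 1] for n in range(1, len(x))]

def optimal_alpha(x):
    """'Optimal' coefficient: autocorrelation ratio R(1) / R(0)."""
    r0 = sum(s * s for s in x)
    r1 = sum(x[n] * x[n - 1] for n in range(1, len(x)))
    return r1 / r0
```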


A Study on Speech Separation in Cochannel using Sinusoidal Model (Sinusoidal Model을 이용한 Cochannel상에서의 음성분리에 관한 연구)

  • Park, Hyun-Gyu;Shin, Joong-In;Park, Sang-Hee
    • Proceedings of the KIEE Conference / 1997.11a / pp.597-599 / 1997
  • Cochannel speaker separation is employed when speech from two talkers has been summed into one signal and it is desirable to recover one or both of the speech signals from the composite signal. Cochannel speech occurs in many common situations, such as when two AM signals containing speech are transmitted on the same frequency or when two people speak simultaneously (e.g., on the telephone). This paper proposes a method that separates the speech in such situations; in particular, only the voiced sounds among the few sound states are separated. The similarity between the original and separated signals is then verified by their cross-correlation.
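The similarity check described above can be done with a zero-lag normalized cross-correlation; a minimal sketch (the sinusoidal-model separation itself is far beyond this snippet):

```python
import math

def normalized_cross_correlation(a, b):
    """Zero-lag normalized cross-correlation of two equal-length signals;
    1.0 means identical shape up to a positive scale factor."""
    num = sum(x * y for x, y in zip(a, b))
    den = math.sqrt(sum(x * x for x in a) * sum(y * y for y in b))
    return num / den
```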


A Study on the Diagnosis of Laryngeal Diseases by Acoustic Signal Analysis (음향신호의 분석에 의한 후두질환의 진단에 관한 연구)

  • Jo, Cheol-Woo;Yang, Byong-Gon;Wang, Soo-Geon
    • Speech Sciences / v.5 no.1 / pp.151-165 / 1999
  • This paper describes a series of studies on diagnosing vocal diseases using statistical methods and acoustic signal analysis. Speech materials were collected at a hospital. Using this pathological database, basic parameters for diagnosis are obtained. Based on the statistical characteristics of the parameters, valid parameters are chosen and used to diagnose the pathological speech signal. The cepstrum is used to extract parameters that represent the characteristics of pathological speech. A three-layered neural network is trained to classify pathological speech into normal, benign, and malignant cases.
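The real cepstrum used for parameter extraction is the inverse DFT of the log magnitude spectrum; a naive O(N^2) sketch (the paper's actual cepstral parameters and network details are not reproduced):

```python
import cmath
import math

def real_cepstrum(x):
    """Real cepstrum via naive DFT: IDFT of log|DFT(x)|.
    O(N^2); an FFT would be used in practice."""
    n = len(x)
    spec = [sum(x[k] * cmath.exp(-2j * math.pi * i * k / n) for k in range(n))
            for i in range(n)]
    log_mag = [math.log(abs(s) + 1e-12) for s in spec]  # small floor avoids log(0)
    return [sum(log_mag[i] * cmath.exp(2j * math.pi * i * k / n)
                for i in range(n)).real / n for k in range(n)]
```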
