Search | Korea Science

Speech Enhancement Based on Soft Decision for Effective Noise Suppression (효율적인 잡음억제를 위한 Soft Decision 기반의 음성향상 기법)

Lim Hyoung-Keun;Kim Yu-Jin;Chung Jae-Ho
- Proceedings of the Acoustical Society of Korea Conference / spring / pp.47-50 / 2000
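Among methods for obtaining enhanced speech from speech corrupted by uncorrelated additive noise, speech enhancement based on soft decision is known to perform very well. Soft decision processes the noise added to speech in the frequency domain and depends on prior information about the noise environment. In this study, the additive noise signal is processed nonlinearly on the basis of soft decision so that the noise contained in the speech is estimated effectively, and we propose a method that suppresses noise efficiently without prior information about the noise environment. The proposed speech enhancement technique outperformed existing methods in subjective sound quality evaluation.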

A Nonuniform Sampling Technique and Its Application to Speech Coding (비균등 표본화 기법과 음성 부호화로의 응용)

Iem, Byeong-Gwan
- Journal of the Korean Institute of Intelligent Systems / v.24 no.1 / pp.28-32 / 2014
For a signal such as speech, which shows a piecewise-linear shape over very short time periods, a nonuniform sampling method based on inflection point detection (IPD) is proposed to reduce the data rate. The method exploits the geometrical characteristics of the signal more fully than the existing local maxima/minima detection (MMD) based sampling method. As a result, the signal reconstructed by interpolating the IPD-sampled data resembles the original speech more closely. Computer simulation shows that the proposed IPD-based method yields an improvement of about 9 to 23 dB over the existing MMD method. To show the usefulness of the IPD technique, it is applied to speech coding and compared with continuously variable slope delta modulation (CVSD). Each nonuniformly sampled value is binary coded together with a one-bit flag set to 1. Non-inflection samples are not sent; only a flag bit set to 0 is sent for them. The method shows improvements of 0.3 to 9 dB in SNR and 0.5 to 1.3 in mean opinion score (MOS) over CVSD.
https://doi.org/10.5391/JKIIS.2014.24.1.028
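The idea of keeping only the samples where a piecewise-linear signal changes direction, then interpolating between them, can be sketched as follows. This is a minimal illustration and not the paper's algorithm: the abstract does not give the detection rule, so the sketch assumes a sample is kept wherever the discrete second difference exceeds a small tolerance. The names `ipd_sample`, `reconstruct`, and `tol` are hypothetical, and the input is a synthetic toy signal rather than speech.

```python
import numpy as np

def ipd_sample(x, tol=1e-9):
    """Return indices of samples to keep (assumed rule: slope changes here)."""
    x = np.asarray(x, dtype=float)
    d2 = np.diff(x, n=2)                      # d2[k] is centred on sample k + 1
    corners = np.nonzero(np.abs(d2) > tol)[0] + 1
    # Always keep both endpoints so interpolation covers the whole frame.
    return np.unique(np.concatenate(([0], corners, [len(x) - 1])))

def reconstruct(idx, values, n):
    """Rebuild an n-sample frame by linear interpolation of the kept samples."""
    return np.interp(np.arange(n), idx, values)

# Toy piecewise-linear frame with corners at samples 10, 25 and 40.
x = np.interp(np.arange(50), [0, 10, 25, 40, 49], [0.0, 1.0, -0.5, 0.8, 0.0])
idx = ipd_sample(x)
x_hat = reconstruct(idx, x[idx], len(x))
print(f"kept {len(idx)} of {len(x)} samples, "
      f"max error {np.max(np.abs(x - x_hat)):.2e}")
```

On real speech the signal is only approximately piecewise linear, so `tol` would trade data rate against reconstruction error; the flag-bit coding described in the abstract is not modelled here.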

A Study on Heart Disease Diagnosis by Speech Analysis (음성신호분석에 의한 심장질환진단 방법에 관한 연구)

Cho, Dong-Uk;Kim, Bong-Hyun;Kim, Seung-Youn
- Proceedings of the Korea Information Processing Society Conference / 2005.11a / pp.557-560 / 2005
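While Western medicine continues to advance thanks to the development of various diagnostic instruments, Oriental medicine relies on the clinician's intuition and lacks instruments that can present a patient's disease state visually and objectively. If diagnostic results could be visualized and made objective, confidence in Oriental medical diagnosis would improve. To this end, and to raise the reliability and accuracy of Oriental diagnostic methods, this paper seeks to identify and analyze, on the basis of Oriental medicine, the relationship between speech signals and the heart, which Oriental medicine regards as the central organ governing the body and the source of life and spirit. In particular, because the heart is associated with the tongue among the body's organs, we propose a method for judging the presence of heart disease that focuses on the unclear articulation of lingual sounds by patients with heart disease. Finally, we aim to demonstrate the usefulness of the proposed method through experiments.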

A car number retrieving system using speech recognition for PDA (PDA상에서 음성인식을 이용한 차량번호 조회시스템)

김우성;김동환;윤재선;홍광석
- Proceedings of the Korea Institute of Convergence Signal Processing / 2001.06a / pp.281-284 / 2001
In this paper, we present a car number retrieval system for PDAs that uses speech recognition and speech synthesis. The system consists of 4-digit number and command speech recognition together with speech synthesis. Experimental results showed a recognition rate of 97% for 4-digit numbers and 99% for commands using a speaker-independent method.

Automatic Syllable Segmentation Algorithm in Noise Additional Continuous Speech (잡음이 첨가된 연속음성에서의 자동 음절분할 알고리즘)

Kim, Young-Sub;Cha, Young-Dong;Kim, Chang-Keun;Lee, Kwang-Seok;Hur, Kang-In
- Proceedings of the Korea Institute of Convergence Signal Processing / 2006.06a / pp.17-20 / 2006
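For automatic syllable segmentation in noisy continuous speech, this paper proposes, in addition to the conventional short-time energy feature, new noise-robust features: a spectral density comparison measure and a linear discriminant function based on the pseudo-inverse matrix. The conventional short-time energy performs well in noise-free environments but not in noisy ones. The proposed measures behave in the opposite way. Therefore, by additionally using a syllable-interval decision function that combines the parameters with weights chosen according to the ambient noise level, together with a finite-state machine, syllable intervals can be separated at or above a certain level of accuracy in noisy continuous speech as well as in noise-free conditions.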
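A linear discriminant function obtained with the pseudo-inverse, as mentioned in the abstract above, can be sketched as a least-squares fit of class labels on the features. This is a generic illustration under stated assumptions, not the authors' implementation: the function names are hypothetical, the single toy feature merely stands in for measures such as short-time energy, and the labels +1 (syllable) and -1 (non-syllable) are an assumed encoding.

```python
import numpy as np

def fit_linear_discriminant(features, labels):
    """Least-squares weights w = pinv(X) @ y, with a bias column appended to X."""
    features = np.asarray(features, dtype=float)
    X = np.hstack([features, np.ones((features.shape[0], 1))])
    return np.linalg.pinv(X) @ np.asarray(labels, dtype=float)

def predict(weights, features):
    """Classify each frame by the sign of the discriminant value."""
    features = np.asarray(features, dtype=float)
    X = np.hstack([features, np.ones((features.shape[0], 1))])
    return np.where(X @ weights >= 0, 1, -1)

# Toy data: one feature per frame, low values = non-syllable (-1), high = syllable (+1).
train_x = np.array([[0.1], [0.2], [0.8], [0.9]])
train_y = np.array([-1, -1, 1, 1])
w = fit_linear_discriminant(train_x, train_y)
print(predict(w, np.array([[0.0], [1.0]])))
```

The pseudo-inverse gives the minimum-norm least-squares solution even when the feature matrix is rank deficient, which is one reason it is a common choice for this kind of discriminant.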

Automatic Dialog System for the Elderly with Dementia (치매노인을 위한 자동대화시스템)

Kim, Sung-ill;Joo, Chang-bok;Shin, Wee-jae
- Proceedings of the Korea Institute of Convergence Signal Processing / 2003.06a / pp.137-140 / 2003
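This study aims to develop a dialog system that improves the quality of life of elderly patients with dementia. The proposed system consists mainly of three modules: speech recognition, automatic retrieval from a dialog database partitioned by a time table, and responses using a nurse's recorded voice. We first surveyed the utterances that dementia patients frequently produce in care facilities, and then configured the system to recognize their speech and respond appropriately. To evaluate the system, we compared conditions with and without the system in place. Without the system, nurses were free to provide care services as usual. Examination of the subjects' behavior and reactions through video recording showed that the dialog system was more responsive than the nurses in meeting the patients' needs. In addition, the proposed system was found to encourage patients to speak more during the conversation.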

Recognition Performance Comparison to Various Features for Speech Recognizer Using Support Vector Machine (음성 인식기를 위한 다양한 특징 파라메터의 SVM 인식 성능 비교)

김평환;박정원;김창근;이광석;허강인
- Proceedings of the Korea Institute of Convergence Signal Processing / 2003.06a / pp.78-81 / 2003
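This paper proposes effective feature parameters for a speech recognizer using an SVM (support vector machine). An SVM classifies by finding a nonlinear boundary in feature space and is known to show good classification performance even with little training data. To select the optimal feature parameters, this paper uses an SVM-based speech recognizer and applies the PCA (principal component analysis) and ICA (independent component analysis) algorithms to transform the feature space of the MFCC (mel frequency cepstrum coefficient), comparing the recognition performance of each. In the experiments, the ICA-based feature parameters gave the best performance, and the class distributions in feature space also showed the highest linear separability with ICA.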

On a Pitch Alteration Method using Scaling the Harmonics Compensated with the Phase for Speech Synthesis (위상 보상된 고조파 스케일링에 의한 음성합성용 피치변경법)

Bae, Myung-Jin
- The Journal of the Acoustical Society of Korea / v.13 no.6 / pp.91-97 / 1994
In speech processing, waveform coding is concerned simply with preserving the waveform of the signal through a redundancy reduction process. In speech synthesis, high-quality waveform coding is used mainly for synthesis by analysis. Because the parameters of this coding are not separated into excitation and vocal tract components, it is difficult to apply waveform coding to synthesis by rule. To apply it to synthesis by rule, the pitch must be altered. In this paper, we propose a new pitch alteration method that can change the pitch period in waveform coding by dividing the speech signal into vocal tract and excitation parameters. It is a time-frequency domain method that preserves the phase component of the waveform in the time domain and the magnitude component in the frequency domain. This makes it possible to use waveform coding for synthesis by rule. With the proposed algorithm, the spectrum distortion is $2.94\%$, which is $5.06\%$ lower than that of the time-domain pitch alteration method.

A Study on the Effectiveness of the Lungs Hand Acupuncture Based on Bio Signal Analysis (생체신호분석 기술을 적용한 폐 수지침 요법에 대한 효과성 연구)

Kim, Bong-Hyun;Cho, Dong-Uk
- The KIPS Transactions:PartB / v.19B no.2 / pp.77-82 / 2012
In this paper, we studied the effectiveness of stimulating the hand acupuncture points corresponding to the lungs, using analysis parameters for image and audio signals. To this end, we collected facial images and voice recordings from 25 men in their twenties before and after stimulating the points on the hand corresponding to the lungs. Based on the collected data, we analyzed changes in the color of the right cheek area, which corresponds to the lungs in Oriental medicine diagnostic theory, as well as changes in voice energy and speaking rate. After hand acupuncture, the L value of the right cheek area decreased by 2.33 on average, and the a and b values increased by 0.76 and 0.97 on average, respectively. In addition, voice energy increased by 0.42 on average and speaking rate decreased by 0.07 on average. These results suggest that stimulating the hand acupuncture points corresponding to the lungs improved lung function.
https://doi.org/10.3745/KIPSTB.2012.19B.2.077

Speech Spectrum Enhancement Combined with Frequency-weighted Spectrum Shaping Filter and Wiener Filter (주파수가중 스펙트럼성형필터와 위너필터를 결합한 음성 스펙트럼 강조)

Choi, Jae-Seung
- Journal of the Korea Institute of Information and Communication Engineering / v.20 no.10 / pp.1867-1872 / 2016
In digital signal processing, it is necessary to improve the quality of a speech signal by removing the background noise present in various real environments. An important consideration when removing background noise is that the human auditory mechanism relies mainly on the amplitude spectrum of the speech signal. This paper introduces the characteristics of a frequency-weighted spectrum shaping filter whose primary purpose is to extract the amplitude spectrum of the speech signal. It then proposes an algorithm that, after extracting the amplitude spectral information from the noisy speech signal, applies a Wiener filter and the frequency-weighted spectrum shaping filter according to an acoustic model. In experiments, the spectral distortion (SD) of the proposed algorithm improved by more than 5.28 dB compared to a conventional method.
https://doi.org/10.6109/jkiice.2016.20.10.1867
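For readers unfamiliar with the Wiener filter component, the textbook frequency-domain Wiener gain applied to the amplitude spectrum can be sketched as below. This is a generic illustration, not the proposed algorithm: the frequency-weighted spectrum shaping filter is not modelled, the noise power spectrum is assumed to be known, and `wiener_gain` is a hypothetical helper name.

```python
import numpy as np

def wiener_gain(noisy_psd, noise_psd):
    """Per-bin Wiener gain G = snr / (1 + snr), with snr = max(Py / Pn - 1, 0)."""
    noisy_psd = np.asarray(noisy_psd, dtype=float)
    noise_psd = np.asarray(noise_psd, dtype=float)
    # Guard against division by zero in bins with no estimated noise.
    snr = np.maximum(noisy_psd / np.maximum(noise_psd, 1e-12) - 1.0, 0.0)
    return snr / (1.0 + snr)

# Toy example: three frequency bins with the same assumed noise power.
gain = wiener_gain([4.0, 2.0, 1.0], [1.0, 1.0, 1.0])

# Apply the gain to the amplitude spectrum only and keep the noisy phase.
noisy_spectrum = np.array([2.0 + 0.0j, 0.0 + 1.4j, -1.0 + 0.0j])
enhanced_spectrum = gain * noisy_spectrum
print(gain, np.abs(enhanced_spectrum))
```

Bins where the noisy power barely exceeds the noise estimate receive a gain near 0, while bins dominated by speech are passed almost unchanged.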

Search Result 474, Processing Time 0.031 seconds
