Integrated Search | Korea Science

Speech Enhancement Based on Psychoacoustic Model

이진걸
- Proceedings of the Acoustical Society of Korea Conference / Proceedings of the Acoustical Society of Korea 2000 Summer Conference, Vol. 19, No. 1 / pp. 337-338 / 2000
The perceptual filter for speech enhancement was derived analytically so that the frequency content of the noisy input signal matches that of the estimated clean signal in the auditory domain. However, the analytical derivation relies on a deconvolution associated with the spreading function of the psychoacoustic model, which leads to an ill-conditioned problem. To cope with this, we propose a novel psychoacoustic-model-based speech enhancement filter whose principle is the same as that of the perceptual filter, but which is derived by a constrained optimization that provides solutions to the ill-conditioned problem.

A Study on the Improvements of Security and Quality for Analog Speech Scrambler

공병구;조동호
- Journal of the Korean Institute of Telematics and Electronics B / Vol. 30B, No. 9 / pp. 27-35 / 1993
In this paper, a new algorithm for high-level security and speech quality is proposed. The algorithm is based on the rearrangement of fast Fourier transform (FFT) coefficients, combined with pre- and post-filtering, a Hamming window, and adaptive pseudo spectrum insertion. The pre and post filters whiten the speech spectrum, and the adaptive pseudo spectrum is inserted so that silence and speech cannot be distinguished. The Hamming window technique is applied for robustness to synchronization errors on the telephone line. Simulation results show that the security of the scrambled signal and the quality of the descrambled signal are improved in both subjective and objective performance tests, and that the new FFT scrambler is robust to synchronization errors.
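The coefficient rearrangement at the core of this scheme can be illustrated with a minimal sketch. It assumes a single real-valued frame of even length, uses a plain O(n^2) DFT so that it stays self-contained, and omits the pre and post filters, the Hamming window, and the pseudo spectrum insertion described in the abstract. The function names and the permutation format are hypothetical and are not taken from the paper.

```python
import cmath
import math


def dft(x):
    """Plain O(n^2) discrete Fourier transform."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]


def idft(X):
    """Plain O(n^2) inverse discrete Fourier transform."""
    n = len(X)
    return [sum(X[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)) / n
            for t in range(n)]


def permute_bins(frame, perm):
    """Rearrange the positive-frequency bins 1 .. n/2-1 of a real frame.

    perm[i] is the source bin copied into destination bin i + 1 (hypothetical
    key format). Mirrored bins are moved together so the output stays real.
    """
    n = len(frame)
    if n % 2 or sorted(perm) != list(range(1, n // 2)):
        raise ValueError("perm must be a permutation of bins 1 .. n/2-1")
    X = dft(frame)
    Y = list(X)
    for dst, src in enumerate(perm, start=1):
        Y[dst] = X[src]
        Y[n - dst] = X[n - src]  # keep conjugate symmetry
    return [v.real for v in idft(Y)]


def invert_permutation(perm):
    """Return the permutation that undoes perm (used by the descrambler)."""
    inv = [0] * len(perm)
    for dst, src in enumerate(perm, start=1):
        inv[src - 1] = dst
    return inv
```

Scrambling a frame with `permute_bins(frame, perm)` and then calling `permute_bins(scrambled, invert_permutation(perm))` should return the original frame up to rounding error, which is the property a descrambler relies on.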

Implementation of the single channel adaptive noise canceller using TMS320C30

정성윤;우세정;손창희;배건성
- Speech Sciences / Vol. 8, No. 2 / pp. 73-81 / 2001
In this paper, we focus on the real-time implementation of a single channel adaptive noise canceller (ANC) using the TMS320C30 EVM board. The implemented single channel adaptive noise canceller is based on a reference paper [1], in which it is simulated using the recursive average magnitude difference function (AMDF), to obtain a properly delayed input speech signal on a sample basis as the reference signal, and the normalized least mean square (NLMS) algorithm. To verify the results of the real-time implementation, we measured the processing time of the ANC and the enhancement ratio at various signal-to-noise ratios (SNRs). Experimental results demonstrate that the processing time for a 32 ms speech signal, with delay estimation every 10 samples, is about 26.3 ms, and that the implemented system achieves almost the same performance as reported in [1].
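The NLMS update named in the abstract can be sketched as follows. This is a generic textbook NLMS filter, not the reference paper's implementation: the recursive AMDF delay estimation is omitted, and the caller is assumed to supply the delayed input as `reference`. The parameter names and default values are assumptions chosen for illustration.

```python
def nlms(desired, reference, taps=8, mu=0.5, eps=1e-8):
    """Normalized LMS adaptive filter (generic sketch).

    desired   : samples the filter output should track
    reference : reference samples fed to the filter (for example, a delayed
                copy of the noisy input; the delay is assumed to be known)
    Returns (output, error, final_weights).
    """
    w = [0.0] * taps      # adaptive filter weights
    buf = [0.0] * taps    # most recent reference samples, newest first
    output, error = [], []
    for d, r in zip(desired, reference):
        buf = [r] + buf[:-1]
        y = sum(wi * bi for wi, bi in zip(w, buf))   # filter output
        e = d - y                                    # estimation error
        norm = eps + sum(b * b for b in buf)         # input power in the buffer
        w = [wi + mu * e * bi / norm for wi, bi in zip(w, buf)]
        output.append(y)
        error.append(e)
    return output, error, w
```

Dividing the step by the input power in the buffer is what distinguishes NLMS from plain LMS: the effective step size stays stable when the reference level changes, which matters for speech whose amplitude varies from frame to frame.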

Detecting Data which Represent Emotion Features from the Speech Signal

Park, Chang-Hyun;Sim, Kwee-Bo;Lee, Dong-Wook;Joo, Young-Hoon
- Proceedings of the Institute of Control, Robotics and Systems Conference / Institute of Control, Robotics and Systems, ICCAS 2001 / pp.138.1-138 / 2001
Usually, when we converse with another person, we can perceive their emotion as well as their ideas. Recently, some applications using speech recognition have appeared; however, they can recognize only the content of the information the speaker gives. In the future, machines that are familiar to humans will be required for a more convenient life. Therefore, we need to obtain emotion features. In this paper, we collect a variety of reference data that represent emotion features from the speech signal. Because our final target is to recognize emotion from a stream of speech, we must be able to understand the features that represent emotion. Humans can show many emotions, and the subtle differences between them make this recognition problem difficult.

Emotion Recognition and Expression Method using Bi-Modal Sensor Fusion Algorithm

주종태;장인훈;양현창;심귀보
- Journal of Institute of Control, Robotics and Systems / Vol. 13, No. 8 / pp. 754-759 / 2007
In this paper, we propose a bi-modal sensor fusion algorithm, an emotion recognition method that can classify four emotions (happy, sad, angry, surprise) by using facial images and speech signals together. We extract feature vectors from the speech signal using acoustic features, without linguistic features, and classify emotional patterns using a neural network. We also select features of the mouth, eyes, and eyebrows from the facial image, and the extracted feature vectors are reduced to low-dimensional feature vectors by principal component analysis (PCA). Finally, we propose a method that fuses the emotion recognition results obtained from the facial image and from speech.
https://doi.org/10.5302/J.ICROS.2007.13.8.754

Performance Analysis of A Variable Bit Rate Speech Coder

임병관
- The Transactions of The Korean Institute of Electrical Engineers / Vol. 62, No. 12 / pp. 1750-1754 / 2013
A variable bit rate speech coder is presented. The coder is based on the observation that, over a short time period, a speech signal can be viewed as a combination of piecewise linear signals. The encoder detects the sample points where the slope of the signal changes, which are called inflection points in this paper. The coder transmits the location and value of each detected inflection sample, but only the location information for the noninflection samples. In the decoder, the noninflection samples are estimated by interpolating the received information. Several factors affecting the performance of the coder have been tested through simulation. Simulation results show that linear interpolation yields a 1 to 5 dB improvement over cubic spline interpolation. Moreover, μ-law companding does not provide any benefit when it is applied before the inflection detection. With low threshold values in the inflection point detection, the coder shows a better MOS and more than 16 dB improvement in SNR compared to continuously variable slope delta modulation (CVSDM).
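The inflection-point idea can be illustrated with a minimal sketch, under the assumption that an inflection point is a sample where the difference between the left and right slopes exceeds a threshold. The exact detection rule, the `threshold` parameter, and the function names are hypothetical and are not taken from the paper; the decoder uses the linear interpolation that the abstract reports as the better option.

```python
def find_inflection_points(x, threshold=0.0):
    """Return indices of samples where the slope changes by more than threshold.

    The first and last samples are always kept so the decoder can interpolate
    over the whole frame.
    """
    n = len(x)
    if n <= 2:
        return list(range(n))
    keep = [0]
    for i in range(1, n - 1):
        left = x[i] - x[i - 1]      # slope before sample i
        right = x[i + 1] - x[i]     # slope after sample i
        if abs(right - left) > threshold:
            keep.append(i)
    keep.append(n - 1)
    return keep


def decode_linear(indices, values, n):
    """Rebuild n samples by linear interpolation between the kept samples."""
    out = [0.0] * n
    for i, v in zip(indices, values):
        out[i] = float(v)
    for k in range(len(indices) - 1):
        i0, i1 = indices[k], indices[k + 1]
        v0, v1 = values[k], values[k + 1]
        for i in range(i0 + 1, i1):
            out[i] = v0 + (v1 - v0) * (i - i0) / (i1 - i0)
    return out
```

A larger threshold keeps fewer samples and therefore lowers the bit rate at the cost of reconstruction error, which is the trade-off behind the variable bit rate.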
https://doi.org/10.5370/KIEE.2013.62.12.1750

Emotion Recognition based on Multiple Modalities

Kim, Dong-Ju;Lee, Hyeon-Gu;Hong, Kwang-Seok
- Journal of the Institute of Convergence Signal Processing / Vol. 12, No. 4 / pp. 228-236 / 2011
Emotion recognition plays an important role in human-computer interaction research, and it allows more natural and more human-like communication between humans and computers. Most previous work on emotion recognition has focused on extracting emotions from face, speech, or EEG information separately. This paper therefore presents a novel approach that combines face, speech, and EEG to recognize human emotion. The individual matching scores obtained from face, speech, and EEG are combined using a weighted-summation operation, and the fused score is used to classify the human emotion. In the experimental results, the proposed approach gives an improvement of more than 18.64% compared to the most successful unimodal approach, and it also performs better than approaches that integrate two modalities. These results confirm that the proposed approach achieves a significant performance improvement.
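The weighted-summation fusion described above can be sketched in a few lines. The score layout, the emotion labels, and the weights below are hypothetical illustration values; the abstract does not state how the weights are chosen or how the scores are normalized.

```python
def fuse_scores(scores, weights):
    """Fuse per-modality matching scores by weighted summation.

    scores  : {modality: {emotion: score}}, every modality scoring the same
              set of emotions (assumed to be on a comparable scale)
    weights : {modality: weight}
    Returns (fused_scores, predicted_emotion).
    """
    emotions = list(next(iter(scores.values())).keys())
    fused = {
        e: sum(weights[m] * scores[m][e] for m in scores)
        for e in emotions
    }
    label = max(fused, key=fused.get)   # emotion with the highest fused score
    return fused, label


# Hypothetical example values, not results from the paper.
example_scores = {
    "face":   {"happy": 0.6, "sad": 0.4},
    "speech": {"happy": 0.3, "sad": 0.7},
    "eeg":    {"happy": 0.5, "sad": 0.5},
}
example_weights = {"face": 0.5, "speech": 0.3, "eeg": 0.2}
fused, label = fuse_scores(example_scores, example_weights)
```

Fusing at the score level keeps each unimodal classifier independent, so a modality can be reweighted or dropped without retraining the others.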

Interactive Rehabilitation Support System for Dementia Patients

Kim, Sung-Ill
- Journal of the Institute of Convergence Signal Processing / Vol. 11, No. 3 / pp. 221-225 / 2010
This paper presents a preliminary study of an interactive rehabilitation support system for both dementia patients and their caregivers, the goal of which is to improve the quality of life (QOL) of patients suffering from dementia through virtual interaction. To achieve the virtual interaction, three kinds of recognition modules, for speech, facial images, and pen-mouse gestures, are studied. The results of both practical tests and questionnaire surveys show that the proposed system needs further improvement for real-world applications, especially in speech recognition and the user interface. The surveys also revealed that pen-mouse gesture recognition, as one possible interactive aid, has the potential to compensate for the weaknesses of speech recognition.

A Study of Speech Coding for the Transmission on Network by the Wavelet Packets

백한욱;정진현
- Proceedings of the KIEE Conference / The Korean Institute of Electrical Engineers 2000 Summer Conference Proceedings D / pp. 3028-3030 / 2000
In general, speech coding is dedicated to compression performance or speech quality. However, the speech coding in this paper focuses on flexible transmission adapted to the network speed. For this, subband coding is needed, which uses the wavelet packet concept from signal analysis. Extracting each frequency band is difficult with general signal analysis methods, and reconstructing the signal after coding each band is also a difficult problem. However, with the wavelet packet concept (perfect reconstruction) and its fast computation algorithm, the extraction of each band and the reconstruction become more natural. This paper also describes a direct solution for voice transmission over a network and implements the algorithm in a TCP/IP network environment on a PC.
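The perfect-reconstruction property of wavelet packets can be illustrated with a minimal sketch. It assumes the Haar wavelet, which the abstract does not name, and a signal whose length is divisible by 2 raised to the number of levels; the function names are hypothetical. Band coding and the network transport are omitted.

```python
import math

_S = math.sqrt(2.0)


def haar_split(x):
    """Split a signal into orthonormal Haar low-pass and high-pass halves."""
    low = [(x[i] + x[i + 1]) / _S for i in range(0, len(x), 2)]
    high = [(x[i] - x[i + 1]) / _S for i in range(0, len(x), 2)]
    return low, high


def haar_merge(low, high):
    """Invert haar_split."""
    x = []
    for a, d in zip(low, high):
        x.append((a + d) / _S)
        x.append((a - d) / _S)
    return x


def wp_decompose(x, levels):
    """Full wavelet packet tree: every band is split again at each level."""
    if len(x) % (2 ** levels):
        raise ValueError("signal length must be divisible by 2**levels")
    bands = [list(x)]
    for _ in range(levels):
        nxt = []
        for band in bands:
            low, high = haar_split(band)
            nxt.extend([low, high])
        bands = nxt
    return bands


def wp_reconstruct(bands):
    """Merge sibling bands pairwise until a single signal remains."""
    bands = [list(b) for b in bands]
    while len(bands) > 1:
        bands = [haar_merge(bands[i], bands[i + 1])
                 for i in range(0, len(bands), 2)]
    return bands[0]
```

Because each band is stored separately, a sender could in principle transmit fewer bands when the network is slow and zero-fill the missing ones before reconstruction; that policy is an assumption here, not a detail given in the abstract.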

Rao-Blackwellized Particle Filtering for Sequential Speech Enhancement

박선호;최승진
- Proceedings of the Korean Information Science Society Conference / Korean Information Science Society, Korea Computer Congress 2006 Proceedings, Vol. 33, No. 1 (B) / pp. 151-153 / 2006
We present a method of sequential speech enhancement in which we infer the clean speech signal using a Rao-Blackwellized particle filter (RBPF), given a noise-contaminated observed signal. In contrast to Kalman filtering-based methods, we consider a non-Gaussian speech generative model that is based on the generalized auto-regressive (GAR) model. Model parameters are learned by sequential Newton-Raphson expectation maximization (SNEM), incorporating the RBPF. An empirical comparison with the Kalman filter confirms the high performance of the proposed method.
