• Title/Summary/Keyword: speech characteristics (음성 특성)

Noise Reduction using Spectral Subtraction in the Discrete Wavelet Transform Domain (이산 웨이브렛 변환영역에서의 스펙트럼 차감법을 이용한 잡음제거)

  • 김현기;이상운;홍재근
    • Journal of Korea Multimedia Society / v.4 no.4 / pp.306-315 / 2001
  • For noise reduction of noisy speech in speech recognition under noisy environments, the conventional spectral subtraction method has the disadvantage that noise and speech are difficult to distinguish, so the characteristics of the noise cannot be estimated accurately. Noise reduction in the wavelet transform domain, in turn, loses signal in the high-frequency region. To compensate for these disadvantages, this paper proposes a spectral subtraction method in the continuous wavelet transform domain in which speech and non-speech intervals are distinguished by the standard deviation of the wavelet coefficients and the signal is divided into three bands at different scales. Through endpoint detection and band division, the proposed method accurately extracts the noise characteristics needed to apply spectral subtraction. Experiments show that the proposed method outperforms noise reduction by conventional spectral subtraction and by the wavelet transform in terms of signal-to-noise ratio and Itakura-Saito distance.
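
The idea in this abstract, separating non-speech frames by the standard deviation of their coefficients and then subtracting the estimated noise power, can be illustrated with a minimal sketch. This is not the paper's algorithm: the function names, the threshold `std_threshold`, and the flooring value are assumptions chosen for illustration, and the input is taken to be per-frame coefficient lists for a single band.

```python
import statistics


def band_power(coeffs):
    """Mean squared value of the coefficients in one frame."""
    return statistics.fmean(c * c for c in coeffs)


def subtract_noise(frames, std_threshold, floor=0.0):
    """Return the noise-subtracted power of each frame in one band.

    frames: list of coefficient lists (one list per frame).
    std_threshold: hypothetical, hand-picked value; frames whose standard
        deviation falls below it are treated as non-speech.
    floor: lower bound that keeps the subtracted power from going negative.
    """
    noise_powers = [
        band_power(f) for f in frames if statistics.pstdev(f) < std_threshold
    ]
    if not noise_powers:
        raise ValueError("no non-speech frames found; cannot estimate noise")
    noise_power = statistics.fmean(noise_powers)
    return [max(band_power(f) - noise_power, floor) for f in frames]
```

In a multi-band setting such as the one the abstract describes, the same function would presumably be applied to each band separately, with its own threshold.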

Acoustic Masking Effect That Can Be Occurred by Speech Contrast Enhancement in Hearing Aids (보청기에서 음성 대비 강조에 의해 발생할 수 있는 마스킹 현상)

  • Jeon, Y.Y.;Yang, D.G.;Bang, D.H.;Kil, S.K.;Lee, S.M.
    • Journal of rehabilitation welfare engineering & assistive technology / v.1 no.1 / pp.21-28 / 2007
  • In most hearing aids, amplification algorithms are used to compensate for hearing loss, noise and feedback reduction algorithms are used, and contrast enhancement algorithms are used to improve speech perception. However, an acoustic masking effect occurs between formants if the contrast is enhanced excessively. To confirm the masking effect in speech, the experiment consists of six tests: a pure tone test, a speech reception test, a word recognition test, a pure tone masking test, a formant pure tone masking test, and a speech masking test; for objective evaluation, LLR is introduced. In the results for normal-hearing and hearing-impaired subjects, more masking occurs in hearing-impaired subjects than in normal-hearing subjects when pure tones are used, and in the speech masking test, speech reception is also lower in hearing-impaired subjects than in normal-hearing subjects. This indicates that the acoustic masking effect, rather than distortion, influences speech perception. It is therefore necessary to check the characteristics of the masking effect before a hearing aid is worn and to apply these characteristics to the fitting curve.

A Study on the Frequency Scaling Methods Using LSP Parameters Distribution Characteristics (LSP 파라미터 분포특성을 이용한 주파수대역 조절법에 관한 연구)

  • 민소연;배명진
    • The Journal of the Acoustical Society of Korea / v.21 no.3 / pp.304-309 / 2002
  • We propose methods for reducing the computation of the real root method, which is mainly used in CELP (Code Excited Linear Prediction) vocoders. The real root method finds the real roots of the polynomial equations and transforms them into LSP (Line Spectrum Pairs) parameters. However, it takes a long time to compute because the root search proceeds sequentially over the frequency region. To reduce this computation time, we compare the real root method with two proposed methods. The first searches the frequency region on the mel scale, which is linear below 1 kHz and logarithmic above. The second orders the search frequency region and search interval according to the distribution of each coefficient. To compare the real root method with the proposed methods, we measured two things: first, the positions of the transformed LSP parameters in the proposed methods against those of the real root method; second, how much the computation time is reduced. The experimental results show that both methods reduced the search time by about 47% on average without changing the LSP parameters.
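
A mel-spaced search grid of the kind the first method describes can be sketched as follows. The conversion formula used here (2595 log10(1 + f/700)) is a commonly used mel approximation and an assumption on my part; the abstract does not say which mel definition the authors used. The 4 kHz upper limit and the function names are likewise illustrative.

```python
import math


def hz_to_mel(f_hz):
    """Common mel approximation: near-linear below ~1 kHz, logarithmic above."""
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)


def mel_to_hz(mel):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def mel_search_grid(n_points, f_max_hz=4000.0):
    """Frequencies spaced uniformly on the mel scale from 0 to f_max_hz.

    n_points must be at least 2. The grid is dense at low frequencies
    and sparse at high frequencies, so a root search that steps through
    it evaluates fewer points in the upper band.
    """
    if n_points < 2:
        raise ValueError("n_points must be at least 2")
    mel_max = hz_to_mel(f_max_hz)
    return [mel_to_hz(mel_max * i / (n_points - 1)) for i in range(n_points)]
```

Compared with a uniform grid of the same size, the step between neighbouring search frequencies grows with frequency, which is where the saving in search time would come from.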

An Adaptive Microphone Array with Linear Phase Response (선형 위상 특성을 갖는 적응 마이크로폰 어레이)

  • Kang, Hong-Gu;Youn, Dae-Hui;Cha, Il-Hwan
    • The Journal of the Acoustical Society of Korea / v.11 no.3 / pp.53-60 / 1992
  • Many adaptive beamforming methods have been studied for interference cancellation and speech signal enhancement in telephone conferences and auditoriums. Adaptive beamforming for speech signal processing differs from that for radar, sonar, and seismic signal processing because the desired output signal must suit the human ear. Considering that the human ear is quite insensitive to the phase of speech, Sondhi proposed a nonlinear constrained optimization technique whose constraint was on the magnitude transfer function from the source to the output. In real environments, however, the phase response of the speech signal affects the human auditory system, so it is desirable to design a linear phase system. In this paper, a linear phase beamformer is proposed, together with a sample processing algorithm for real-time operation. Simulation results show that the proposed algorithm yields more consistent beam patterns and deeper nulls in the noise direction than Sondhi's.

Exploring the Applicability of Voice-based Psychological Counseling Agent (음성 기반 심리상담 에이전트의 활용 가능성 탐색 연구)

  • Kim, Ji Geun;Yang, Hyunjung;Lee, Ji-Won
    • The Journal of the Korea Contents Association / v.21 no.7 / pp.144-156 / 2021
  • This study was conducted to explore important factors to consider when designing voice-based psychological counseling agents amid the increasing use of conversational agents in counseling and psychotherapy. Forty-eight participants selected their preferred agent voice from four types (young women and men, middle-aged women and men) and had a conversation with a psychological counseling agent. They also reported their reasons for voice selection, mood changes, perception of the agent's characteristics, and counseling outcomes. As a result, the difference in the selected voice type according to the user's gender was not statistically significant. However, the qualitative analysis showed that the 'comfort' of the voice was an important factor. Next, the user's mood improved significantly after the conversation with the agent, which confirmed the intervention effect. Finally, perceptions of the agent's expertness and attractiveness were found to contribute to the counseling outcomes. The implications of the study and suggestions for future research are discussed.

Design of a Variable half rate speech codec (가변율 half rate 음성 부호화기의 설계)

  • 성호상
    • Proceedings of the Acoustical Society of Korea Conference / 1998.06e / pp.293-296 / 1998
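  • In this paper, a variable-rate half rate speech codec is designed for various multimedia services. To distinguish voiced speech, unvoiced speech, and silence, an effective voicing decision algorithm based on frame energy and speech parameters is used. The half rate codec for voiced speech uses a generalized AbS structure, which performs well at low bit rates. The LPC coefficients are converted to LSP coefficients and quantized with a predictive 2-stage VQ; the excitation signal uses a shift-type algebraic fixed codebook structure that reduces complexity while minimizing quality degradation; and the gains of the adaptive codebook and the excitation codebook are quantized with VQ. The codec for unvoiced speech is largely the same as the one for voiced speech, but because the correlation between pitch periods is very low in unvoiced speech, it does not use pitch interpolation; instead, the pitch lag is found in open loop and used for the whole frame. The 1 kb/s codec is used for silence and background-noise intervals; the signal in these intervals is restricted to background noise with weak pitch components, and a sub-codec optimized for it is designed. The completed variable-rate half rate codec showed an average bit rate of about 2.6 kb/s on test speech with a voice activity factor (VAF) of 0.47. As part of the subjective quality evaluation, an A-B preference test was conducted against the variable-rate 8 kb/s QCELP, the IS-96 standard codec. The test showed better speech quality than the variable-rate 8 kb/s QCELP, whose average bit rate is about twice as high.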

A Study on a Robust Voice Activity Detector Under the Noise Environment in the G.723.1 Vocoder (G.723.1 보코더에서 잡음환경에 강인한 음성활동구간 검출기에 관한 연구)

  • 이희원;장경아;배명진
    • The Journal of the Acoustical Society of Korea / v.21 no.2 / pp.173-181 / 2002
  • In general, one of the serious problems in Voice Activity Detection (VAD) is detecting speech regions in noisy environments. This paper therefore proposes a new method using energy and LSP variation. In terms of processing time and speech quality, the proposed algorithm reduces processing time through accurate detection of inactive periods, and there is almost no difference in the subjective quality test. In terms of bit rate, the number of VAD=1 decisions was measured for the proposed algorithm, and the results show a marked reduction in bit rate when the SNR of the noisy speech is low (about 5 to 10 dB).
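
The abstract names the two cues (energy and LSP variation) but not how they are combined. The sketch below shows one plausible combination: a frame is marked active when either cue exceeds a threshold. The decision rule, the thresholds, and all function names are assumptions for illustration, not the paper's algorithm or anything defined by G.723.1.

```python
def frame_energy(samples):
    """Mean squared amplitude of one frame."""
    return sum(s * s for s in samples) / len(samples)


def lsp_variation(lsp_prev, lsp_cur):
    """Squared distance between the LSP vectors of consecutive frames."""
    return sum((a - b) ** 2 for a, b in zip(lsp_prev, lsp_cur))


def vad_decisions(frames, lsps, energy_thr, lsp_thr):
    """Return a list of 1 (active) or 0 (inactive), one entry per frame.

    Assumed rule: a frame is active when its energy exceeds energy_thr
    or its LSP vector moved more than lsp_thr since the previous frame.
    The first frame is compared with itself, so only energy applies to it.
    """
    decisions = []
    prev = lsps[0]
    for samples, lsp in zip(frames, lsps):
        active = (
            frame_energy(samples) > energy_thr
            or lsp_variation(prev, lsp) > lsp_thr
        )
        decisions.append(1 if active else 0)
        prev = lsp
    return decisions
```

Summing the returned list gives the count of VAD=1 frames, the quantity the abstract relates to bit rate.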

A Study on the Korean Grapheme Phonetic Value Classification (한국어 자소 음가 분류에 관한 연구)

  • Yu Seung-Duk;Kim Hack-Jin;Kim Soon-Hyop
    • Proceedings of the Acoustical Society of Korea Conference / autumn / pp.89-92 / 2001
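  • This paper classifies the phonetic values of graphemes, which form the basis of large-vocabulary Korean speech recognition systems. Korean graphemes are classified phonetically and phonologically by place and manner of articulation, and, along with the analysis of their phonetic values, auditory phonetics, which is expected to be widely discussed in Korean speech recognition, is studied. Because of its pronunciation structure and characteristics, Korean can be separated into phonemes and divided into initial (초성), medial (중성), and final (종성) graphemes. In this paper, the initials comprise 18 consonant phonemes, the medials comprise 17 vowel phonemes (monophthongs and diphthongs), and the finals comprise the consonant phonemes of the eight-final system with 'ㅅ' added. The auditory-phonetic PLUs (Phoneme Like Units) rest on the judgment that graphemes frequently confused in spelling (especially vowels) have similar phonetic values; the PLU set built on these similar phonemes adds the final 'ㅅ' to the consonants and merges the vowels (ㅔ, ㅐ) into one unit, (ㅒ, ㅖ) into one unit, and (ㅚ, ㅙ, ㅞ) into one unit. The consonant and vowel graphemes, classified by tongue position and by manner and place of articulation, were clustered as HMMs (Hidden Markov Models) using HTK, and a decision tree was searched to find their phonetic values; when this was applied experimentally to isolated-word recognition and keyword spotting systems, system performance improved.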

An Effective Transmission for Voice Traffic in UWB Mobile Ad Hoc Network (UWB 전술망에서의 효과적인 음성 데이터 전송)

  • Kim, Jong-Hwan;Koo, Myung-Hyun;Lee, Hyunseok;Shin, Jeong-Ho
    • The Journal of Korean Institute of Communications and Information Sciences / v.38B no.4 / pp.279-290 / 2013
  • In this paper, we propose a MAC protocol transmission scheme that enables secure voice communications by exploiting the wide spectrum and low signal strength of ultra-wideband (UWB) technology. It also supports a high level of terminal mobility by adopting mobile ad hoc network schemes. While most existing UWB MAC protocols operate in synchronous mode, the proposed scheme operates in asynchronous mode to support high mobility and sends voice packets without RTS/CTS control packets, so that voice traffic is transmitted efficiently and without retransmission. Through simulation, we show that the proposed scheme satisfies the required voice quality and packet delivery time.

On the Importance of Tonal Features for Speech Emotion Recognition (음성 감정인식에서의 톤 정보의 중요성 연구)

  • Lee, Jung-In;Kang, Hong-Goo
    • Journal of Broadcast Engineering / v.18 no.5 / pp.713-721 / 2013
  • This paper describes the effectiveness of chroma-based tonal features for speech emotion recognition. Just as the tonality produced by major or minor keys affects the perception of musical mood, speech tonality affects the perception of the emotional state of spoken utterances. To support this assertion about tonality and emotion, subjective listening tests were carried out using signals synthesized from chroma features; they show that tonality contributes especially to the perception of negative emotions such as anger and sadness. In automatic emotion recognition tests, the modified chroma-based tonal features produce a noticeable improvement in accuracy when they supplement the conventional log-frequency power coefficient (LFPC)-based spectral features.
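
For readers unfamiliar with chroma features, the sketch below folds spectral magnitudes into 12 pitch classes, which is the generic construction the term usually refers to. It does not reproduce the paper's modified features; the 440 Hz reference, the rounding rule, and the function names are assumptions for illustration.

```python
import math


def pitch_class(f_hz, ref_hz=440.0):
    """Pitch class 0..11 of a frequency, with 0 being the class of ref_hz.

    With ref_hz = 440 Hz, class 0 is A, class 3 is C, and so on.
    Frequencies an octave apart map to the same class.
    """
    return round(12.0 * math.log2(f_hz / ref_hz)) % 12


def chroma_vector(freqs_hz, magnitudes):
    """Sum spectral magnitudes into a 12-bin chroma vector.

    freqs_hz and magnitudes are parallel lists, for example the bin
    frequencies and magnitudes of one analysis frame. Non-positive
    frequencies (such as the DC bin) are skipped.
    """
    chroma = [0.0] * 12
    for f, m in zip(freqs_hz, magnitudes):
        if f > 0:
            chroma[pitch_class(f)] += m
    return chroma
```

Because octaves collapse onto the same bin, the vector describes which pitch classes carry energy regardless of register, which is what lets it capture tonality rather than absolute pitch.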