• Title/Summary/Keyword: Speech improvement

An aerodynamic and acoustic characteristics of Clear Speech in patients with Parkinson's disease (파킨슨 환자의 클리어 스피치 전후 음향학적 공기역학적 특성)

  • Shin, Hee Baek;Ko, Do-Heung
    • Phonetics and Speech Sciences / v.9 no.3 / pp.67-74 / 2017
  • Clear Speech has been found to be more intelligible than conversational speech. It is characterized by a decreased articulation rate and by more frequent and longer pauses. The objective of the present study was to investigate immediate improvement in speech intelligibility in 10 patients with Parkinson's disease (age range: 46 to 75 years) using Clear Speech. The experiment was performed using the Phonatory Aerodynamic System 6600 after the participants read the first sentence of the Sanchaek passage and "List for Adults 1" of the Sentence Recognition Test (SRT) in casual speech and in Clear Speech. Acoustic and aerodynamic parameters that affect speech intelligibility were measured, including mean F0, F0 range, intensity, speaking rate, mean airflow rate, and respiratory rate. In the Sanchaek passage, Clear Speech differed significantly from casual speech in mean F0, F0 range, speaking rate, and respiratory rate. In the SRT list, significant differences were seen in mean F0, F0 range, and speaking rate. These findings suggest that speech intelligibility can be affected by adjusting breathing and tone in Clear Speech. Future studies should identify the benefits of Clear Speech through auditory-perceptual studies and evaluate programs that use Clear Speech to increase intelligibility.

Probabilistic Target Speech Detection and Its Application to Multi-Input-Based Speech Enhancement (확률적 목표 음성 검출을 통한 다채널 입력 기반 음성개선)

  • Lee, Young-Jae;Kim, Su-Hwan;Han, Seung-Ho;Han, Min-Soo;Kim, Young-Il;Jeong, Sang-Bae
    • Phonetics and Speech Sciences / v.1 no.3 / pp.95-102 / 2009
  • In this paper, an efficient target speech detection algorithm is proposed to improve the performance of multi-input speech enhancement. Using the normalized cross-correlation value between two selected channels, the proposed algorithm estimates the probability distribution function of that value from the pure-noise interval. Log-likelihoods are then calculated from this function and the normalized cross-correlation value to detect the target speech interval precisely. The detection results are applied to a generalized sidelobe canceller-based algorithm. Experimental results show that the proposed algorithm significantly improves speech recognition performance and signal-to-noise ratio.
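
The detection idea in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the Gaussian noise model, the zero-lag correlation, the convention that the first `noise_frame_count` frames contain only noise, and the `threshold` value are all assumptions made for illustration.

```python
import math
from statistics import mean, pstdev


def normalized_cross_correlation(x, y):
    """Zero-lag normalized cross correlation of two equal-length frames."""
    num = sum(a * b for a, b in zip(x, y))
    den = math.sqrt(sum(a * a for a in x) * sum(b * b for b in y))
    return num / den if den > 0 else 0.0


def gaussian_log_likelihood(value, mu, sigma):
    """Log-density of `value` under a Gaussian noise model (an assumption)."""
    return (-0.5 * math.log(2 * math.pi * sigma ** 2)
            - (value - mu) ** 2 / (2 * sigma ** 2))


def detect_target_frames(ch1_frames, ch2_frames, noise_frame_count, threshold):
    """Flag frames whose correlation is unlikely under the noise-only model.

    The first `noise_frame_count` frames are assumed to contain only noise.
    `threshold` is a hypothetical tuning parameter on the log-likelihood.
    """
    ncc = [normalized_cross_correlation(a, b)
           for a, b in zip(ch1_frames, ch2_frames)]
    noise_ncc = ncc[:noise_frame_count]
    mu = mean(noise_ncc)
    sigma = max(pstdev(noise_ncc), 1e-3)  # floor avoids division by zero
    # A low likelihood under the noise model suggests target speech.
    return [gaussian_log_likelihood(v, mu, sigma) < threshold for v in ncc]
```

In this sketch, a coherent source raises the correlation between the two channels well above its noise-only range, so its log-likelihood under the noise model drops sharply. The resulting frame flags could then gate the adaptation of a sidelobe canceller, which is not shown here.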

Performance Improvement on Hearing Aids Via Environmental Noise Reduction (배경 잡음 제거를 통한 보청 시스템의 성능 향상)

  • 박선준;윤대희;김동욱;박영철
    • The Journal of the Acoustical Society of Korea / v.19 no.2 / pp.61-67 / 2000
  • Recent progress in digital and VLSI technology has opened new possibilities for noticeable advances in hearing aids. Yet environmental noise remains one of the major problems for hearing aid users. This paper describes results in which speech recognition and speech discrimination performance were measured for listeners with sensorineural hearing loss while they listened in speech-band noise. In addition, to improve the listening environment of hearing-impaired listeners, environmental noise reduction using speech enhancement techniques is investigated as a front end to conventional hearing aids. The speech enhancement techniques are implemented in a real-time system equipped with a DSP board. The clinical test results suggest that, as a preprocessing algorithm for digital hearing aids, the speech enhancement technique may work in synergy with gain functions to yield greater SNR improvement.

Performance Improvement of Connected Digit Recognition with Channel Compensation Method for Telephone speech (채널보상기법을 사용한 전화 음성 연속숫자음의 인식 성능향상)

  • Kim Min Sung;Jung Sung Yun;Son Jong Mok;Bae Keun Sung
    • MALSORI / no.44 / pp.73-82 / 2002
  • Channel distortion degrades the performance of speech recognizers in telephone environments. It mainly results from the bandwidth limitation and variation of the transmission channel. Variation in channel characteristics is usually represented as a baseline shift in the cepstrum domain, so the undesirable effect of channel variation can be removed by subtracting the mean from the cepstrum. In this paper, to improve the recognition performance for Korean connected-digit telephone speech, channel compensation methods such as CMN (Cepstral Mean Normalization), RTCN (Real Time Cepstral Normalization), MCMN (Modified CMN), and MRTCN (Modified RTCN) are applied to the static MFCC. MCMN and MRTCN are obtained from CMN and RTCN, respectively, by adding variance normalization in the cepstrum domain. Using the HTK v3.1 system, recognition experiments are performed on the Korean connected-digit telephone speech database released by SITEC (Speech Information Technology & Industry Promotion Center). The experiments show that MRTCN gives the best result, with a recognition rate of 90.11% for connected digits. This corresponds to a performance improvement of 1.72% over MFCC alone, i.e., an error reduction rate of 14.82%.
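
The mean subtraction and variance normalization described in this abstract can be sketched as follows. This is a minimal per-utterance illustration, not the paper's code: the function names are hypothetical, the real-time variants (RTCN, MRTCN) are not shown, and the error-rate helper assumes the reported 1.72% is an absolute gain in recognition rate.

```python
from statistics import mean, pstdev


def cepstral_mean_normalization(frames, normalize_variance=False):
    """Subtract the per-dimension mean over an utterance from each frame.

    `frames` is a list of cepstral vectors, one per frame. With
    `normalize_variance=True`, each dimension is also divided by its
    standard deviation, in the spirit of the MCMN variant above.
    """
    dims = list(zip(*frames))                # one tuple per cepstral dimension
    means = [mean(d) for d in dims]
    stds = [pstdev(d) or 1.0 for d in dims]  # guard against constant dimensions
    normalized = []
    for frame in frames:
        centered = [c - m for c, m in zip(frame, means)]
        if normalize_variance:
            centered = [c / s for c, s in zip(centered, stds)]
        normalized.append(centered)
    return normalized


def error_reduction_rate(baseline_accuracy, improved_accuracy):
    """Relative reduction in error rate, in percent (accuracies in percent)."""
    baseline_error = 100.0 - baseline_accuracy
    return 100.0 * (improved_accuracy - baseline_accuracy) / baseline_error
```

Because a stationary channel appears as a constant offset in the cepstrum, adding the same vector to every frame leaves the output of `cepstral_mean_normalization` unchanged. Under the stated assumption, `error_reduction_rate(90.11 - 1.72, 90.11)` gives roughly 14.8%, in line with the reported 14.82% once rounding of the published figures is allowed for.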

Performance Improvement of Speech Recognition Based on Independent Component Analysis (독립성분분석법을 이용한 음성인식기의 성능향상)

  • 김창근;한학용;허강인
    • Proceedings of the Korea Institute of Convergence Signal Processing / 2001.06a / pp.285-288 / 2001
  • In this paper, we propose a new method of speech feature extraction using ICA (Independent Component Analysis), which minimizes the dependency and correlation among speech signals in order to separate each component in the speech signal. ICA removes redundancy in the data after finding the axis direction that has the greatest variance in the input dimension. Training and recognition experiments using HMMs verified that ICA features improve speech recognition performance compared with conventional mel-cepstrum features. ICA also mitigated the decline in recognition performance caused by environmental noise.

Robust Speech Recognition using Noise Compensation Method Based on Eigen - Environment (Eigen - Environment 잡음 보상 방법을 이용한 강인한 음성인식)

  • Song Hwa Jeon;Kim Hyung Soon
    • MALSORI / no.52 / pp.145-160 / 2004
  • In this paper, a new noise compensation method based on the eigenvoice framework in feature space is proposed to reduce the mismatch between training and testing environments. The difference between clean and noisy environments is represented by a linear combination of K eigenvectors that represent the variation among environments. In the proposed method, the performance improvement of speech recognition systems depends largely on how the noisy models and the bias vector set are constructed. Two methods are proposed to construct the noisy models: one based on MAP adaptation and the other using a stereo DB. In experiments using the Aurora 2 DB, we obtained a 44.86% relative improvement with the eigen-environment method in comparison with the baseline system. In particular, in clean-condition training mode, the proposed method yielded a 66.74% relative improvement, which is better than several methods previously proposed in the Aurora project.
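
The representation of an environment bias as a linear combination of K eigenvectors can be sketched as below. The abstract does not say how the combination weights are estimated, so this sketch assumes orthonormal eigenvectors and uses a plain least-squares projection. The function and argument names are hypothetical.

```python
def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def project_bias(observed_bias, mean_bias, eigenvectors):
    """Approximate a bias vector as mean_bias + sum_k w_k * e_k.

    Assumes the K eigenvectors are orthonormal, so the least-squares
    weights are simple projections of the centered bias. Returns the
    weights and the reconstructed bias.
    """
    centered = [o - m for o, m in zip(observed_bias, mean_bias)]
    weights = [dot(centered, e) for e in eigenvectors]
    reconstructed = list(mean_bias)
    for w, e in zip(weights, eigenvectors):
        reconstructed = [r + w * x for r, x in zip(reconstructed, e)]
    return weights, reconstructed
```

With K smaller than the feature dimension, only K weights need to be estimated for a new environment. Any part of the bias that lies outside the span of the eigenvectors is discarded in the reconstruction.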

A study on the prosody generation in Korean speech synthesis using sentence structure analysis (구문분석을 이용한 한국어 음성합성의 운율생성 연구)

  • Beack Seune-Kwon;Kim Won-Cheol;Hahn Minsoo
    • Proceedings of the Acoustical Society of Korea Conference / autumn / pp.37-40 / 1999
  • In this paper, we present prosody analysis results for five selected words according to their usage in a sentence, i.e., the part of sentence (PoS), while varying the sentence type among simple, conjugate, and complex sentences. The five selected Korean words were 'U-Ri-Na-Ra', 'Bul-Kuk-Sa', 'Uh-Muh-Ni', 'Han-Ra-San', and 'Cang-A-Ji'. Each word was used as a subject, an object, and an adverb in simple, conjugate, and complex sentences. The pitch, energy, and duration of each word were then analyzed and used to improve the prosody of synthetic speech. In a subjective test, more than 50% of the listeners responded affirmatively to the prosody improvement of the synthetic speech.

Multi-resolution DenseNet based acoustic models for reverberant speech recognition (잔향 환경 음성인식을 위한 다중 해상도 DenseNet 기반 음향 모델)

  • Park, Sunchan;Jeong, Yongwon;Kim, Hyung Soon
    • Phonetics and Speech Sciences / v.10 no.1 / pp.33-38 / 2018
  • Although deep neural network-based acoustic models have greatly improved the performance of automatic speech recognition (ASR), reverberation still degrades the performance of distant speech recognition in indoor environments. In this paper, we adopt the DenseNet, which has shown great performance results in image classification tasks, to improve the performance of reverberant speech recognition. The DenseNet enables the deep convolutional neural network (CNN) to be effectively trained by concatenating feature maps in each convolutional layer. In addition, we extend the concept of multi-resolution CNN to multi-resolution DenseNet for robust speech recognition in reverberant environments. We evaluate the performance of reverberant speech recognition on the single-channel ASR task in reverberant voice enhancement and recognition benchmark (REVERB) challenge 2014. According to the experimental results, the DenseNet-based acoustic models show better performance than do the conventional CNN-based ones, and the multi-resolution DenseNet provides additional performance improvement.
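
The feature-map concatenation that this abstract attributes to DenseNet can be illustrated with a toy sketch. The `toy_layer` function is a hypothetical stand-in for a real convolutional layer, and `growth_rate` follows common DenseNet terminology rather than anything stated in the abstract. Only the connectivity pattern is meant to be representative.

```python
def toy_layer(features, growth_rate):
    """Stand-in for a convolutional layer: emits `growth_rate` new features.

    A real layer would apply learned convolutions. This one only mixes
    its inputs so that the data flow can be followed.
    """
    total = sum(features)
    return [max(0.0, total / (k + 1)) for k in range(growth_rate)]


def dense_block(inputs, num_layers, growth_rate):
    """Each layer sees the concatenation of the input and all earlier outputs."""
    features = list(inputs)
    for _ in range(num_layers):
        new_features = toy_layer(features, growth_rate)
        features = features + new_features  # concatenation, not addition
    return features
```

The feature count grows linearly, by `growth_rate` per layer, and the original inputs remain directly visible to every later layer.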

Speech treatment of velopharyngeal insufficiency using biofeedback technique with NM II; A case report (Nasometer 활용 바이오피드백 기법을 이용한 비인강폐쇄전환자의 치험 사례)

  • Yang Ji-Hyung;Choi Jin-Young
    • Korean Journal of Cleft Lip And Palate / v.8 no.1 / pp.45-52 / 2005
  • Velopharyngeal insufficiency (VPI), the failure of the velum, the lateral wall, and the posterior pharyngeal wall to separate the nasal cavity from the pharyngeal cavity during speech, can be caused by congenital conditions including cleft palate, submucous cleft palate, and congenital palatal insufficiency. The speech problems of VPI are characterized by hypernasality, nasal air emission, increased nasal airflow, and decreased intelligibility. They can be treated with surgery, a temporary prosthesis, and speech therapy. Biofeedback with the Nasometer is a speech treatment method for VPI that is commonly used as one component of a comprehensive procedure for improving speech in patients with VPI. This article describes a case of VPI treated by biofeedback with the Nasometer, which showed satisfactory results in nasalance and formant analysis after 9 months of speech therapy.

Performance Improvement of Robust Speaker Verification According to Various Standard Deviations of a Reference Distribution in Histogram Transformation (히스토그램 변환에서 기준분포의 표준편차 변경에 따른 강인한 화자인증 성능 개선)

  • Kwon, Chul-Hong
    • Phonetics and Speech Sciences / v.2 no.3 / pp.127-134 / 2010
  • Additive noise and channel mismatch strongly degrade the performance of speaker verification systems, as they distort the features of speech. In this paper, a histogram transformation technique is presented to improve the robustness of text-independent speaker verification systems. The technique transforms the features extracted from speech so that their histogram conforms to a reference distribution. The effect of different standard deviations for the reference distribution is investigated. Experimental results indicate that, in channel-mismatched environments, the proposed technique offers significant improvements over existing techniques. We also verify the performance improvement of the proposed method statistically.
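
A histogram transformation of this kind can be sketched with a rank-based mapping onto a reference distribution. The abstract does not name the reference distribution, so a Gaussian is assumed here, with `ref_std` as the standard deviation under study. The function name and the `(rank + 0.5) / n` estimate of the empirical CDF are illustrative choices, not details from the paper.

```python
from statistics import NormalDist


def histogram_transform(values, ref_std=1.0, ref_mean=0.0):
    """Map one feature dimension onto a Gaussian reference distribution.

    Each value is replaced by the reference quantile that matches its
    empirical CDF position, so the rank order is preserved while the
    histogram takes the shape of the reference.
    """
    reference = NormalDist(mu=ref_mean, sigma=ref_std)
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    transformed = [0.0] * n
    for rank, index in enumerate(order):
        cdf = (rank + 0.5) / n  # empirical CDF, kept strictly inside (0, 1)
        transformed[index] = reference.inv_cdf(cdf)
    return transformed
```

In this sketch, changing `ref_std` rescales the transformed features without changing their order, which is the parameter whose effect the abstract describes investigating.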
