• Title/Abstract/Keywords: Speech improvement

Search results: 610 items (processing time: 0.026 s)

파킨슨 환자의 클리어 스피치 전후 음향학적 공기역학적 특성 (An aerodynamic and acoustic characteristics of Clear Speech in patients with Parkinson's disease)

  • 신희백;고도홍
    • 말소리와 음성과학 / Vol. 9, No. 3 / pp.67-74 / 2017
  • Clear Speech has been found to increase speech intelligibility compared with conversational speech. Clear Speech is characterized by decreased articulation rates and by more frequent and longer pauses. The objective of the present study was to investigate the immediate improvement in speech intelligibility in 10 patients with Parkinson's disease (age range: 46 to 75 years) using Clear Speech. The experiment was performed using the Phonatory Aerodynamic System 6600 after the participants read the first sentence of a Sanchaek passage and the "List for Adults 1" in the Sentence Recognition Test (SRT) using casual speech and Clear Speech. Acoustic and aerodynamic parameters that affect speech intelligibility were measured, including mean F0, F0 range, intensity, speaking rate, mean airflow rate, and respiratory rate. In the Sanchaek passage, Clear Speech produced significant differences in mean F0, F0 range, speaking rate, and respiratory rate compared with casual speech. In the SRT list, significant differences were seen in mean F0, F0 range, and speaking rate. Based on these findings, it is claimed that speech intelligibility can be affected by adjusting breathing and tone in Clear Speech. Future studies should identify the benefits of Clear Speech through auditory-perceptual studies and evaluate programs that use Clear Speech to increase intelligibility.

확률적 목표 음성 검출을 통한 다채널 입력 기반 음성개선 (Probabilistic Target Speech Detection and Its Application to Multi-Input-Based Speech Enhancement)

  • 이영재;김수환;한승호;한민수;김영일;정상배
    • 말소리와 음성과학 / Vol. 1, No. 3 / pp.95-102 / 2009
  • In this paper, an efficient target speech detection algorithm is proposed to improve the performance of multi-input speech enhancement. Using the normalized cross-correlation value between two selected channels, the proposed algorithm estimates the probability distribution function of that value from pure-noise intervals. Log-likelihoods are then calculated from this function and the normalized cross-correlation value to detect the target speech interval precisely. The detection results are applied to a generalized sidelobe canceller-based algorithm. Experimental results show that the proposed algorithm significantly improves both speech recognition performance and signal-to-noise ratios.
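
The detection idea in this abstract can be illustrated with a minimal sketch. It assumes a Gaussian model for the noise-only cross-correlation values and a threshold expressed as `k` standard deviations; the abstract does not specify either, and all function names below are hypothetical rather than taken from the paper.

```python
import numpy as np

def normalized_cross_correlation(x, y, eps=1e-12):
    """Zero-lag normalized cross-correlation of two equal-length frames."""
    x = x - x.mean()
    y = y - y.mean()
    return float(np.dot(x, y) / (np.linalg.norm(x) * np.linalg.norm(y) + eps))

def frame_ncc(ch1, ch2, frame_len):
    """Per-frame normalized cross-correlation between two channels."""
    n_frames = min(len(ch1), len(ch2)) // frame_len
    return np.array([
        normalized_cross_correlation(ch1[i * frame_len:(i + 1) * frame_len],
                                     ch2[i * frame_len:(i + 1) * frame_len])
        for i in range(n_frames)
    ])

def fit_noise_model(noise_ncc):
    """Fit a Gaussian (an assumption of this sketch) to noise-only NCC values."""
    return float(noise_ncc.mean()), float(noise_ncc.std() + 1e-6)

def noise_log_likelihood(ncc, mu, sigma):
    """Log-likelihood of each NCC value under the fitted noise model."""
    return -0.5 * np.log(2 * np.pi * sigma ** 2) - (ncc - mu) ** 2 / (2 * sigma ** 2)

def detect_target(ncc, mu, sigma, k=5.0):
    """Flag frames whose NCC is unlikely under the noise model and above its mean.

    The threshold is the noise log-likelihood at k standard deviations from the mean.
    """
    threshold = -0.5 * np.log(2 * np.pi * sigma ** 2) - 0.5 * k ** 2
    return (noise_log_likelihood(ncc, mu, sigma) < threshold) & (ncc > mu)

# Synthetic usage: independent noise in both channels, with a common
# "target" signal added to the second half of each channel.
rng = np.random.default_rng(0)
n, frame_len = 32000, 400
ch1, ch2 = rng.normal(size=n), rng.normal(size=n)
target = 3.0 * rng.normal(size=n // 2)
ch1[n // 2:] += target
ch2[n // 2:] += target

ncc = frame_ncc(ch1, ch2, frame_len)
mu, sigma = fit_noise_model(ncc[:30])   # first 30 frames are known noise-only
detected = detect_target(ncc, mu, sigma)
print(detected.astype(int))
```

In a full system the detected intervals would control when the adaptive part of the generalized sidelobe canceller is updated; that stage is not shown here.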

배경 잡음 제거를 통한 보청 시스템의 성능 향상 (Performance Improvement on Hearing Aids Via Environmental Noise Reduction)

  • 박선준;윤대희;김동욱;박영철
    • 한국음향학회지 / Vol. 19, No. 2 / pp.61-67 / 2000
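  • Recent advances in digital signal processing and integrated circuit design offer new possibilities for hearing aid systems. Background noise, however, remains a problem reported by many hearing-impaired listeners. This paper presents the results of clinical experiments that measured the speech perception and speech discrimination abilities of listeners with sensorineural hearing loss in speech-band noise. In addition, to provide an improved listening environment, background noise was removed with a speech enhancement technique used as a preprocessing stage of the hearing aid system. The speech enhancement technique was implemented as a real-time system on a DSP board, and hearing tests were conducted with it. The clinical experiments showed that the speech enhancement technique, by removing background noise, improved the SNR of the signal and, combined with the hearing aid gain, greatly improved the speech discrimination of listeners with sensorineural hearing loss.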

채널보상기법을 사용한 전화 음성 연속숫자음의 인식 성능향상 (Performance Improvement of Connected Digit Recognition with Channel Compensation Method for Telephone speech)

  • 김민성;정성윤;손종목;배건성
    • 대한음성학회지:말소리 / No. 44 / pp.73-82 / 2002
  • Channel distortion degrades the performance of speech recognizers in telephone environments. It results mainly from the bandwidth limitation and variation of the transmission channel. Variation in channel characteristics is usually represented as a baseline shift in the cepstrum domain, so the undesirable effect of channel variation can be removed by subtracting the mean from the cepstrum. In this paper, to improve the recognition performance for Korean connected-digit telephone speech, channel compensation methods such as CMN (Cepstral Mean Normalization), RTCN (Real Time Cepstral Normalization), MCMN (Modified CMN), and MRTCN (Modified RTCN) are applied to the static MFCC. MCMN and MRTCN are obtained from CMN and RTCN, respectively, using variance normalization in the cepstrum domain. Using the HTK v3.1 system, recognition experiments are performed on the Korean connected-digit telephone speech database released by SITEC (Speech Information Technology & Industry Promotion Center). Experiments show that MRTCN gives the best result, with a recognition rate of 90.11% for connected digits. This corresponds to a performance improvement of 1.72% over MFCC alone, i.e., an error reduction rate of 14.82%.
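
As a rough illustration of the ideas in this abstract, the sketch below shows utterance-level cepstral mean subtraction, a standard mean-and-variance normalization, and the arithmetic behind the reported error reduction rate. The variance-normalized variant is a generic formulation and is not claimed to be the paper's exact MCMN or MRTCN; the function names are hypothetical.

```python
import numpy as np

def cepstral_mean_normalization(cepstra):
    """Subtract the per-coefficient mean over time.

    cepstra: array of shape (n_frames, n_coeffs), e.g. static MFCCs.
    """
    return cepstra - cepstra.mean(axis=0, keepdims=True)

def cepstral_mean_variance_normalization(cepstra, eps=1e-8):
    """Mean subtraction followed by per-coefficient variance normalization."""
    centered = cepstral_mean_normalization(cepstra)
    return centered / (centered.std(axis=0, keepdims=True) + eps)

def relative_error_reduction(baseline_acc, improved_acc):
    """Relative reduction in error rate (percent), given accuracies in percent."""
    baseline_err = 100.0 - baseline_acc
    improved_err = 100.0 - improved_acc
    return 100.0 * (baseline_err - improved_err) / baseline_err

# A fixed channel appears as a constant offset in the cepstrum domain,
# so mean subtraction removes it.
rng = np.random.default_rng(0)
clean = rng.normal(size=(200, 13))
channel_offset = rng.normal(size=(1, 13))
distorted = clean + channel_offset
print(np.allclose(cepstral_mean_normalization(distorted),
                  cepstral_mean_normalization(clean)))

# Figures from the abstract: 90.11% accuracy, 1.72 points above the baseline.
print(round(relative_error_reduction(90.11 - 1.72, 90.11), 2))
```

The last line reproduces the abstract's error reduction figure up to rounding: the baseline error is 11.61% and the improved error is 9.89%, so the relative reduction is 1.72 / 11.61, or roughly 14.8%.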

독립성분분석법을 이용한 음성인식기의 성능향상 (Performance Improvement of Speech Recognition Based on Independent Component Analysis)

  • 김창근;한학용;허강인
    • 융합신호처리학회 학술대회논문집 / 한국신호처리시스템학회 2001 Summer Conference Proceedings (KISPS SUMMER CONFERENCE 2001) / pp.285-288 / 2001
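  • This paper proposes a method for extracting speech feature vectors using independent component analysis (ICA), which separates signals so that the dependence and correlation among them are minimized. The method finds principal axes along the directions of greatest variation in the input speech and uses that information to remove redundancy in the data before extracting the feature vectors. Feature vectors were extracted with ICA from the training speech of the recognizer, and HMM-based training and recognition experiments were carried out in comparison with the mel-cepstrum, a conventional speech feature vector. The proposed method improved speech recognition performance. The results also showed that the method can cope flexibly with the degradation in recognition performance caused by noise in the surrounding environment at recognition time.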

Eigen - Environment 잡음 보상 방법을 이용한 강인한 음성인식 (Robust Speech Recognition using Noise Compensation Method Based on Eigen - Environment)

  • 송화전;김형순
    • 대한음성학회지:말소리 / No. 52 / pp.145-160 / 2004
  • In this paper, a new noise compensation method based on the eigenvoice framework in feature space is proposed to reduce the mismatch between training and testing environments. The difference between clean and noisy environments is represented by a linear combination of K eigenvectors that represent the variation among environments. In the proposed method, the performance improvement of speech recognition systems depends largely on how the noisy models and the bias vector set are constructed. Two methods are proposed to construct the noisy models: one based on MAP adaptation and the other using a stereo DB. In experiments using the Aurora 2 DB, we obtained a 44.86% relative improvement with the eigen-environment method over the baseline system. In particular, in the clean-condition training mode, the proposed method yielded a 66.74% relative improvement, which is better than several methods previously proposed in the Aurora project.
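
The linear-combination idea can be sketched as follows. This is a simplified illustration, not the paper's procedure: it assumes a set of per-environment bias vectors is already available, derives the eigenvectors with an SVD, and obtains the weights by orthogonal projection of a directly observed bias. The abstract does not state how the weights are estimated, and all names are hypothetical.

```python
import numpy as np

def build_eigen_environments(bias_vectors, k):
    """Return the mean bias and the top-k eigenvectors of the bias set.

    bias_vectors: array of shape (n_environments, dim), one bias vector
    (noisy minus clean) per training environment.
    """
    mean_bias = bias_vectors.mean(axis=0)
    centered = bias_vectors - mean_bias
    # Rows of vt are orthonormal directions sorted by decreasing singular value.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return mean_bias, vt[:k]

def project_bias(observed_bias, mean_bias, eigvecs):
    """Express a bias vector as mean_bias + sum_k w_k * e_k (projection weights)."""
    weights = eigvecs @ (observed_bias - mean_bias)
    reconstructed = mean_bias + weights @ eigvecs
    return weights, reconstructed

# Synthetic usage: environments that vary along two hidden directions.
rng = np.random.default_rng(0)
dim, n_env = 6, 10
hidden_basis = rng.normal(size=(2, dim))
offset = rng.normal(size=dim)
training_biases = offset + rng.normal(size=(n_env, 2)) @ hidden_basis

mean_bias, eigvecs = build_eigen_environments(training_biases, k=2)
new_bias = offset + rng.normal(size=2) @ hidden_basis
weights, approx = project_bias(new_bias, mean_bias, eigvecs)
print(weights, np.allclose(approx, new_bias))
```

Because an unseen environment is described by only K weights, it can be characterized from little adaptation data, which is the usual motivation for eigenvoice-style methods.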

구문분석을 이용한 한국어 음성합성의 운율생성 연구 (A study on the prosody generation in Korean speech synthesis using sentence structure analysis)

  • 백승권;김원철;한민수
    • 한국음향학회:학술대회논문집 / 한국음향학회 1999 Conference Proceedings, Vol. 18, No. 2 / pp.37-40 / 1999
  • In this paper, we present prosody analysis results for five selected words according to their usage in a sentence, i.e., the part of sentence (PoS), while changing the sentence type among simple, conjugate, and complex sentences. The five selected Korean words were 'U-Ri-Na-Ra', 'Bul-Kuk-Sa', 'Uh-Muh-Ni', 'Han-Ra-San', and 'Cang-A-Ji'. These five words were used as a subject, an object, and an adverb in each simple, conjugate, and complex sentence. The pitch, energy, and duration of each word were then analyzed and used to improve the prosody of synthetic speech. A subjective test showed that more than 50% of the listeners responded affirmatively to the prosody improvement of the synthetic speech.

잔향 환경 음성인식을 위한 다중 해상도 DenseNet 기반 음향 모델 (Multi-resolution DenseNet based acoustic models for reverberant speech recognition)

  • 박순찬;정용원;김형순
    • 말소리와 음성과학 / Vol. 10, No. 1 / pp.33-38 / 2018
  • Although deep neural network-based acoustic models have greatly improved the performance of automatic speech recognition (ASR), reverberation still degrades the performance of distant speech recognition in indoor environments. In this paper, we adopt the DenseNet, which has shown strong results in image classification tasks, to improve the performance of reverberant speech recognition. The DenseNet enables a deep convolutional neural network (CNN) to be trained effectively by concatenating feature maps in each convolutional layer. In addition, we extend the concept of the multi-resolution CNN to a multi-resolution DenseNet for robust speech recognition in reverberant environments. We evaluate the performance of reverberant speech recognition on the single-channel ASR task of the REverberant Voice Enhancement and Recognition Benchmark (REVERB) Challenge 2014. According to the experimental results, the DenseNet-based acoustic models perform better than the conventional CNN-based ones, and the multi-resolution DenseNet provides an additional performance improvement.
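
The feature-map concatenation mentioned in this abstract can be illustrated with a small sketch of dense connectivity. To stay dependency-light, it uses fully connected layers with ReLU in place of convolutions, which is a simplification of this sketch and not the paper's architecture; the function name and shapes are hypothetical.

```python
import numpy as np

def dense_block(x, weights):
    """Apply a dense block: every layer sees all earlier feature maps.

    x: input features of shape (n_frames, c0).
    weights: list of matrices; the i-th has shape (c0 + i * growth, growth),
    because its input is the concatenation of x and the i earlier outputs.
    """
    features = [x]
    for w in weights:
        layer_input = np.concatenate(features, axis=1)
        layer_output = np.maximum(layer_input @ w, 0.0)  # ReLU
        features.append(layer_output)
    return np.concatenate(features, axis=1)

# Usage: 40 input channels, 3 layers, each adding 12 new channels.
rng = np.random.default_rng(0)
c0, growth, n_layers = 40, 12, 3
weights = [rng.normal(scale=0.1, size=(c0 + i * growth, growth))
           for i in range(n_layers)]
x = rng.normal(size=(100, c0))
y = dense_block(x, weights)
print(y.shape)
```

The output keeps the original input alongside every layer's output, so later layers can reuse earlier features directly, which is the property the abstract credits for effective training of deep networks.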

Nasometer 활용 바이오피드백 기법을 이용한 비인강폐쇄부전환자의 치험 사례 (Speech treatment of velopharyngeal insufficiency using biofeedback technique with NM II; A case report)

  • 양지형;최진영
    • 대한구순구개열학회지 / Vol. 8, No. 1 / pp.45-52 / 2005
  • Velopharyngeal insufficiency (VPI), the failure of the velum, the lateral walls, and the posterior pharyngeal wall to separate the nasal cavity from the pharyngeal cavity during speech, can be caused by congenital conditions including cleft palate, submucous cleft palate, and congenital palatal insufficiency. The speech problems of VPI are characterized by hypernasality, nasal air emission, increased nasal airflow, and decreased intelligibility. These problems can be treated with surgery, a temporary prosthesis, and speech therapy. The biofeedback technique with a Nasometer is a speech treatment method for VPI that is commonly used as one component of a comprehensive procedure for improving speech in patients with VPI. This article describes a case of VPI treated with the Nasometer biofeedback technique, which showed satisfactory results in nasalance and formant analysis after 9 months of speech therapy.

히스토그램 변환에서 기준분포의 표준편차 변경에 따른 강인한 화자인증 성능 개선 (Performance Improvement of Robust Speaker Verification According to Various Standard Deviations of a Reference Distribution in Histogram Transformation)

  • 권철홍
    • 말소리와 음성과학 / Vol. 2, No. 3 / pp.127-134 / 2010
  • Additive noise and channel mismatch strongly degrade the performance of speaker verification systems because they distort the features of speech. In this paper, a histogram transformation technique is presented to improve the robustness of text-independent speaker verification systems. The technique transforms the features extracted from speech so that their histogram conforms to a reference distribution. The effect of different standard deviations for the reference distribution is investigated. Experimental results indicate that, in channel-mismatched environments, the proposed technique offers significant improvements over existing techniques. We also verify the performance improvement of the proposed method using statistics.
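
A histogram transformation of this kind can be sketched as rank-based matching to a reference distribution. The sketch assumes a Gaussian reference whose standard deviation is the tunable parameter and operates on one feature dimension at a time; the abstract does not give these details, and the function name is hypothetical.

```python
import numpy as np
from statistics import NormalDist

def histogram_transform(feature, ref_mean=0.0, ref_std=1.0):
    """Map a 1-D feature track so its histogram follows N(ref_mean, ref_std^2).

    Each value is replaced by the reference quantile at its empirical CDF
    position, so the ordering of the values is preserved.
    """
    n = len(feature)
    ranks = np.argsort(np.argsort(feature))   # rank of each value, 0 .. n-1
    cdf = (ranks + 0.5) / n                   # strictly inside (0, 1)
    reference = NormalDist(mu=ref_mean, sigma=ref_std)
    return np.array([reference.inv_cdf(p) for p in cdf])

# Usage: a skewed feature is mapped to references with different spreads.
rng = np.random.default_rng(0)
skewed = rng.exponential(size=1000)
for ref_std in (0.5, 1.0, 2.0):
    transformed = histogram_transform(skewed, ref_std=ref_std)
    print(ref_std, round(float(transformed.std()), 2))
```

In practice the transformation would be applied independently to each cepstral coefficient over an utterance, with the reference standard deviation chosen experimentally.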
