• Title/Abstract/Keywords: speech management

257 search results (processing time: 0.032 s)

보툴리눔독소를 이용한 후두전적출술후 식도발성장애 및 식도이완불능증의 치료 (Botulinum Toxin Injection for Postlaryngectomy esophageal speech failure and Achalasia)

  • 최홍식;문형진;한재욱;서진원;김광문
    • 대한기관식도과학회지 / Vol. 3, No. 2 / pp.302-306 / 1997
  • Persistent pharyngoesophageal spasm has been shown to be responsible for poor speech rehabilitation after laryngectomy. Management of these patients has included bougienage and pharyngeal neurectomy. Achalasia is a swallowing disorder in which the lower esophageal sphincter fails to relax. Botulinum toxin injection into the upper or lower esophageal sphincter has been used successfully, both diagnostically and therapeutically, for esophageal speech failure and achalasia. We therefore report the use of botulinum toxin, a paralytic agent, for the treatment of these conditions.


심리 음향 켑스트럼 평균 차감법을 이용한 이동 전화망에서의 음질 평가 (Speech Quality Measure in a Mobile Communication System Using PLP Cepstral Distance with CMS)

  • 윤종진;박상욱;박영철;윤대희;차일환
    • 음성과학 / Vol. 6 / pp.163-179 / 1999
  • Setting up, managing, and repairing a mobile communication system requires continuous estimation of speech quality. Speech quality can be measured from listeners' judgments in a subjective test such as the MOS (Mean Opinion Score) test. However, because this method is laborious, expensive, and time-consuming, it is advisable to predict subjective speech quality with objective measures. This paper presents a robust objective speech quality measure, PLP-CMS (Perceptual Linear Predictive-Cepstral Mean Subtraction), which can predict subjective speech quality in mobile communication systems. PLP-CMS correlates highly with subjective quality owing to PLP (Perceptual Linear Predictive) analysis, and CMS (Cepstral Mean Subtraction) makes it robust to PSTN (Public Switched Telephone Network) channel effects. To evaluate the proposed algorithm, we carried out subjective and objective quality estimation on speech samples distorted in various ways in a real mobile communication system. The results show that PLP-CMS has a higher correlation with subjective quality than PSQM (Perceptual Speech Quality Measure) and PLP-CD (Perceptual Linear Predictive-Cepstral Distance).

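The CMS step named in the abstract above can be illustrated with a minimal sketch. It assumes the standard formulation of cepstral mean subtraction (subtract each coefficient's mean over all frames) and uses made-up toy cepstra; the function names and values are illustrative and are not taken from the paper, which applies CMS to PLP cepstra. A stationary linear channel adds a constant vector in the cepstral domain, so the sketch checks that this offset disappears after subtraction.

```python
from typing import List


def cepstral_mean_subtraction(frames: List[List[float]]) -> List[List[float]]:
    """Subtract the per-coefficient mean over all frames from every frame."""
    if not frames:
        return []
    n_frames = len(frames)
    n_coeffs = len(frames[0])
    means = [sum(f[k] for f in frames) / n_frames for k in range(n_coeffs)]
    return [[f[k] - means[k] for k in range(n_coeffs)] for f in frames]


def add_channel_offset(frames: List[List[float]], offset: List[float]) -> List[List[float]]:
    """Model a fixed channel as a constant additive vector in the cepstral domain."""
    return [[c + o for c, o in zip(f, offset)] for f in frames]


# Toy cepstral frames (hypothetical values, two coefficients per frame).
clean = [[1.0, 0.5], [2.0, -0.5], [3.0, 0.0]]
channel = [0.7, -0.2]  # hypothetical channel offset
distorted = add_channel_offset(clean, channel)

cms_clean = cepstral_mean_subtraction(clean)
cms_distorted = cepstral_mean_subtraction(distorted)
# After CMS the two sequences agree up to floating-point rounding.
```

Any cepstral distance computed on `cms_clean` and `cms_distorted` therefore ignores the constant channel term, which is the robustness property the abstract attributes to CMS.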

구개열환자의 언어관리 및 평가 (The Management and Evaluation of Speech in Cleft Palate Patients)

  • 신효근;김현기
    • 대한음성학회: Conference Proceedings / 대한음성학회 February 1996 Conference Proceedings / pp.23-40 / 1996
  • Communicative disorders in cleft palate patients are related to acoustic and physiological phenomena. In particular, hypernasality is a parameter of cleft palate speech that has been studied by many clinicians and speech pathologists. The degree of hypernasality has been assessed by listeners' judgment, but perceptual assessments have poor scientific reliability, so objective instruments have been needed to test hypernasality with diagnostic accuracy. This study analyzed nasalance scores of cleft palate patients using a Nasometer. The simple vowels /a/, /i/, /e/ and the approximants /j/, /w/ were tested for the degree of hypernasality after operation. Phrases of long and short duration were used to assess hypernasality. Fiberoptic views show the open velopharyngeal port that resulted in hypernasality in cleft palate patients. The authors stress the importance of the management of cleft palate patients.

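For readers unfamiliar with the nasalance score mentioned in the abstract above, the following is a minimal sketch of the commonly used definition: nasal acoustic energy as a percentage of nasal plus oral energy. The function name and the input values are hypothetical, and the abstract does not state how the study's instrument computed its scores.

```python
def nasalance_percent(nasal_energy: float, oral_energy: float) -> float:
    """Nasalance as nasal energy divided by total (nasal + oral) energy, in percent."""
    total = nasal_energy + oral_energy
    if total <= 0:
        raise ValueError("total acoustic energy must be positive")
    return 100.0 * nasal_energy / total


# Hypothetical energies from the nasal and oral microphones.
score = nasalance_percent(nasal_energy=3.0, oral_energy=7.0)
```

A higher score means a larger share of the acoustic energy is radiated through the nose.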

포르만트 위치비교를 이용한 구개열 환자의 발음분석 (Sound Analysis of Cleft Palate Patients Using Formant Position)

  • 김덕원;송철규
    • 대한의용생체공학회:의공학회지 / Vol. 11, No. 2 / pp.283-288 / 1990
  • As one of the main purposes of the physical management of cleft palate is to provide the anatomic and physiologic requisites for speech, speech must be one of the criteria for determining when physical management has been achieved. However, there are no objective methods for evaluating the speech of cleft palate patients. The authors analyzed the speech of adult cleft palate patients using sound spectrography and compared it with that of normal adults. The results were as follows: 1. In vowels, cleft palate patients of both sexes showed a reduction in the frequency of the first and second formants compared with normal adults. The difference was minimal in front vowels (i, e, ae). 2. In consonants, cleft palate patients of both sexes showed a reduction in the frequency of the first formant, but a reduction in the frequency of the second formant was observed only in female patients. 3. There was no statistical difference in the sound spectrograph among plosive, fricative, affricate, nasal, and glide consonants.


디지털 방송을 위한 패치워크 기반 음성 워터마크 (Speech Watermark Based on Patchwork for Digital Broadcasting)

  • 여인권;김형중;최용희;김기섭
    • 방송공학회논문지 / Vol. 5, No. 2 / pp.220-226 / 2000
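  • This paper presents a method for embedding a watermark in broadcast speech. Digital broadcasting does not deliberately distinguish between audio and speech. In educational broadcasting, however, speech is far more important than video or audio and accounts for a large share of the content. One of the key issues in digital broadcasting is protection against illegal copying. This paper describes the performance and limitations of an audio watermark adapted for speech and presents results on its robustness to attacks. It also identifies open problems to be solved in speech watermarking research.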

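The patchwork idea in the title above can be sketched in its generic textbook form: a secret key selects two disjoint sets of samples, one set is raised by a small amount and the other lowered, and the detector compares the two set means. This is an illustration under stated assumptions, not the paper's speech-specific variant. Embedding directly on raw samples, the function names, the key, the patch size, and `delta` are all hypothetical choices.

```python
import random
from typing import List, Tuple


def pick_patches(n_samples: int, patch_size: int, key: int) -> Tuple[List[int], List[int]]:
    """Derive two disjoint index sets (patches A and B) from a secret key."""
    rng = random.Random(key)
    idx = rng.sample(range(n_samples), 2 * patch_size)
    return idx[:patch_size], idx[patch_size:]


def embed(signal: List[float], key: int, patch_size: int, delta: float) -> List[float]:
    """Raise patch A by delta and lower patch B by delta."""
    a, b = pick_patches(len(signal), patch_size, key)
    marked = list(signal)
    for i in a:
        marked[i] += delta
    for i in b:
        marked[i] -= delta
    return marked


def detect_statistic(signal: List[float], key: int, patch_size: int) -> float:
    """Difference between the mean of patch A and the mean of patch B."""
    a, b = pick_patches(len(signal), patch_size, key)
    mean_a = sum(signal[i] for i in a) / patch_size
    mean_b = sum(signal[i] for i in b) / patch_size
    return mean_a - mean_b


# Hypothetical host signal: Gaussian noise standing in for speech samples.
host_rng = random.Random(0)
host = [host_rng.gauss(0.0, 1.0) for _ in range(4000)]

marked = embed(host, key=1234, patch_size=1000, delta=0.05)
stat_host = detect_statistic(host, key=1234, patch_size=1000)
stat_marked = detect_statistic(marked, key=1234, patch_size=1000)
# Embedding shifts the statistic by 2 * delta.
```

A detector would compare the statistic against a threshold between 0 and `2 * delta`. Larger patches reduce the variance of the unmarked statistic, while attacks that alter the samples shrink the shift.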

얼굴영상과 음성을 이용한 멀티모달 감정인식 (Multimodal Emotion Recognition using Face Image and Speech)

  • 이현구;김동주
    • 디지털산업정보학회논문지 / Vol. 8, No. 1 / pp.29-40 / 2012
  • A challenging research issue of growing importance to those working in human-computer interaction is to endow a machine with emotional intelligence. Emotion recognition technology thus plays an important role in human-computer interaction research, allowing more natural, more human-like communication between human and computer. In this paper, we propose a multimodal emotion recognition system that uses face and speech to improve recognition performance. In face-based emotion recognition, the distance measure is calculated by 2D-PCA of the MCS-LBP image and a nearest neighbor classifier; in speech-based emotion recognition, the likelihood measure is obtained by a Gaussian mixture model algorithm based on pitch and mel-frequency cepstral coefficient features. The individual matching scores obtained from face and speech are combined using a weighted-summation operation, and the fused score is used to classify the human emotion. In experiments, the proposed method improved recognition accuracy by about 11.25% to 19.75% compared with uni-modal approaches. These results confirm that the proposed approach achieves a significant performance improvement and is very effective.
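
The weighted-summation fusion described in the abstract above can be sketched as follows. The abstract does not say how the face distance and the speech likelihood are brought onto a common scale, so the min-max normalization, the conversion of distance to similarity, the weight value, and all scores below are assumptions made purely for illustration.

```python
from typing import Dict, Tuple


def min_max(scores: Dict[str, float]) -> Dict[str, float]:
    """Rescale scores to [0, 1]; identical scores all map to 0.5."""
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        return {k: 0.5 for k in scores}
    return {k: (v - lo) / (hi - lo) for k, v in scores.items()}


def fuse(face_distance: Dict[str, float],
         speech_loglik: Dict[str, float],
         face_weight: float = 0.5) -> Tuple[str, Dict[str, float]]:
    """Weighted sum of normalized face similarity and speech likelihood per class."""
    # A smaller distance means a better match, so invert it after normalizing.
    face_sim = {k: 1.0 - v for k, v in min_max(face_distance).items()}
    speech_sim = min_max(speech_loglik)
    fused = {k: face_weight * face_sim[k] + (1.0 - face_weight) * speech_sim[k]
             for k in face_sim}
    return max(fused, key=fused.get), fused


# Hypothetical per-emotion scores from the two single-modality recognizers.
face_distance = {"happy": 0.8, "sad": 0.3, "angry": 0.5}
speech_loglik = {"happy": -120.0, "sad": -95.0, "angry": -90.0}

label, fused = fuse(face_distance, speech_loglik, face_weight=0.6)
```

Here the speech model alone would pick "angry", but the face score pulls the fused decision to "sad", which shows how the weight trades off the two modalities.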

가우시안 분포에서 Maximum Log Likelihood를 이용한 벡터 양자화 기반 음성 인식 성능 향상 (Vector Quantization based Speech Recognition Performance Improvement using Maximum Log Likelihood in Gaussian Distribution)

  • 정경용;오상엽
    • 디지털융복합연구 / Vol. 16, No. 11 / pp.335-340 / 2018
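  • Commercial speech recognition systems with accurate recognition rates use models trained on speaker-dependent isolated data. In noisy environments, however, speech recognition performance degrades depending on the amount of data. This paper proposes improving the performance of vector quantization based speech recognition using Maximum Log Likelihood in a Gaussian distribution. The proposed method builds an optimal training model that raises recognition accuracy for similar-sounding speech by applying vector quantization and a Maximum Log Likelihood speech feature extraction method to the features of the speech. To this end, speech features are extracted with an HMM-based method. Because the proposed method can improve the accuracy of the inaccurate speech models generated and used in existing systems, it can build models that are robust for speech recognition. The proposed method shows improved recognition accuracy in speech recognition systems.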

분산 음성인식 시스템의 성능향상을 위한 음소 빈도 비율에 기반한 VQ 코드북 설계 (A VQ Codebook Design Based on Phonetic Distribution for Distributed Speech Recognition)

  • 오유리;윤재삼;이길호;김홍국;류창선;구명완
    • 대한음성학회: Conference Proceedings / 대한음성학회 2006 Spring Conference Proceedings / pp.37-40 / 2006
  • In this paper, we propose a VQ codebook design for speech recognition feature parameters in order to improve the performance of a distributed speech recognition system. For context-dependent HMMs, a VQ codebook should be correlated with the phonetic distributions in the HMM training data. For an efficient VQ codebook design, we therefore focus on a method that selects training data based on phonetic distribution instead of using all the training data. In speech recognition experiments using the Aurora 4 database, the distributed speech recognition system employing a VQ codebook designed by the proposed method reduced the word error rate (WER) by 10% compared with one using a VQ codebook trained with the whole training data.

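The word error rate (WER) reported in the abstract above is conventionally the word-level edit distance between the reference and the recognized text, divided by the number of reference words. The sketch below implements that standard definition. The function name and the example sentences are illustrative, and the abstract does not specify whether its 10% figure is a relative or an absolute reduction.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """(substitutions + deletions + insertions) / number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] holds the edit distance between the processed reference prefix and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr[j] = min(prev[j] + 1,         # deletion of a reference word
                          curr[j - 1] + 1,     # insertion of a hypothesis word
                          prev[j - 1] + cost)  # match or substitution
        prev = curr
    return prev[-1] / len(ref)


# Hypothetical example: one deletion ("the") and one substitution ("now" -> "no").
wer = word_error_rate("turn the radio on now", "turn radio on no")
```

Because insertions count as errors, WER can exceed 1.0 when the hypothesis is much longer than the reference.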

자동차 환경내의 음성인식 자동 평가 플랫폼 연구 (A Study of Automatic Evaluation Platform for Speech Recognition Engine in the Vehicle Environment)

  • 이성재;강선미
    • 한국통신학회논문지 / Vol. 37, No. 7C / pp.538-543 / 2012
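  • The performance of the speech recognizer is the most important part of an in-vehicle voice interface while driving. This paper concerns the development of a platform for automating the performance evaluation of in-vehicle speech recognizers. The platform consists of a main program, a relay program, database management, and a statistics computation module. For the evaluation, a simulation environment reflecting real driving conditions was built, and experiments were run by feeding pre-recorded driving noise and speakers' voices through a microphone. The experiments demonstrated the validity of the speech recognition results obtained from the proposed platform. With the proposed platform, users can evaluate in-vehicle speech recognizers effectively by automating speech recognition and by efficiently managing recognition results and computing statistics.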

음성대화시스템 워크벤취로서의 DialogStudio 개발 (DialogStudio: A Spoken Dialog System Workbench)

  • 정상근;이청재;이근배
    • 대한음성학회지:말소리 / No. 63 / pp.101-112 / 2007
  • Spoken dialog system development involves many laborious and inefficient tasks. Because a spoken dialog system has many components, such as speech recognition, language understanding, dialog management, and knowledge management, a developer must edit the corpus and train each model separately. To reduce the cost of editing the corpus and training each model, a more systematic and efficient working environment is needed. As such an environment, we propose DialogStudio, a spoken dialog system workbench.
