• Title/Abstract/Keywords: Parts of Speech

Search results: 135 items (processing time: 0.023 seconds)

Design of a Variable Rate Speech Codec for the W-CDMA System

  • 정우성
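    • Acoustical Society of Korea: Conference Proceedings
    • /
    • Acoustical Society of Korea, 1998, 15th Workshop on Speech Communication and Signal Processing (KSCSP 98, Vol. 15, No. 1)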
    • /
    • pp.142-147
    • /
    • 1998
  • Recently, the 8 kb/s CS-ACELP coder of G.729 was standardized by ITU-T SG15, and it has been reported that the speech quality of G.729 is better than or equal to that of 32 kb/s ADPCM. However, G.729 is a fixed rate speech coder, and it does not consider the property of voice activity in two-way conversation. If we exploit voice activity, we can reduce the average bit rate by half without any degradation of the speech quality. In this paper, we propose an efficient variable rate algorithm for G.729. The variable rate algorithm consists of two main parts: the rate determination algorithm and the design of the sub rate coders. For the rate determination algorithm, we combine the energy-thresholding method, the phonetic segmentation method that integrates various feature parameters obtained through the analysis procedure, and the variable hangover period method. Through the analysis of noise features, a 1 kb/s sub rate coder is designed for coding the background noise signal. In addition, we design a 4 kb/s sub rate coder for the unvoiced parts. The performance of the variable rate algorithm is evaluated by comparing its speech quality and average bit rate with those of G.729. Subjective quality is also assessed with a MOS test. In conclusion, it is verified that the proposed variable rate CS-ACELP coder produces the same speech quality as G.729 at an average bit rate of 4.4 kb/s.

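The relationship between the sub rate coders and the average bit rate in the abstract above can be illustrated with a minimal sketch. The 8, 4, and 1 kb/s rates come from the abstract; the assumption that all remaining speech frames use the full 8 kb/s rate, the class names, and the frame mix in the example are hypothetical and are not taken from the paper.

```python
# Bit rates in kb/s. 4 and 1 kb/s are the sub rate coders named in the
# abstract; using the full 8 kb/s G.729 rate for all other speech frames
# is an assumption made for this sketch.
RATES_KBPS = {"voiced": 8.0, "unvoiced": 4.0, "noise": 1.0}


def average_bit_rate(frame_fractions):
    """Return the average bit rate (kb/s) for per-class frame fractions.

    frame_fractions maps a class name in RATES_KBPS to the fraction of
    frames coded at that class's rate; the fractions must sum to 1.
    """
    total = sum(frame_fractions.values())
    if abs(total - 1.0) > 1e-9:
        raise ValueError("frame fractions must sum to 1")
    return sum(RATES_KBPS[name] * frac for name, frac in frame_fractions.items())


if __name__ == "__main__":
    # Hypothetical conversation: half of the frames are background noise.
    example_mix = {"voiced": 0.40, "unvoiced": 0.10, "noise": 0.50}
    print(f"{average_bit_rate(example_mix):.2f} kb/s")
```

With a mix like this, the average falls to roughly half of the fixed 8 kb/s rate, which is the effect the abstract attributes to exploiting voice activity.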

Perceptions on Evaluation and Treatment of Swallowing Disorders in Speech-Language Pathologists

  • 윤지혜;이현정
    • Phonetics and Speech Sciences
    • /
    • Vol. 5, No. 4
    • /
    • pp.43-51
    • /
    • 2013
  • The purpose of this study is to survey Speech-Language Pathologists' (SLPs') perceptions of the evaluation and treatment of swallowing disorders. An online questionnaire was sent to 279 subjects: students attending undergraduate/graduate programs in speech therapy departments and/or SLPs who work in various settings. The survey consisted of three parts: 1) background information and educational/clinical experiences associated with dysphagia (swallowing disorder), 2) the current state of diagnosis and treatment of dysphagia in clinical practice (certified SLPs only), and 3) the recognition of diagnosis, treatment, and education for dysphagia. Participants rated each item on a five-point Likert scale (1 being "not at all" and 5 being "extremely") or gave self-reported answers. The results showed that SLPs have a high interest in swallowing disorders, but most of them regarded these disorders as very difficult to diagnose and treat because they have not been trained as swallowing specialists. Therefore, it is necessary to provide more opportunities for education and practice to establish the expertise of SLPs.

Single-Channel Non-Causal Speech Enhancement to Suppress Reverberation and Background Noise

  • Song, Myung-Suk;Kang, Hong-Goo
    • The Journal of the Acoustical Society of Korea
    • /
    • Vol. 31, No. 8
    • /
    • pp.487-506
    • /
    • 2012
  • This paper proposes a speech enhancement algorithm that improves speech intelligibility by suppressing both reverberation and background noise. The algorithm adopts a non-causal single-channel minimum variance distortionless response (MVDR) filter to exploit additional information contained in the noisy-reverberant signals of subsequent frames. The noisy-reverberant signals are decomposed into the desired signal and the interference that is uncorrelated with the desired signal. Then, the filter equation is derived from the MVDR criterion to minimize the residual interference without introducing speech distortion. The estimation of the correlation parameter, which plays an important role in determining the overall performance of the system, is mathematically derived from the general statistical reverberation model. Furthermore, practical implementation methods are developed for estimating the sub-parameters required to estimate the correlation parameter. The efficiency of the proposed enhancement algorithm is verified by performance evaluation. The results show that the proposed algorithm achieves significant performance improvement in all studied conditions and is especially superior in severely noisy and strongly reverberant environments.
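For orientation, the generic MVDR criterion mentioned in the abstract above can be written as follows. This is the textbook form, not the paper's specific non-causal single-channel derivation; the symbols $\mathbf{h}$ (filter), $\mathbf{R}_{\mathrm{in}}$ (interference covariance matrix), and $\boldsymbol{\gamma}$ (correlation vector of the desired signal) are assumed notation.

$$
\mathbf{h}_{\mathrm{MVDR}}
= \arg\min_{\mathbf{h}} \; \mathbf{h}^{H}\mathbf{R}_{\mathrm{in}}\mathbf{h}
\quad \text{subject to} \quad \mathbf{h}^{H}\boldsymbol{\gamma} = 1,
\qquad
\mathbf{h}_{\mathrm{MVDR}}
= \frac{\mathbf{R}_{\mathrm{in}}^{-1}\boldsymbol{\gamma}}
       {\boldsymbol{\gamma}^{H}\mathbf{R}_{\mathrm{in}}^{-1}\boldsymbol{\gamma}}
$$

The constraint keeps the desired signal undistorted, while the minimization reduces the residual interference power.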

A Domain Action Classification Model Using Conditional Random Fields

  • 김학수
    • Korean Journal of Cognitive Science
    • /
    • Vol. 18, No. 1
    • /
    • pp.1-14
    • /
    • 2007
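  • In goal-oriented dialogues, a user's intention can be represented by a domain action, which consists of a pair of a speech act and a concept sequence. Therefore, identifying domain actions accurately is very important for building an intelligent dialogue system. In this paper, we propose a statistical model that determines speech acts and concept sequences simultaneously using CRFs (Conditional Random Fields). To avoid the biased learning problem, the proposed model uses low-level linguistic features, such as words and parts of speech, as input features, and it removes unnecessary features using the chi-square statistic. In experiments on a schedule management domain, the proposed model showed good performance, with a precision of 93.0% on speech act classification and 90.2% on concept sequence classification.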

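The chi-square feature filtering mentioned in the abstract above can be sketched as follows. This is the standard 2x2 chi-square statistic for one feature and one class, not the paper's exact procedure; the feature names, counts, and threshold in the example are hypothetical.

```python
def chi_square(n11, n10, n01, n00):
    """Chi-square statistic for a 2x2 feature/class contingency table.

    n11: utterances with the feature, in the class
    n10: utterances with the feature, not in the class
    n01: utterances without the feature, in the class
    n00: utterances without the feature, not in the class
    """
    n = n11 + n10 + n01 + n00
    denom = (n11 + n01) * (n10 + n00) * (n11 + n10) * (n01 + n00)
    if denom == 0:
        # A row or column is empty, so the feature carries no usable signal.
        return 0.0
    return n * (n11 * n00 - n10 * n01) ** 2 / denom


def select_features(tables, threshold):
    """Return the sorted names of features whose statistic meets the threshold."""
    return sorted(name for name, t in tables.items() if chi_square(*t) >= threshold)


if __name__ == "__main__":
    # Hypothetical counts for two features against one speech-act class.
    tables = {
        "pos=NN": (10, 0, 0, 10),   # perfectly associated with the class
        "word=the": (5, 5, 5, 5),   # independent of the class
    }
    # 3.84 is the usual 5% critical value for one degree of freedom.
    print(select_features(tables, threshold=3.84))
```

Features that are statistically independent of every class score near zero and are dropped, which shrinks the input feature set before training.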

Speech Enhancement for a Voice Commander in a Car Environment

  • 백승권;한민수;남승현;이봉호;함영권
    • Journal of Broadcast Engineering
    • /
    • Vol. 9, No. 1
    • /
    • pp.9-16
    • /
    • 2004
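  • This paper deals with speech enhancement as a preprocessing step for an in-car voice commander. In particular, to make the system more robust to ambient noise and to ensure stability in terminal operation, we propose a speech enhancement method that removes not only the noise typically handled with a single microphone but also noise with non-stationary statistical characteristics, such as audio signals other than voice commands. We apply a BSS (blind source separation) algorithm with two microphones to separate the non-stationary signals, and then apply a Kalman filter to the separated signal to remove noise that is short-term stationary in time. Recognition experiments show that, when the spatial and temporal speech enhancement methods are applied sequentially, the approach can serve as a speech enhancement algorithm in a real car environment.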
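The Kalman filtering stage named in the abstract above can be illustrated with a deliberately simplified sketch: a scalar Kalman filter with a random-walk state model. This is not the paper's filter (speech enhancement normally uses a richer signal model), and the function name and the variance parameters are assumptions chosen for illustration.

```python
def kalman_filter_1d(observations, process_var=1e-3, measurement_var=0.1):
    """Filter a noisy scalar sequence with a random-walk Kalman filter.

    State model:        x[k] = x[k-1] + w,  Var(w) = process_var
    Measurement model:  z[k] = x[k] + v,    Var(v) = measurement_var
    Returns the list of filtered state estimates.
    """
    if not observations:
        return []
    estimate = observations[0]   # initialize from the first sample
    error_var = 1.0              # initial uncertainty of the estimate
    filtered = []
    for z in observations:
        # Predict: the state is assumed unchanged, uncertainty grows.
        error_var += process_var
        # Update: blend prediction and measurement by the Kalman gain.
        gain = error_var / (error_var + measurement_var)
        estimate += gain * (z - estimate)
        error_var *= 1.0 - gain
        filtered.append(estimate)
    return filtered


if __name__ == "__main__":
    noisy = [1.0, 1.2, 0.8, 1.1, 0.9, 1.05]
    print(kalman_filter_1d(noisy))
```

A larger `measurement_var` makes the filter trust its prediction more and smooth harder; a larger `process_var` makes it follow the measurements more closely.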

Problems and Suggestions of the English Listening Comprehension - Focused on Effective Teaching Methods -

  • 이미재
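    • The Korean Society of Phonetic Sciences and Speech Technology: Conference Proceedings
    • /
    • The Korean Society of Phonetic Sciences and Speech Technology, Proceedings of the July 1997 Conference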
    • /
    • pp.81-91
    • /
    • 1997
  • This paper deals with the problems of English listening comprehension: differences in the rate of understanding by position and sentence structure, parts of speech that are easily missed, sounds that exist only in English (not in Korean), confusion of sounds, unaccented prefixes and suffixes, polysemy, homonyms, juncture, two different words being understood as one word, and sound blending in connected speech at normal speed. Bearing these in mind, I taught video English to Suwon University freshmen, combining Peterson's bottom-up and top-down methods and presenting material in a meaningful context with thought groups rather than word-by-word understanding. As a result, their errors were found to come from prepositions, conjunctions, unstressed prefixes and suffixes, -ing in the present progressive, and so forth. Assignments that have students transcribe TV commercials, the names of reporters, or Korea-related news from English broadcasts are useful and helpful.


A Study on the Phonemic Feature Changes According to Korean Speech Waveform Editing

  • 김선일;홍기원;이행세
    • The Journal of the Acoustical Society of Korea
    • /
    • Vol. 13, No. 6
    • /
    • pp.60-65
    • /
    • 1994
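  • We conducted experiments in which human listeners identified the phonemes in waveforms obtained by removing or replacing part of a Korean speech waveform. Through these experiments, we identified transition intervals, where abrupt phonemic changes occur depending on position; intervals where adding or removing material changes the sound so slightly that no phonemic change results; equivalent phonemic intervals, which can be swapped with one another without causing a change; and intervals that decisively affect the phonetic value.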


Acoustic Model-Based Filter Structure for Synthesizing Speech Signals

  • Lim, Il-Taek;Lee, Byeong-Gi
    • Acoustical Society of Korea: Conference Proceedings
    • /
    • Acoustical Society of Korea, 1994, Fifth Western Pacific Regional Acoustics Conference, Seoul, Korea
    • /
    • pp.1021-1026
    • /
    • 1994
  • This paper proposes a filter structure suitable for speech synthesis applications. We first derive the lossy pole-zero model by employing the wave digital filter (WDF) adaptor formula and by converting the fixed termination value $-1$ into a loss factor $\mu_c \in (-1, 1)$. Then we discuss how to determine the reflection coefficients. We employ Durbin's method to estimate the numerator polynomial of the lossy pole-zero transfer function from the given speech sound, and then apply the step-down algorithm to the numerator to extract the reflection coefficients of the closed-termination tract. To determine the reflection coefficients of the other parts, we employ a pre-calculated pole-estimator polynomial.


A Study on Real-Time Walking Action Control of Biped Robot with Twenty Six Joints Based on Voice Command

  • 조상영;김민성;양준석;구영목;정양근;한성현
    • Journal of Institute of Control, Robotics and Systems
    • /
    • Vol. 22, No. 4
    • /
    • pp.293-300
    • /
    • 2016
  • Voice recognition is one of the convenient methods for communication between humans and robots. This study proposes a speech recognition method that uses speech recognizers based on the Hidden Markov Model (HMM), combined with other techniques, to enhance biped robot control. In the past, Artificial Neural Networks (ANN) and Dynamic Time Warping (DTW) were used; however, they are now less commonly applied to speech recognition systems. This research confirms that the HMM, an accepted high-performance technique, can be successfully employed to model speech signals, and that high recognition accuracy can be obtained with HMMs. Apart from speech modeling techniques, multiple feature extraction methods have been studied to detect speech stresses caused by emotions and the environment in order to improve speech recognition rates. The procedure consists of two parts: recognizing robot commands using multiple HMM recognizers, and sending the recognized commands to control a robot. In this paper, a practical voice recognition system that can recognize many task commands is proposed. The proposed system consists of a general-purpose microprocessor and a voice recognition processor that can recognize a limited number of voice patterns. Simulation and experiment demonstrated the reliability of the voice recognition rates for application to the manufacturing process.
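The idea of recognizing a command with multiple HMM recognizers can be sketched as scoring an observation sequence against one HMM per command and choosing the most likely one. The sketch below uses the standard forward algorithm for discrete-observation HMMs; the command names and model parameters are hypothetical, and the paper's actual recognizers and features are not described in the abstract.

```python
def forward_likelihood(obs, start, trans, emit):
    """Return P(obs | model) for a discrete HMM via the forward algorithm.

    obs:   list of observation symbol indices
    start: start[s]     = P(first state is s)
    trans: trans[p][s]  = P(next state is s | current state is p)
    emit:  emit[s][o]   = P(symbol o | state s)
    """
    if not obs:
        raise ValueError("observation sequence must not be empty")
    n_states = len(start)
    alpha = [start[s] * emit[s][obs[0]] for s in range(n_states)]
    for o in obs[1:]:
        alpha = [
            sum(alpha[p] * trans[p][s] for p in range(n_states)) * emit[s][o]
            for s in range(n_states)
        ]
    return sum(alpha)


def recognize(obs, models):
    """Return the name of the model with the highest likelihood for obs."""
    return max(models, key=lambda name: forward_likelihood(obs, *models[name]))


if __name__ == "__main__":
    # Two hypothetical one-state command models over two symbols.
    models = {
        "go": ([1.0], [[1.0]], [[0.9, 0.1]]),
        "stop": ([1.0], [[1.0]], [[0.1, 0.9]]),
    }
    print(recognize([0, 0, 1], models))
```

For long sequences, a practical implementation would work with log probabilities or scaling to avoid numerical underflow.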

Frame Reliability Weighting for Robust Speech Recognition

  • 조훈영;김락용;오영환
    • The Journal of the Acoustical Society of Korea
    • /
    • Vol. 21, No. 3
    • /
    • pp.323-329
    • /
    • 2002
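  • This paper proposes a frame reliability weighting method to compensate for time-selective noise, which occurs at arbitrary times and severely corrupts part of the speech signal. Speech frames have different degrees of reliability, and the reliability is proportional to the signal-to-noise ratio (SNR) of the frame. When the noise is stationary, the frame SNR is easy to estimate from noise information obtained in silent intervals, but time-selective noise is difficult to estimate. Therefore, this study uses a statistical model of clean speech to estimate frame reliability. The proposed MFR (model-based frame reliability) method approximates the frame SNR using the mean vector sequence of the reference model and the filterbank energies obtained by inverse transformation of the input MFCC (mel-frequency cepstral coefficient) feature vector sequence. In recognition experiments with various burst noises, the proposed method represented frame reliability effectively, and applying this reliability as a weight in the likelihood computation improved recognition performance.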
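The use of frame reliability as a weight in the likelihood computation can be sketched as follows. The abstract only states that reliability is proportional to the frame SNR; the clipped linear mapping from SNR in dB to a weight in [0, 1], the function names, and the dB limits are assumptions made for this sketch, and the MFR estimation of the frame SNR itself is not reproduced.

```python
def reliability_weights(frame_snrs_db, floor_db=0.0, ceil_db=20.0):
    """Map per-frame SNR values (dB) to reliability weights in [0, 1].

    Frames at or below floor_db get weight 0, frames at or above ceil_db
    get weight 1, and frames in between are scaled linearly (an assumed
    mapping, chosen only for illustration).
    """
    span = ceil_db - floor_db
    if span <= 0:
        raise ValueError("ceil_db must be greater than floor_db")
    return [min(1.0, max(0.0, (snr - floor_db) / span)) for snr in frame_snrs_db]


def weighted_log_likelihood(frame_log_likelihoods, weights):
    """Sum per-frame log-likelihoods, each scaled by its reliability weight."""
    if len(frame_log_likelihoods) != len(weights):
        raise ValueError("need exactly one weight per frame")
    return sum(w * ll for w, ll in zip(weights, frame_log_likelihoods))


if __name__ == "__main__":
    # Hypothetical utterance: the third frame is hit by burst noise.
    snrs_db = [25.0, 10.0, -5.0]
    log_likes = [-2.0, -4.0, -100.0]
    weights = reliability_weights(snrs_db)
    print(weights, weighted_log_likelihood(log_likes, weights))
```

Down-weighting the corrupted frame keeps its very low log-likelihood from dominating the utterance score.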