• Title/Summary/Keyword: Speech Processing


Performance Assessment of Several Established Pitch Detection Algorithms in Voices of Benign Vocal Fold Lesions (양성후두 질환 음성에 대한 여러 기존 피치검출 알고리즘의 성능 평가)

  • Jang, Seung-Jin;Choi, Seong-Hee;Kim, Hyo-Min;Choi, Hong-Shik;Yoon, Young-Ro
    • Proceedings of the IEEK Conference / 2007.07a / pp.407-408 / 2007
  • Robust pitch estimation is an important topic in many areas of speech processing. In voice pathology, diverse statistics extracted from pitch are commonly used to assess voice quality. In this study, we compared several established pitch detection algorithms (PDAs) to verify their adequacy. Using a database of 99 pathological voices and 30 normal voices, we evaluated pitch detection errors between pathological and normal voices, and among types of benign vocal fold lesions: polyps, nodules, and cysts. The results indicate that the severity of the tested voice must be assessed in order to obtain accurate pitch estimates.


A Study on a Part of Speech for Korean Natural Language Processing (한국어 처리를 위한 품사 체계 연구)

  • Ahn, Mi-Jung;Kim, Jae-Han;Okcy, Cheol-Young
    • Annual Conference on Human and Language Technology / 1993.10a / pp.581-592 / 1993
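  • Research on the dictionary part-of-speech system that underlies Korean natural language processing has so far been carried out in various areas, including morphological analysis, syntactic structure analysis, and semantic analysis. Each area of Korean natural language processing is largely independent of the others. This independence has led to a proliferation of dictionary part-of-speech systems and has hindered the creation of an integrated environment for interconnected natural language processing. In this paper, in response to the need for a general-purpose dictionary part-of-speech system that supports an integrated environment across Korean natural language processing, we first examine the dictionary part-of-speech systems suited to each area of Korean natural language analysis. We then propose an approach to building a general-purpose, integrated base dictionary part-of-speech system for use throughout Korean natural language processing.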


A study on pitch detection for RUI emotion classification based on voice (RUI용 음성신호기반의 감정분류를 위한 피치검출기에 관한 연구)

  • Byun, Sung-Woo;Lee, Seok-Pil
    • Proceedings of the Korean Society of Broadcast Engineers Conference / 2015.07a / pp.421-424 / 2015
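  • As computer technology has advanced and computer use has become widespread, much research has been conducted on human interfaces. In human interfaces, emotion recognition is an important technology for interaction between computers and people. To raise classification accuracy in emotion recognition, it is important to extract feature vectors accurately. In this paper, for accurate pitch detection, we extracted the speech and non-speech segments of the speech signal, then performed and compared pitch detection using two pre-processing techniques from the speech processing field, low-pass filtering and voiced-sound extraction, and one post-processing technique, smoothing. The results show that voiced-sound extraction (pre-processing) and smoothing (post-processing) raised pitch detection accuracy, whereas low-pass filtering lowered it.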
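
The abstract above does not say which pitch detector was used, so the following is only a minimal sketch of one common choice: an autocorrelation peak picker for a single voiced frame, followed by median smoothing of the resulting pitch track. The function names, the 60-400 Hz search range, and the sample values are assumptions made for illustration.

```python
import math

def autocorr_pitch(frame, sample_rate, f_min=60.0, f_max=400.0):
    """Estimate the pitch (Hz) of one voiced frame from its autocorrelation peak."""
    lag_min = int(sample_rate / f_max)
    lag_max = min(int(sample_rate / f_min), len(frame) - 1)
    best_lag, best_value = 0, 0.0
    for lag in range(lag_min, lag_max + 1):
        value = sum(frame[n] * frame[n + lag] for n in range(len(frame) - lag))
        if value > best_value:
            best_lag, best_value = lag, value
    return sample_rate / best_lag if best_lag else 0.0

def median_smooth(track, width=3):
    """Median-filter a pitch track to suppress isolated outliers (post-processing)."""
    half = width // 2
    smoothed = []
    for i in range(len(track)):
        window = sorted(track[max(0, i - half): i + half + 1])
        smoothed.append(window[len(window) // 2])
    return smoothed

# Synthetic 200 Hz "voiced" frame sampled at 8 kHz (hypothetical test signal).
sample_rate = 8000
frame = [math.sin(2 * math.pi * 200.0 * n / sample_rate) for n in range(400)]
pitch = autocorr_pitch(frame, sample_rate)

# A pitch track with one octave error (400 Hz) that smoothing should remove.
track = median_smooth([200.0, 201.0, 400.0, 199.0, 200.0])
```

Restricting the detector to voiced frames and smoothing its output correspond to the two steps the abstract reports as helpful. The low-pass pre-filter, which the authors found harmful, is left out of the sketch.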


The implementation of the Language-Study-Headphone Strong to Noise Environment (소음 환경에서 강인한 어학용 헤드폰 구현)

  • Son, Jae-Hyeak;Shin, Jae-Ho
    • 한국정보통신설비학회:학술대회논문집 / 2005.08a / pp.397-405 / 2005
  • This paper presents a headphone system that adopts two algorithms to increase sound clarity and to separate the signal from a noisy environment. In adaptive signal processing, the LMS algorithm, a kind of steepest descent method, can be implemented with simple calculations, so we use it to eliminate unwanted noise components in the proposed system. Furthermore, we generate early echoes using several delays and mix them into the signal, which can increase its clarity. We show that the proposed system can be implemented in real time. The proposed system performed satisfactorily in a subjective assessment test based on the ITU-T MOS (Mean Opinion Score).
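
As a rough illustration of the LMS step mentioned above, here is a minimal sketch of an LMS adaptive noise canceller. It assumes a two-input arrangement: a primary input carrying signal plus noise, and a reference input carrying noise only. The tap count, step size, synthetic signals, and the simulated acoustic path are all assumptions; the abstract does not give the authors' configuration.

```python
import math
import random

def lms_noise_canceller(primary, reference, num_taps=4, step_size=0.01):
    """Return the cleaned output e[n] = primary[n] - w^T x[n] and the final weights."""
    weights = [0.0] * num_taps
    history = [0.0] * num_taps  # most recent reference samples, newest first
    cleaned = []
    for d, x in zip(primary, reference):
        history = [x] + history[:-1]
        noise_estimate = sum(w * h for w, h in zip(weights, history))
        error = d - noise_estimate
        # Steepest-descent style update using the instantaneous gradient estimate.
        weights = [w + 2.0 * step_size * error * h for w, h in zip(weights, history)]
        cleaned.append(error)
    return cleaned, weights

def mean_square(values):
    return sum(v * v for v in values) / len(values)

random.seed(0)
n_samples = 4000
speech = [0.5 * math.sin(2 * math.pi * 0.01 * n) for n in range(n_samples)]
noise = [random.uniform(-1.0, 1.0) for _ in range(n_samples)]
# Hypothetical acoustic path: noise reaches the primary input scaled and delayed.
leaked = [0.8 * noise[n] + 0.3 * (noise[n - 1] if n else 0.0) for n in range(n_samples)]
primary = [s + v for s, v in zip(speech, leaked)]

cleaned, weights = lms_noise_canceller(primary, noise)

# Compare residual noise power over the second half, after the filter has adapted.
tail = slice(n_samples // 2, None)
error_before = mean_square([p - s for p, s in zip(primary[tail], speech[tail])])
error_after = mean_square([c - s for c, s in zip(cleaned[tail], speech[tail])])
```

Each update costs only a few multiply-adds per tap, which is consistent with the abstract's point that LMS suits a real-time implementation.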


A Fixed-Point Error Analysis of fast DCT Algorithms (고정 소수점 연산에 의한 고속 DCT 알고리듬의 오차해석)

  • 연일동;이상욱
    • The Transactions of the Korean Institute of Electrical Engineers / v.40 no.4 / pp.331-341 / 1991
  • The discrete cosine transform (DCT) is widely used in many signal processing areas, including image and speech data compression. In this paper, we present a fixed-point error analysis of fast DCT algorithms, namely those of Lee [6], Hou [7], and Vetterli [8]. A statistical model of fixed-point error is analyzed to predict the output noise due to the fixed-point implementation. The analysis covers two's complement fixed-point data representation with truncation and rounding. For comparison, we also analyze the direct-form DCT algorithm. We also propose a suitable scaling model for the fixed-point implementation to avoid overflow in the addition operations. Computer simulation results show close agreement between the theoretical and experimental results, and indicate that Vetterli's algorithm outperforms the other algorithms in terms of SNR.
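
To make the idea of fixed-point output noise concrete, the sketch below simulates only the direct-form DCT-II, not the fast algorithms of Lee, Hou, or Vetterli. It quantizes inputs, coefficients, and products to a fixed number of fractional bits and measures the SNR against a floating-point reference. The quantizer, the input scaling to below 1/8, and the random test blocks are assumptions made for illustration; they are a simple stand-in, not the statistical or scaling model from the paper.

```python
import math
import random

def dct_direct(x):
    """Direct-form DCT-II in floating point (reference output)."""
    n = len(x)
    return [sum(x[i] * math.cos(math.pi * (2 * i + 1) * k / (2 * n)) for i in range(n))
            for k in range(n)]

def quantize(value, frac_bits, mode):
    """Quantize to a grid with `frac_bits` fractional bits."""
    scale = 1 << frac_bits
    scaled = value * scale
    # floor() models two's complement truncation, which always rounds toward -inf.
    q = math.floor(scaled + 0.5) if mode == "round" else math.floor(scaled)
    return q / scale

def dct_fixed(x, frac_bits, mode):
    """Direct-form DCT-II with quantized inputs, coefficients, and products."""
    n = len(x)
    xq = [quantize(v, frac_bits, mode) for v in x]
    out = []
    for k in range(n):
        acc = 0.0
        for i in range(n):
            coeff = quantize(math.cos(math.pi * (2 * i + 1) * k / (2 * n)), frac_bits, mode)
            acc += quantize(xq[i] * coeff, frac_bits, mode)
        out.append(acc)
    return out

def snr_db(blocks, frac_bits, mode):
    """SNR of the fixed-point output against the floating-point reference."""
    signal = noise = 0.0
    for block in blocks:
        for ref, fixed in zip(dct_direct(block), dct_fixed(block, frac_bits, mode)):
            signal += ref * ref
            noise += (ref - fixed) ** 2
    return 10.0 * math.log10(signal / noise)

random.seed(1)
# Inputs stay below 1/8 in magnitude so an 8-term sum stays below 1 in magnitude.
blocks = [[random.uniform(-0.125, 0.125) for _ in range(8)] for _ in range(50)]
snr_round_8 = snr_db(blocks, 8, "round")
snr_round_12 = snr_db(blocks, 12, "round")
snr_trunc_8 = snr_db(blocks, 8, "trunc")
```

Comparing `snr_round_8`, `snr_round_12`, and `snr_trunc_8` shows how word length and the choice between rounding and truncation affect the output noise in this toy setup.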


Post-Processing of Speech Recognition Using User Utterance Sequential Pattern (사용자 발화 순차패턴을 이용한 음성인식 후처리)

  • Song, Won-Moon;Kim, Eun-Ju;Kim, Myung-Won
    • Proceedings of the Korean Information Science Society Conference / 2005.07b / pp.709-711 / 2005
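  • Recently, various post-processing techniques have been studied in speech recognition to obtain more reliable results than those produced by signal processing of the uttered speech alone. In this paper, we propose a speech recognition post-processing technique that takes user information into account by applying the user's utterance information to post-processing in a voice command recognition environment for an individual user. First, sequential pattern rules of command utterances are extracted from previously used voice commands. The sequential pattern formed from the commands the user has uttered beforehand is then compared against these rules to determine which words can follow under the sequential rules. Taking these words into account, the probability values of the recognizer's candidate words are adjusted, and the final recognized word is re-determined. For this adjustment, we propose a correction method that considers both the confidence of the utterance sequential pattern and the recognizer's result words. Experiments confirm that speech recognition with the proposed post-processing lowers the error rate by more than $15\%$ compared with baseline HMM-based speech recognition, contributing substantially to the recognition rate.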
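
The abstract does not give the actual rule-mining procedure or the correction formula. The sketch below is therefore a hypothetical stand-in that only illustrates the general flow: mine "previous command -> next command" rules with a confidence value, boost recognizer candidates that a rule predicts, renormalize, and re-pick the best word. The multiplicative boost, the `weight` parameter, and the example commands are all assumptions.

```python
from collections import Counter, defaultdict

def mine_rules(history):
    """Confidence of rule prev -> next = count(prev, next) / count(prev)."""
    pair_counts = Counter(zip(history, history[1:]))
    prev_counts = Counter(history[:-1])
    rules = defaultdict(dict)
    for (prev, nxt), count in pair_counts.items():
        rules[prev][nxt] = count / prev_counts[prev]
    return rules

def rescore(candidates, previous_command, rules, weight=1.0):
    """Boost candidates predicted by a rule, renormalize, and pick the best word."""
    predicted = rules.get(previous_command, {})
    boosted = {word: prob * (1.0 + weight * predicted.get(word, 0.0))
               for word, prob in candidates.items()}
    total = sum(boosted.values())
    adjusted = {word: prob / total for word, prob in boosted.items()}
    return max(adjusted, key=adjusted.get), adjusted

# Hypothetical command log for one user and hypothetical recognizer scores.
history = ["open", "play", "stop", "open", "play", "pause", "open", "play", "stop"]
rules = mine_rules(history)
candidates = {"pray": 0.40, "play": 0.35, "pay": 0.25}
best, adjusted = rescore(candidates, "open", rules)
```

In this toy case the recognizer alone would pick "pray", while the rule "open -> play" with confidence 1.0 moves "play" to the top after rescoring.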


A Study on Recognition of Spoken Numbers Using Spatio-Temporal Pattern Recognizer (시공간 패턴인식 신경망에 의한 단어 인식에 관한 연구)

  • Park, Kyoung-Cheol;Kim, Hun-Kee;Lee, Chong-Ho
    • Proceedings of the KIEE Conference / 1993.07a / pp.495-497 / 1993
  • This paper presents a method for recognizing spoken numbers using a spatio-temporal network. This network is efficient in processing the spectrum sequences of speech patterns as spatio-temporal patterns. The number of windows and channels is determined experimentally. The recognition rate was improved through experiments on various parameters. The test data were collected from 10 numbers spoken by 2 male and female speakers. A recognition rate of 80% was obtained on a test set of 50 words.


Design of Emulator using DSP Chip (DSP 칩을 이용한 에뮬레이터 설계)

  • Lee, Dae-Young;Lee, Jae-Hak;Kim, Jin-Min;Kim, Hyoun-Ho;Bae, Hyeon-Deok
    • Proceedings of the KIEE Conference / 1993.07a / pp.453-455 / 1993
  • In this research, a digital signal processing PC board based on TI's TMS320C25 is implemented. The board can perform the following functions: spectrum analysis of speech and repetitive signals, digital filter emulation by convolution, and generation of sinusoidal, rectangular, and other waveforms. In this system, the PC handles communication between the PC and the DSP board, program downloading to the DSP board, and the recording and graphical display of data acquired and processed on the DSP board. A parallel interface and buffer memory are used for communication. Data acquisition and computation are carried out on the DSP board. The resulting data are transmitted to the PC and output through a DAC.


Enhancement of Source Localization Performance using Clustering Ranging Method (클러스터링 기법을 이용한 음원의 위치추정 성능향상)

  • Lee, Ho Jin;Yoon, Kyung Sik;Lee, Kyun Kyung
    • Journal of the Korea Institute of Military Science and Technology / v.19 no.1 / pp.9-15 / 2016
  • Source localization has been developed in various fields of signal processing, including radar, sonar, and wireless communication. A source can be localized by estimating the time difference of arrival between each pair of sensors. Several methods, such as the NLS (Nonlinear Least Squares) cost function, have been proposed to improve the performance of time delay estimation. In this paper, we propose a clustering method that uses four sensors with the same aperture as previous three-sensor methods. The clustering method can improve source localization performance by grouping similar estimated values. The performance of source localization using the clustering method is evaluated by Monte Carlo simulation.
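
The abstract says only that similar estimates are grouped, so the sketch below is a hypothetical one-dimensional illustration of that idea rather than the authors' clustering method. It sorts several range estimates, groups neighbours that lie within a tolerance of each other, and averages the largest group. The function name, the tolerance, and the range values are made up for illustration.

```python
def largest_cluster_mean(estimates, tolerance):
    """Group sorted estimates whose neighbours differ by <= tolerance,
    then return the mean of the largest group."""
    ordered = sorted(estimates)
    clusters = [[ordered[0]]]
    for value in ordered[1:]:
        if value - clusters[-1][-1] <= tolerance:
            clusters[-1].append(value)
        else:
            clusters.append([value])
    best = max(clusters, key=len)
    return sum(best) / len(best)

# Hypothetical range estimates (metres), e.g. from different sensor subsets;
# two of them are outliers caused by poor time-delay estimates.
range_estimates_m = [1002.0, 998.0, 1001.0, 1450.0, 999.0, 620.0]
fused_range = largest_cluster_mean(range_estimates_m, tolerance=10.0)
plain_mean = sum(range_estimates_m) / len(range_estimates_m)
```

The plain mean is pulled away by the two outliers, while the cluster-based value uses only the four mutually consistent estimates.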

Efficient Language Model based on VCCV unit for Sentence Speech Recognition (문장음성인식을 위한 VCCV 기반의 효율적인 언어모델)

  • Park, Seon-Hui;No, Yong-Wan;Hong, Gwang-Seok
    • Proceedings of the KIEE Conference / 2003.11c / pp.836-839 / 2003
  • In this paper, we implement a bigram language model and evaluate which smoothing technique is suitable for a unit with low perplexity. Word, morpheme, and clause units are widely used as the processing units of a language model. We propose VCCV units, which have a smaller vocabulary than morpheme and clause units. We compare the VCCV units with the clause and morpheme units in terms of perplexity. The most common metric for evaluating a language model is the probability that the model assigns to test data, or the derivative measure of perplexity. Smoothing is used to estimate probabilities when there are insufficient data to estimate them accurately. We constructed N-grams of the low-perplexity VCCV units and tested the language model using Katz, Witten-Bell, absolute, and modified Kneser-Ney smoothing, among others. In the experimental results, modified Kneser-Ney smoothing proved to be the suitable smoothing technique for VCCV units.
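
To show how perplexity and smoothing fit together, here is a minimal sketch of a bigram model with interpolated Witten-Bell smoothing, one of the techniques the abstract lists. Modified Kneser-Ney, which the authors found most suitable, is longer to write out and is not shown. The unit labels `u1` to `u4` are placeholders rather than real VCCV units, and the add-one unigram back-off is an assumption.

```python
import math
from collections import Counter, defaultdict

def train_bigram(units):
    """Collect the counts needed for interpolated Witten-Bell smoothing."""
    unigram = Counter(units)
    bigram = Counter(zip(units, units[1:]))
    history = Counter(units[:-1])        # how often each unit occurs as a context
    followers = defaultdict(set)         # distinct units seen after each context
    for prev, nxt in bigram:
        followers[prev].add(nxt)
    return unigram, bigram, history, followers

def witten_bell(prev, unit, model, vocab_size):
    """P(unit | prev) = (c(prev, unit) + T(prev) * P_uni(unit)) / (c(prev) + T(prev))."""
    unigram, bigram, history, followers = model
    # Add-one unigram estimate so unseen units still receive some probability.
    p_uni = (unigram[unit] + 1) / (sum(unigram.values()) + vocab_size)
    types = len(followers.get(prev, ()))
    denom = history[prev] + types
    if denom == 0:
        return p_uni
    return (bigram[(prev, unit)] + types * p_uni) / denom

def perplexity(units, model, vocab_size):
    """exp of the negative mean log-probability over the bigrams in `units`."""
    log_sum = sum(math.log(witten_bell(p, u, model, vocab_size))
                  for p, u in zip(units, units[1:]))
    return math.exp(-log_sum / (len(units) - 1))

train = ["u1", "u2", "u3", "u1", "u2", "u3", "u1", "u2", "u4"]
vocab = sorted(set(train))
model = train_bigram(train)

ppl_familiar = perplexity(["u1", "u2", "u3", "u1"], model, len(vocab))
ppl_unfamiliar = perplexity(["u3", "u2", "u1", "u4"], model, len(vocab))
prob_mass = sum(witten_bell("u2", u, model, len(vocab)) for u in vocab)
```

A sequence that follows the training order gets a much lower perplexity than one made of unseen bigrams, and the smoothed probabilities for a given context still sum to one over the vocabulary.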
