• Title/Abstract/Keywords: Digit recognition


Korean Digit Speech Recognition Dialing System using Filter Bank

  • 박기영;최형기;김종교
    • 대한전자공학회논문지TE / Vol. 37, No. 5 / pp. 62-70 / 2000
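  • This paper performs Korean digit speech recognition using a filter bank together with HMM and DTW programs. Spectral analysis mainly represents the features of the speech signal that are determined by the shape of the vocal tract. The spectral features of speech can generally be obtained by passing the signal through a filter bank, that is, through spectra suitably concentrated in defined frequency ranges. The eight band-pass filters were divided according to the perceptual hearing of the human ear. The defined frequency ranges are 320-330, 450-460, 640-650, 840-850, 900-1000, 1100-1200, 2000-2100, and 3900-4000 Hz, and the sampling frequency is 8 kHz. The frame width is 20 ms and the frame period is 10 ms. The experimental results confirm that, when a filter bank is used for Korean digit speech recognition, DTW yields a higher recognition rate than HMM. The recognition rate for Korean digits using the filter bank was 93.3% with the 24th-order band-pass filter, 89.1% with the 16th-order band-pass filter, and 88.9% with the hardware voice dialing system using the 8th-order band-pass filter.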

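The framing parameters and the DTW comparison mentioned in the abstract above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `frame_bounds` and `dtw_distance`, the Euclidean local distance, and the basic three-way step pattern are all assumptions made for illustration. Only the 8 kHz sampling rate, the 20 ms frame width, and the 10 ms frame period come from the abstract.

```python
import math

SAMPLE_RATE_HZ = 8000  # sampling frequency stated in the abstract
FRAME_MS = 20          # frame width stated in the abstract
HOP_MS = 10            # frame period stated in the abstract


def frame_bounds(num_samples):
    """Return (start, end) sample indices of every full frame."""
    frame_len = SAMPLE_RATE_HZ * FRAME_MS // 1000  # 160 samples
    hop_len = SAMPLE_RATE_HZ * HOP_MS // 1000      # 80 samples
    return [(s, s + frame_len)
            for s in range(0, num_samples - frame_len + 1, hop_len)]


def dtw_distance(a, b):
    """Accumulated DTW cost between two sequences of feature vectors.

    Assumes Euclidean local distance and the basic step pattern
    (insertion, deletion, match); both are illustrative choices.
    """
    inf = float("inf")
    n, m = len(a), len(b)
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = math.dist(a[i - 1], b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j],      # insertion
                                 cost[i][j - 1],      # deletion
                                 cost[i - 1][j - 1])  # match
    return cost[n][m]
```

In a template-matching recognizer of this kind, an unknown utterance would typically be assigned the digit whose stored template gives the smallest DTW cost; whether the paper does exactly this is not stated in the abstract.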

Comparison of the recognition performance of Korean connected digit telephone speech depending on channel compensation methods and feature parameters

  • 정성윤;김민성;손종목;배건성;김상훈
    • 대한음성학회: Conference Proceedings / Proceedings of the 대한음성학회 Conference, November 2002 / pp. 201-204 / 2002
  • As a preliminary study for improving the recognition performance of connected digit telephone speech, we investigate feature parameters as well as channel compensation methods for telephone speech. CMN and RTCN are examined for telephone channel compensation, and MFCC, DWFBA, SSC, and their delta features are examined as feature parameters. Recognition experiments on a database we collected show that, at the feature level, DWFBA performs better than MFCC and that, for channel compensation, RTCN performs better than CMN. The DWFBA+Delta_Mel-SSC feature shows the highest recognition rate.

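As a rough illustration of the simpler of the two channel compensation methods named above, the sketch below assumes that CMN refers to the usual cepstral mean normalization, in which the per-utterance mean of each cepstral coefficient is subtracted from every frame. The function name `cepstral_mean_normalize` is hypothetical, and the abstract does not describe the authors' exact procedure or how RTCN differs.

```python
def cepstral_mean_normalize(frames):
    """Subtract the per-utterance mean of each coefficient from every frame.

    `frames` is a list of equal-length lists of cepstral coefficients
    (one inner list per frame). A stationary channel adds a constant
    offset in the cepstral domain, which this subtraction removes.
    """
    if not frames:
        return []
    dim = len(frames[0])
    means = [sum(f[k] for f in frames) / len(frames) for k in range(dim)]
    return [[f[k] - means[k] for k in range(dim)] for f in frames]
```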

Robust Speech Detection Based on Useful Bands for Continuous Digit Speech over Telephone Networks

  • Ji, Mi-Kyongi;Suh, Young-Joo;Kim, Hoi-Rin;Kim, Sang-Hun
    • The Journal of the Acoustical Society of Korea / Vol. 22, No. 3E / pp. 113-123 / 2003
  • One of the most important problems in speech recognition is detecting the presence of speech in adverse environments. In other words, accurate detection of speech boundaries is critical to recognition performance. The speech detection problem becomes more severe when recognition systems are used over telephone networks, especially wireless networks, and in noisy environments. This paper therefore describes various speech detection algorithms for a continuous digit recognition system used over wired and wireless telephone networks, and proposes an algorithm that improves the robustness of speech detection over noisy telephone networks by selecting useful bands. We compare several speech detection algorithms with the proposed one and present experimental results obtained at various SNRs. The results show that the new algorithm outperforms the other speech detection methods.
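For orientation, a generic short-time energy endpoint detector can be sketched as follows. This is a common baseline, not the useful-band algorithm proposed in the paper, whose details the abstract does not give. The function names, the fixed threshold, and the frame parameters are all illustrative assumptions.

```python
def frame_energies(samples, frame_len, hop_len):
    """Mean squared amplitude of each full frame."""
    return [sum(s * s for s in samples[i:i + frame_len]) / frame_len
            for i in range(0, len(samples) - frame_len + 1, hop_len)]


def detect_speech(samples, frame_len, hop_len, threshold):
    """Return (first, last) indices of frames whose energy exceeds threshold.

    Returns None when no frame is above the threshold. A fixed threshold
    is an assumption; practical detectors usually adapt it to the noise.
    """
    energies = frame_energies(samples, frame_len, hop_len)
    active = [i for i, e in enumerate(energies) if e > threshold]
    return (active[0], active[-1]) if active else None
```

A detector this simple degrades quickly as the SNR drops, which is the situation the abstract says the band-selection approach is meant to address.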

A Study on Korean Digit Recognition by Using Phoneme Boundary Information

  • 최관묵;임동철;이행세
    • Acoustical Society of Korea: Conference Proceedings / Proceedings of the Acoustical Society of Korea 2001 Autumn Conference, Vol. 20, No. 2 / pp. 117-120 / 2001
  • The recognition rate for Korean digits is lower than that for other words because the digits are composed of similar phonemes. In this paper, a new method is proposed that improves the recognition rate by using phoneme boundary information. The proposed method adds little cost because the phoneme boundaries are found with a simple method. We experimented with speech data from a single male speaker and obtained an improved recognition rate.


A car number retrieving system using speech recognition for PDA

  • 김우성;김동환;윤재선;홍광석
    • 융합신호처리학회 Conference Proceedings / Proceedings of the 한국신호처리시스템학회 2001 Summer Conference (KISPS SUMMER CONFERENCE 2001) / pp. 281-284 / 2001
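  • In this paper, we implemented a system that retrieves vehicle numbers on a PDA through speech recognition and synthesis. The system consists of a recognition part for four connected digits (for vehicle number recognition) and for commands, and it plays back synthesized speech at each step. The recognition system was tested in a speaker-independent setting; across multiple speakers, the recognition rates for four-connected-digit vehicle numbers and for commands were 97% and 99%, respectively.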


Korean Digit Recognition Under Noise Environment Using Spectral Mapping Training

  • 이기영
    • The Journal of the Acoustical Society of Korea / Vol. 13, No. 3 / pp. 25-32 / 1994
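  • In this study, we present a method for Korean digit recognition in noisy environments using spectral mapping training based on a static supervised adaptation algorithm. In the proposed method, mapping the noisy speech spectral space onto the clean speech spectral space reduced the distortion of the noisy speech spectrum, giving a higher recognition rate than the conventional method that uses VQ (vector quantization) and DTW (dynamic time warping) without noise processing. Even at an SNR of 0 dB, the recognition rate improved by about a factor of 10 over the conventional method, confirming that spectral mapping training can improve recognition performance for speech in noisy environments.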


The Development of IDMLP Neural Network for the Chip Implementation and it's Application to Speech Recognition

  • 김신진;박정운;정호선
    • 전자공학회논문지B / Vol. 28B, No. 5 / pp. 394-403 / 1991
  • This paper describes the development of the input driven multilayer perceptron (IDMLP) neural network and its application to Korean spoken digit recognition. Both the IDMLP neural network used here and its learning algorithm are newly proposed. In this model, the weight values are integers and the transfer function of each neuron is a hard-limit function. Learning results on several kinds of input data show that the number of network layers is one or more, depending on how difficult the inputs are to classify. We tested recognition of binarized data for the spoken digits 0 to 9 with the proposed network. The recognition rates are 100% for the learning data and 96% for the test data.

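The neuron model described above, with integer weights and a hard-limit transfer function, can be sketched in a few lines. This shows only the forward computation of a single neuron. The IDMLP architecture and its learning algorithm are not described in the abstract, and the names `hard_limit` and `neuron` as well as the firing convention (output 1 when the weighted sum is at least zero) are assumptions.

```python
def hard_limit(x):
    """Hard-limit transfer function: 1 if x >= 0, else 0 (assumed convention)."""
    return 1 if x >= 0 else 0


def neuron(inputs, weights, bias):
    """Output of one neuron with integer weights and a hard-limit activation."""
    return hard_limit(sum(w * x for w, x in zip(weights, inputs)) + bias)


# Example: integer weights (1, 1) and bias -2 realize a logical AND
# on binary inputs.
and_outputs = [neuron([a, b], [1, 1], -2) for a in (0, 1) for b in (0, 1)]
```

Because the inputs are binarized and the weights are integers, a neuron of this form needs only integer additions and a sign test, which fits the chip implementation goal named in the title.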

Parallel, self-organizing, hierarchical neural networks for handwritten digit recognition

  • 방극준;조남신;강창언;홍대식
    • 전자공학회논문지B / Vol. 33B, No. 7 / pp. 173-182 / 1996
  • In this paper, we propose parallel, self-organizing, hierarchical neural networks (PSHNN) as a handwritten digit recognition system. The system absorbs the various shape variations of handwritten digits by using a different feature extraction method in each stage neural network (SNN) of the PSHNN, reduces training time by using a single-layer neural network as the SNN, and obtains a high rate of correct recognition by using a certainty area for each output node individually. Experiments were performed on the NIST database, using 21,315 digits (10,625 digits for training and 10,663 digits for testing). The results show a correct rate of 97.48%, an error rate of 1.72%, and a reject rate of 0.78%.


A study on the spoken digit recognition performance of the Two-Stage recurrent neural network

  • 안점영
    • 한국통신학회논문지 / Vol. 25, No. 3B / pp. 565-569 / 2000
  • We construct a two-stage recurrent neural network that feeds the signals of both the hidden layer and the output layer back to the hidden layer. It is tested on a syllable basis on the Korean spoken digits from /gong/ to /gu/. In these experiments, we vary the number of neurons in the hidden layer, the prediction order of the input data, and the self-recurrent coefficient of the decision state layer. According to the experimental results, the recognition rate of this network is between 91% and 97.5% in the speaker-dependent case and between 80.75% and 92% in the speaker-independent case. In the speaker-dependent case, the network performs on a par with the Jordan and Elman networks, and in the speaker-independent case it shows improved performance.


A Study Of Handwritten Digit Recognition By Neural Network Trained With The Back-Propagation Algorithm Using Generalized Delta Rule

  • 이규한;정진현
    • 대한전기학회: Conference Proceedings / Proceedings of the 대한전기학회 1999 Summer Conference, G / pp. 2932-2934 / 1999
  • In this paper, we propose a scheme for recognizing handwritten digits using a multilayer neural network trained with the back-propagation algorithm under the generalized delta rule. The network is trained on handwritten digit data from different writers and in different styles. One purpose of working with neural networks is to minimize the mean square error (MSE) between the actual output and the desired output. The back-propagation algorithm is an efficient, classical method: to train the weights of a multilayer network, it uses the steepest-descent minimization procedure and the sigmoid threshold function. As the error rate is reduced, the recognition rate improves. We therefore propose a method that reduces the error rate.

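The training procedure named in the abstract above (steepest descent on the squared error with sigmoid units) can be illustrated by a minimal sketch for a network with one hidden layer and a single output. This is the textbook delta-rule update, not the authors' code. The network size, the omission of bias terms, the learning rate, and all function names are assumptions made to keep the example short.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def forward(x, w_hidden, w_out):
    """Return hidden activations and the scalar output (biases omitted)."""
    hidden = [sigmoid(sum(w * xi for w, xi in zip(row, x))) for row in w_hidden]
    output = sigmoid(sum(w * h for w, h in zip(w_out, hidden)))
    return hidden, output


def train_step(x, target, w_hidden, w_out, lr=0.5):
    """One steepest-descent update on the squared error of a single sample."""
    hidden, output = forward(x, w_hidden, w_out)
    # Output delta: error times the sigmoid derivative.
    delta_out = (target - output) * output * (1.0 - output)
    # Hidden deltas use the output weights from before this update.
    delta_hidden = [delta_out * w * h * (1.0 - h) for w, h in zip(w_out, hidden)]
    new_w_out = [w + lr * delta_out * h for w, h in zip(w_out, hidden)]
    new_w_hidden = [[w + lr * d * xi for w, xi in zip(row, x)]
                    for row, d in zip(w_hidden, delta_hidden)]
    return new_w_hidden, new_w_out


def squared_error(x, target, w_hidden, w_out):
    return (target - forward(x, w_hidden, w_out)[1]) ** 2
```

Repeating `train_step` on a sample drives its squared error down, which is the link the abstract draws between a lower error and a higher recognition rate.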