• Title/Summary/Keyword: Digit recognition


A Study on Digit Modeling for Korean Connected Digit Recognition (한국어 연결숫자인식을 위한 숫자 모델링에 관한 연구)

  • 김기성
    • Proceedings of the Acoustical Society of Korea Conference
    • /
    • 1998.08a
    • /
    • pp.293-297
    • /
    • 1998
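  • This paper describes the development of a connected digit recognition system for the telephone network, in which various digit modeling methods were implemented and compared. For word models, a context-independent whole-word model was implemented; for sub-word models, a triphone model and a modified triphone model, in which unreleased consonants are merged into the vowel, were implemented. Tree-based clustering was then applied to the sub-word models and to a context-dependent whole-word model. In speaker-independent connected digit recognition experiments using continuous HMMs with these digit models, the context-dependent word model outperformed the context-independent word model, and the triphone and modified triphone models performed similarly. In particular, the context-dependent word model with tree-based clustering performed best on 4-digit strings, with a word recognition rate of 99.8% and a digit-string recognition rate of 99.1%.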
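
The two rates reported above measure different things: the word (digit) recognition rate counts correctly recognized digits, while the string recognition rate counts strings in which every digit is correct. A minimal Python sketch of both metrics follows. The function names and the toy data are hypothetical, and the sketch assumes reference and hypothesis strings of equal length (substitution errors only), which the abstract does not state.

```python
def digit_accuracy(refs, hyps):
    """Fraction of digits recognized correctly (assumes equal-length pairs)."""
    total = sum(len(r) for r in refs)
    correct = sum(a == b for r, h in zip(refs, hyps) for a, b in zip(r, h))
    return correct / total


def string_accuracy(refs, hyps):
    """Fraction of strings in which every digit is correct."""
    return sum(r == h for r, h in zip(refs, hyps)) / len(refs)


# Hypothetical toy data: four 4-digit strings, one digit misrecognized.
refs = ["1234", "5678", "9012", "3456"]
hyps = ["1234", "5678", "9012", "3450"]

print(digit_accuracy(refs, hyps))   # 15 of 16 digits correct
print(string_accuracy(refs, hyps))  # 3 of 4 strings correct
```

As a rough consistency check, if digit errors were independent, a 99.8% digit rate would imply about 0.998^4 ≈ 99.2% on 4-digit strings, which is close to the reported 99.1%.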


Performance Improvement of Connected Digit Recognition by Considering Phoneme Variations in Korean Digit (한국어 숫자음에서의 음운변화를 고려한 연결숫자 인식의 성능향상)

  • Song Myung Gyu;Kim Hyung Soon
    • Proceedings of the Acoustical Society of Korea Conference
    • /
    • autumn
    • /
    • pp.105-108
    • /
    • 2001
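  • Each Korean digit is a single syllable, and when digits are spoken continuously, coarticulation between adjacent digits alters each digit's canonical pronunciation and blurs the boundaries between digits. Moreover, although recognition systems expect continuously spoken digits, some users utter the digits in isolation. This degrades the performance of recognition systems that account only for the phonological phenomena of connected digits. To improve connected digit recognition, this paper defines allophone groups that reflect the phonological variation of Korean digits and examines how to construct a recognition network that can absorb the variety of phonological changes arising from users' different speaking styles. In speaker-independent recognition experiments on 4-connected-digit telephone speech, recognition performance for the frequently misrecognized Korean digits '이' (i), '오' (o), and '일' (il) improved by $4.2\%$, $4.2\%$, and $2.9\%$, respectively, and recognition speed improved by $33\%$.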


Recognition of numeral strings with broken digits (획의 일부분이 손상된 숫자가 포함된 필기체 숫자 열의 인식)

  • Kim, Kye-Kyung;Kim, Jin-Ho;Cho, Soo-Hyun;Chi, Soo-Young;Chung, Yun-Koo
    • Proceedings of the Korea Information Processing Society Conference
    • /
    • 2001.10a
    • /
    • pp.503-506
    • /
    • 2001
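  • This paper proposes a method for recognizing handwritten numeral strings that contain irregular digits, such as broken digits (digits with part of a stroke damaged) and touching digits. In the pre-segmentation stage, structural feature information of the digits is used to separate the irregular digits, that is, broken or touching digits, from well-formed isolated digits. Recognition is then attempted after merging the separated strokes of broken digits and segmenting the touching digits. To validate the proposed method, simulations were run on the NIST SD19 database.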


Isolated Digit Recognition Combined with Recurrent Neural Prediction Models and Chaotic Neural Networks (회귀예측 신경모델과 카오스 신경회로망을 결합한 고립 숫자음 인식)

  • Kim, Seok-Hyun;Ryeo, Ji-Hwan
    • Journal of the Korean Institute of Intelligent Systems
    • /
    • v.8 no.6
    • /
    • pp.129-135
    • /
    • 1998
  • In this paper, the recognition rate of isolated digits is improved using multiple neural networks that combine chaotic recurrent neural networks with an MLP. Overall, the recognition rate increased by 1.2% to 2.5%. The experiments indicate that the gain arises because the MLP and the CRNN (chaotic recurrent neural network) compensate for each other. In addition, the chaotic dynamic properties were found to help in speech recognition. The best recognition rate is obtained when the MLP is combined with chaotic multiple recurrent neural networks. However, in terms of algorithmic simplicity and reliability, the multiple neural networks that combine the MLP with chaotic single recurrent neural networks have better properties. In general, the MLP recognizes the Korean digits "il" and "oh" very well, while the chaotic recurrent neural network performs best on "young", "sam", and "chil".


A Study on Korean 4-connected Digit Recognition Using Demi-syllable Context-dependent Models (반음절 문맥종속 모델을 이용한 한국어 4 연숫자음 인식에 관한 연구)

  • 이기영;최성호;이호영;배명진
    • The Journal of the Acoustical Society of Korea
    • /
    • v.22 no.3
    • /
    • pp.175-181
    • /
    • 2003
  • Because each Korean digit is a single syllable and is heavily coarticulated in connected digits, researchers have proposed several recognition models based on demisyllables. However, these have not yet shown excellent recognition results. This paper proposes a recognition model based on extended, context-dependent demisyllables, namely tri-demisyllables analogous to triphones, for Korean 4-connected digit recognition. In the experiments, we use the HTK 3.0 toolkit to build continuous-HMM models, trained on Korean connected digits from the SiTEC database, and to recognize unknown utterances. The results show a recognition rate of 92% and indicate that this model can improve the recognition performance of Korean connected digits.

A Biological Fuzzy Multilayer Perceptron Algorithm

  • Kim, Kwang-Baek;Seo, Chang-Jin;Yang, Hwang-Kyu
    • Journal of information and communication convergence engineering
    • /
    • v.1 no.3
    • /
    • pp.104-108
    • /
    • 2003
  • A biologically inspired fuzzy multilayer perceptron is proposed in this paper. The proposed algorithm takes into account biological neuronal structure as well as fuzzy logic operations. We applied the proposed learning algorithm to neural network benchmark problems such as exclusive OR and 3-bit parity, and to digit image recognition problems. To compare the existing and proposed neural networks, convergence speed is measured. The simulation results indicate that the proposed learning algorithm converges much faster than the conventional backpropagation algorithm. Furthermore, in the image recognition task, the recognition rate of the proposed algorithm is higher than that of the conventional backpropagation algorithm.
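
The exclusive OR and 3-bit parity benchmarks named above are small enough to enumerate in full. The Python sketch below builds the n-bit parity truth table, of which exclusive OR is the 2-bit case. The helper name `parity_dataset` is hypothetical, and the sketch shows only the benchmark data, not the proposed fuzzy perceptron.

```python
from itertools import product


def parity_dataset(n_bits):
    """Return (inputs, target) pairs; target is 1 when an odd number of bits are set."""
    return [(bits, sum(bits) % 2) for bits in product((0, 1), repeat=n_bits)]


xor_data = parity_dataset(2)      # exclusive OR is 2-bit parity
parity3_data = parity_dataset(3)  # 8 input patterns for 3-bit parity

for bits, target in xor_data:
    print(bits, target)
```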

A Study on the Speech Recognition using Advanced Competitive Learning (개선된 경쟁학습을 이용한 음성인식)

  • Song, Joon-Gyu;Lee, Dong-Wook;Kim, Young-T.
    • Proceedings of the KIEE Conference
    • /
    • 1997.11a
    • /
    • pp.594-596
    • /
    • 1997
  • This paper presents a speaker-dependent Korean isolated digit recognition system using advanced competitive learning. Because competitive learning algorithms are simple to implement, they are used in various fields. The proposed recognition algorithm consists of three procedures: comparing the number of wins of the codebook vectors, selecting a representative vector from the codebook vectors, and generating a new codebook from the representative vectors. Speech data were acquired with a Sound Blaster 16 and sampled at 16 bits and 11 kHz.


Variation Analysis of Feature Parameters According to the Channel Distortion of Korean Telephone Digit Speech (한국어 숫자음 전화음성의 채널왜곡에 따른 특징파라미터의 변이 분석)

  • 정성윤;손종목;김민성;배건성
    • Proceedings of the IEEK Conference
    • /
    • 2002.06d
    • /
    • pp.191-194
    • /
    • 2002
  • The ultimate aim of this paper is to improve the speech recognition rate when training and test data come from a matched telephone environment. To analyze the effect of the distortion introduced by the telephone channel, which changes on every call, MFCC is used as the feature parameter, and CMN, RTCN, and RASTA are used as channel compensation techniques. For each case, the variation of the feature parameters of all phones is analyzed. We also obtain recognition rates for each compensation method using a continuous HMM recognizer and examine the relationship between feature variation and recognition rate.
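
Of the compensation techniques listed above, CMN (cepstral mean normalization) is the simplest to illustrate: a fixed channel appears as an approximately constant offset in the cepstral domain, so subtracting the per-utterance mean of each coefficient removes it. The Python sketch below is a generic illustration under that assumption, not the paper's implementation; the function name and the toy frames are hypothetical.

```python
def cepstral_mean_normalize(frames):
    """Subtract the per-coefficient mean over an utterance from every frame."""
    n_frames = len(frames)
    n_coeffs = len(frames[0])
    means = [sum(f[k] for f in frames) / n_frames for k in range(n_coeffs)]
    return [[f[k] - means[k] for k in range(n_coeffs)] for f in frames]


# Hypothetical 3-frame, 2-coefficient utterance; the constant offset (10, -4)
# stands in for a fixed channel bias added to every cepstral frame.
clean = [[1.0, 2.0], [3.0, 0.0], [2.0, 1.0]]
biased = [[c0 + 10.0, c1 - 4.0] for c0, c1 in clean]

print(cepstral_mean_normalize(clean))
print(cepstral_mean_normalize(biased))  # the constant offset is removed
```

After normalization, the clean and biased utterances yield the same features, which is why CMN helps when the channel differs from call to call.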


Boltzmann machine using Stochastic Computation (확률 연산을 이용한 볼츠만 머신)

  • 이일완;채수익
    • Journal of the Korean Institute of Telematics and Electronics A
    • /
    • v.31A no.6
    • /
    • pp.159-168
    • /
    • 1994
  • Stochastic computation is adopted to reduce the silicon area of the multipliers when implementing neural networks in VLSI. In addition to this advantage, stochastic computation has inherent random errors, which are required for implementing a Boltzmann machine. This random noise is useful for the simulated annealing employed to reach the global minimum of the Boltzmann machine. In this paper, we propose a method for implementing the Boltzmann machine with stochastic computation and discuss in detail the addition problem in stochastic computation and its simulated annealing. According to this analysis, a Boltzmann machine using stochastic computation is suitable for pattern recognition/completion problems. We have verified these results through simulations of XOR, full adder, and digit recognition problems, which are typical pattern recognition/completion problems.


Korean continuous digit speech recognition by multilayer perceptron using KL transformation (KL 변환을 이용한 multilayer perceptron에 의한 한국어 연속 숫자음 인식)

  • 박정선;권장우;권정상;이응혁;홍승홍
    • Journal of the Korean Institute of Telematics and Electronics B
    • /
    • v.33B no.8
    • /
    • pp.105-113
    • /
    • 1996
  • In this paper, a new Korean digit speech recognition technique using a multilayer perceptron (MLP) is proposed. Despite its weakness in recognizing dynamic signals, the MLP was adopted for this model because Korean syllables can provide static features. The MLP was chosen for the proposed system because its structure is simple and its computation is fast. The MLP's input vectors are transformed using the Karhunen-Loève transformation (KLT), which compresses the signal successfully without losing its separability, although its physical properties are changed. Because the proposed technique can extract static features without being affected by changes in syllable length, it is effective for Korean numeral recognition systems. Using the KLT saves computation time and memory without decreasing classification rates. The proposed feature extraction technique extracts the same number of features from the same two parts of each syllable, its front and its end. It forms the frames from which features are extracted using windows of a single fixed size. It could be applied to continuous speech recognition, which has not been easy for conventional neural network recognition systems.
