• Title/Summary/Keyword: vowel recognition


A Study on Speech Period and Pitch Detection for Continuous Speech Recognition (연속음성인식을 위한 음성구간과 피치검출에 관한 연구)

  • Kim Tai Suk;Chang jong chil
    • Journal of Korea Multimedia Society / v.8 no.1 / pp.56-61 / 2005
  • This paper proposes a method of speech period and pitch detection for continuous speech recognition. The method distinguishes vowels from consonants frame by frame in continuous speech, to make the speech segments distinguishable. The speech period is extracted robustly in real noise environments by applying an energy threshold to the input signal. While extracting the speech period, the algorithm also uses the zero crossing rate and short-time energy to distinguish vowels from consonants.

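
The frame-level decision described in the abstract above can be sketched as follows. This is a minimal illustration, not the authors' algorithm: the function names, the frame length, the thresholds `silence_thr` and `zcr_thr`, and the three-way rule (low energy = silence, high zero crossing rate = consonant-like, otherwise vowel-like) are all assumptions chosen for the example.

```python
import math

def frame_signal(samples, frame_len, hop):
    """Split samples into frames of frame_len, advancing by hop (partial tail dropped)."""
    return [samples[i:i + frame_len]
            for i in range(0, len(samples) - frame_len + 1, hop)]

def short_time_energy(frame):
    """Mean squared amplitude of one frame."""
    return sum(x * x for x in frame) / len(frame)

def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))
    return crossings / (len(frame) - 1)

def label_frames(samples, frame_len=160, hop=80, silence_thr=1e-4, zcr_thr=0.25):
    """Label each frame as silence, consonant-like, or vowel-like.

    The thresholds are illustrative assumptions, not values from the paper.
    """
    labels = []
    for frame in frame_signal(samples, frame_len, hop):
        if short_time_energy(frame) < silence_thr:
            labels.append("silence")      # energy below the noise floor
        elif zero_crossing_rate(frame) >= zcr_thr:
            labels.append("consonant")    # noisy, high-frequency content
        else:
            labels.append("vowel")        # energetic, low zero crossing rate
    return labels

def speech_period(labels):
    """Return (first, last) indices of non-silent frames, or None if all are silent."""
    active = [i for i, lab in enumerate(labels) if lab != "silence"]
    return (active[0], active[-1]) if active else None
```

In practice the thresholds would have to be tuned to the recording conditions, for example from an estimate of the background noise energy.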

Vowel Fundamental Frequency in Manner Differentiation of Korean Stops and Affricates

  • Jang, Tae-Yeoub
    • Speech Sciences / v.7 no.1 / pp.217-232 / 2000
  • In this study, I investigate the role of post-consonantal fundamental frequency (F0) as a cue for automatic distinction of types of Korean stops and affricates. Rather than examining data obtained by restricting contexts to a minimum to prevent the interference of irrelevant factors, a relatively natural speaker independent speech corpus is analysed. Automatic and statistical approaches are adopted to annotate data, to minimise speaker variability, and to evaluate the results. In spite of possible loss of information during those automatic analyses, statistics obtained suggest that vowel F0 is a useful cue for distinguishing manners of articulation of Korean non-continuant obstruents having the same place of articulation, especially of lax and aspirated stops and affricates. On the basis of the statistics, automatic classification is attempted over the relevant consonants in a specific context where the micro-prosodic effects appear to be maximised. The results confirm the usefulness of this effect in application for Korean phone recognition.


Speech Production and Perception of Word-medial Singleton and Geminate Sonorants in Korean (한국어 어중 공명 중첩자음과 단자음의 조음 및 지각)

  • Kim, Taekyung
    • Phonetics and Speech Sciences / v.5 no.4 / pp.145-155 / 2013
  • This study investigated the articulatory characteristics of Korean singleton and geminate sonorants in word-medial position, the effects of the duration of the sonorant consonant and the preceding vowel on perception, and the difference between native Korean speakers and foreign learners of Korean in perceiving the singleton/geminate consonant contrast. The Korean sonorant consonants (/m, n, l/) were examined in VCCV and VCV sequences through speech production and perception experiments. The results suggest that the duration of the sonorant consonant is the most important factor for native Korean speakers in recognizing whether a sonorant is geminate, and that the duration of the preceding vowel and other factors affect recognition of the singleton/geminate contrast when the consonant duration is not decisive. A perception experiment showed that Chinese learners of Korean did not clearly distinguish singleton consonants from geminate consonants. The results of this study provide basic data for recognition of the singleton/geminate consonant contrast in word-medial position in Korean, and can be utilized for teaching Korean pronunciation as a foreign language.

A License Plate Recognition Algorithm using Multi-Stage Neural Network for Automobile Black-Box Image (다단계 신경 회로망을 이용한 블랙박스 영상용 차량 번호판 인식 알고리즘)

  • Kim, Jin-young;Heo, Seo-weon;Lim, Jong-tae
    • Journal of the Korea Institute of Information and Communication Engineering / v.22 no.1 / pp.40-48 / 2018
  • This paper proposes a license-plate recognition algorithm for automobile black-box images, which are obtained from a camera moving with the automobile. The algorithm aims to increase the overall license-plate recognition rate by increasing the Korean character recognition rate, using a multi-stage neural network for black-box images in which there is considerable camera movement and variation in light intensity. The proposed algorithm recognizes the vowel and the consonant of each Korean character on the license plate separately. First, the first-stage neural network recognizes the vowels, and the recognized vowels are classified as vertical vowels ('ㅏ', 'ㅓ') or horizontal vowels ('ㅗ', 'ㅜ'). Then the consonant is classified by the second-stage neural network for the corresponding vowel group. A simulation of license-plate recognition is performed on images obtained from a real black-box system, and the results show that the proposed algorithm provides a higher recognition rate than existing neural-network-based algorithms.
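
The two-stage structure described in the abstract above can be sketched as a simple dispatch. This is only a structural illustration under stated assumptions: `recognize_character`, `vowel_classifier`, and `consonant_classifiers` are hypothetical names, and the classifiers are passed in as plain callables standing in for the trained neural networks, which the abstract does not specify.

```python
# Vowel grouping taken from the abstract: vertical vs. horizontal vowels.
VOWEL_GROUPS = {
    "ㅏ": "vertical",
    "ㅓ": "vertical",
    "ㅗ": "horizontal",
    "ㅜ": "horizontal",
}

def recognize_character(image, vowel_classifier, consonant_classifiers):
    """Recognize one Korean character in two stages.

    vowel_classifier: callable(image) -> vowel (hypothetical stage-1 network).
    consonant_classifiers: dict mapping a vowel group name to a
        callable(image) -> consonant (hypothetical stage-2 networks).
    Returns a (consonant, vowel) tuple.
    """
    vowel = vowel_classifier(image)                  # stage 1: vowel
    group = VOWEL_GROUPS[vowel]                      # vertical or horizontal
    consonant = consonant_classifiers[group](image)  # stage 2: group-specific
    return consonant, vowel
```

Routing by vowel group lets each second-stage classifier specialize in one character layout, which is the motivation the abstract gives for the multi-stage design.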

Isolated Words Recognition using Correlation VQ-HMM (상관성있는 VQ-HMM을 이용한 고립 단어 인식)

  • 이진수
    • Proceedings of the Acoustical Society of Korea Conference / 1993.06a / pp.109-112 / 1993
  • In this paper, we propose a modified VQ that applies the correlation between codewords in order to reduce the error rate caused by individual differences and speakers' temporal variation. The modified VQ is used in the preprocessing stage of the HMM, and the temporal variation is absorbed by nonlinear decimation and interpolation of the vowel part, which yields a higher recognition rate than the unmodified approach. The experimental vocabulary consists of 142 Korean DDD regional names, and we show that the proposed method increases the recognition rate.


Development of an Algorithm for Korean Letter Recognition using Letter Component Analysis (조합형 문자구성을 이용한 문서 인식 알고리즘)

  • 김영재;이호재;김희식
    • Proceedings of the Korean Society of Precision Engineering Conference / 1995.10a / pp.427-430 / 1995
  • This paper proposes a new image processing algorithm for recognizing Korean documents. It extracts the syllable region from the input character image and then recognizes the consonant and the vowel in the character. Precise segmentation is very important for recognizing the input character. The input image has 8-bit grayscale resolution. Both the shape and the vertical and horizontal line dispersion graphs are used for segmentation. The results show higher character segmentation accuracy.


Speech Recognition for Vowel Detection using by Cepstrum Coefficients (켑스트럼 계수에 의한 모음검출을 위한 음성인식)

  • Choi, Jae-Seung
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference / 2011.10a / pp.613-615 / 2011
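  • This paper proposes an algorithm that performs speech recognition using cepstrum coefficients. The proposed method separates human speech into cepstrum coefficients of two regions and then performs speech recognition with a neural network. The network is trained for a fixed period until the error has almost vanished. The paper then proposes a speech recognition system for vowel detection that can classify each speech interval when new speech, different from the training data, is input to the network.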

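
As a rough illustration of the cepstrum features mentioned in the abstract above, the real cepstrum of a frame can be computed as the inverse DFT of the log magnitude spectrum. This is a sketch under assumptions: the abstract does not say how the two regions are defined, so the split into a low-quefrency part and a high-quefrency part at a hypothetical `cutoff` index is a guess, and the naive O(N²) DFT is used only to keep the example dependency-free.

```python
import cmath
import math

def dft(x):
    """Naive discrete Fourier transform of a real or complex sequence."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]

def real_cepstrum(frame, eps=1e-12):
    """Real cepstrum: inverse DFT of the log magnitude spectrum.

    eps avoids log(0) for spectral bins with zero magnitude.
    """
    log_mag = [math.log(abs(c) + eps) for c in dft(frame)]
    n = len(log_mag)
    # The log magnitude of a real signal is real and even, so the inverse
    # transform is real; the imaginary part is rounding error and is dropped.
    return [sum(log_mag[k] * cmath.exp(2j * math.pi * k * t / n)
                for k in range(n)).real / n
            for t in range(n)]

def split_cepstrum(cepstrum, cutoff):
    """Split the first half of the cepstrum into two regions (assumed split).

    Only the first half is used because the real cepstrum is symmetric.
    """
    half = len(cepstrum) // 2
    return cepstrum[:cutoff], cepstrum[cutoff:half]
```

The two returned lists could then serve as separate input vectors to a classifier; a production system would normally use an FFT routine rather than the naive transform shown here.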

ACOUSTIC FEATURES DIFFERENTIATING KOREAN MEDIAL LAX AND TENSE STOPS

  • Shin, Ji-Hye
    • Proceedings of the KSPS conference / 1996.10a / pp.53-69 / 1996
  • Much research has been done on the cues differentiating the three Korean stops in word-initial position. This paper focuses on a more neglected area: the acoustic cues differentiating the medial tense and lax unaspirated stops. Eight adult native Korean speakers, four males and four females, pronounced sixteen minimal pairs containing the two series of medial stops with different preceding vowel qualities. The average duration of vowels before lax stops is 31 msec longer than before their tense counterparts (70 msec for lax vs 39 msec for tense). In addition, the average duration of the stop closure of tense stops is 135 msec longer than that of lax stops (69 msec for lax vs 204 msec for tense). These durational differences are so large that they may be phonologically determined, not phonetically. Moreover, vowel duration varies with the speaker's sex. Female speakers have 5 msec shorter vowel duration before both stops. The quality of voicing, tense or lax, is also a cue to these two stop types, as it is in initial position, but the relative duration of the stops appears to be a much more important cue. The duration of the stop changes stop perception, while that of the preceding vowel does not. The consequences of these results for the phonological description of Korean, as well as for the synthesis and automatic recognition of Korean, will be discussed.


Vowel Classification of Imagined Speech in an Electroencephalogram using the Deep Belief Network (Deep Belief Network를 이용한 뇌파의 음성 상상 모음 분류)

  • Lee, Tae-Ju;Sim, Kwee-Bo
    • Journal of Institute of Control, Robotics and Systems / v.21 no.1 / pp.59-64 / 2015
  • In this paper, we demonstrate the usefulness of the deep belief network (DBN) in the field of brain-computer interfaces (BCI), especially in relation to imagined speech. In recent years, growing interest in the BCI field has led to the development of a number of useful applications, such as robot control, game interfaces, and exoskeleton limbs. However, while imagined speech, which could be used for communication or military devices, is one of the most exciting BCI applications, there are some problems in implementing such a system. In a previous paper, we handled some of the issues of imagined speech using the International Phonetic Alphabet (IPA), although that work needed to be complemented for multi-class classification problems. In view of this, the present paper offers a suitable solution for vowel classification of imagined speech. We used the DBN algorithm, a deep learning algorithm, for multi-class vowel classification, and selected four vowel pronunciations from the IPA: /a/, /i/, /o/, /u/. For the experiment, we obtained 32-channel raw electroencephalogram (EEG) data from three male subjects, with electrodes placed on the scalp over the frontal lobe and both temporal lobes, which are related to thinking and verbal function. Eigenvalues of the covariance matrix of the EEG data were used as the feature vector of each vowel. In the analysis, we also report the classification results of a back propagation artificial neural network (BP-ANN) for comparison with the DBN. The classification accuracy was 52.04% for the BP-ANN and 87.96% for the DBN, so the DBN performed 35.92 percentage points better in multi-class imagined speech classification. In addition, the DBN required much less total computation time. In conclusion, the DBN algorithm is efficient for BCI system implementation.

Synthesis of Multiplexed MACE Filter for Optical Korean Character Recognition (인쇄체 한글의 광학적 인식을 위한 다중 MACE 필터의 합성)

  • 김정우;김철수;배장근;도양회;김수중
    • The Journal of Korean Institute of Communications and Information Sciences / v.19 no.12 / pp.2364-2375 / 1994
  • For the efficient recognition of printed Korean characters, a multiplexed minimum average correlation energy (MMACE) filter is proposed. The proposed method overcomes the disadvantages of the tree-structure algorithm, whose recognition system is very large and whose recognition method is complicated. Using only one consonant MMACE filter and one vowel MMACE filter, we recognized full Korean characters. Each MMACE filter multiplexes 4 K-tuple MACE filters, which are synthesized from 24 consonants and vowels. Hence the proposed MMACE filter and the correlation distribution plane are each divided into 4 subregions. We obtained binary codes for Korean character recognition from each correlation distribution subplane, and the obtained codes are compared by computer with the truth table for consonants and vowels. Full Korean characters can be recognized by substituting the consonant or vowel font corresponding to the matched code at the correlation peak location in the output correlation plane. Computer simulation and optical experiment results show that the proposed compact Korean character recognition system using the MMACE filters has high discrimination capability.
