• Title/Summary/Keyword: vowel recognition

Selective Adaptation of Speaker Characteristics within a Subcluster Neural Network

  • Haskey, S.J.;Datta, S.
    • Proceedings of the KSPS conference
    • /
    • 1996.10a
    • /
    • pp.464-467
    • /
    • 1996
  • This paper aims to exploit inter- and intra-speaker phoneme sub-class variations as criteria for adaptation in a phoneme recognition system based on a novel neural network architecture. The system uses a subcluster neural network design based on One-Class-in-One-Network (OCON) feed-forward subnets, similar to those proposed by Kung (2) and Jou (1), joined by a common front-end layer. The idea is to adapt only the neurons within the common front-end layer of the network, so that the adaptation concentrates primarily on the speaker's vocal characteristics. Since the adaptation occurs in an area common to all classes, convergence on a single class will improve the recognition of the remaining classes in the network. Results show that adaptation towards a phoneme in the vowel sub-class, for speakers MDABO and MWBTO, improves the recognition of the remaining vowel sub-class phonemes from the same speaker.

Phonological Process and Word Recognition in Continuous Speech: Evidence from Coda-neutralization (음운 현상과 연속 발화에서의 단어 인지 - 종성중화 작용을 중심으로)

  • Kim, Sun-Mi;Nam, Ki-Chun
    • Phonetics and Speech Sciences
    • /
    • v.2 no.2
    • /
    • pp.17-25
    • /
    • 2010
  • This study explores whether Koreans exploit their native coda-neutralization process when recognizing words in Korean continuous speech. According to the phonological rules of Korean, the coda-neutralization process must come before the liaison process whenever the latter occurs between words. As a result, liaison consonants are coda-neutralized ones such as /b/, /d/, or /g/, rather than non-neutralized ones like /p/, /t/, /k/, /ʧ/, /ʤ/, or /s/. Consequently, if Korean listeners use their native coda-neutralization rules when processing speech input, word recognition will be hampered when non-neutralized consonants precede vowel-initial targets. Word-spotting and word-monitoring tasks were conducted in Experiments 1 and 2, respectively. In both experiments, listeners recognized words faster and more accurately when vowel-initial target words were preceded by coda-neutralized consonants than when they were preceded by non-neutralized ones. The results show that Korean listeners exploit the coda-neutralization process when processing their native spoken language.

Analyzing vowel variation in Korean dialects using phone recognition

  • Jooyoung Lee;Sunhee Kim;Minhwa Chung
    • Phonetics and Speech Sciences
    • /
    • v.15 no.4
    • /
    • pp.101-107
    • /
    • 2023
  • This study proposes an automatic method of detecting vowel variation in the Korean dialects of Gyeong-sang and Jeol-la. The method is based on error patterns extracted using phone recognition. Canonical and recognized phone sequences are compared, and statistical analyses distinguish the vowel variations appearing in both dialects (the dialect-common vowels) from the vowels with high mismatch rates in each dialect. The dialect-common vowels show monophthongization of diphthongs. The variations unique to each dialect are /we/ to [e] and /ʌ/ to [ɰ] for the Gyeong-sang dialect, and /ɰi/ to [ɯ] for the Jeol-la dialect. These results corroborate previous dialectology reports on the phonetic realization of the Korean dialects. The method opens the possibility of explaining dialect patterns automatically.
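
The comparison of canonical and recognized phone sequences described in this abstract can be illustrated with a minimal sketch. It assumes that each utterance is available as a pair of phone lists and uses Python's standard `difflib.SequenceMatcher` as a stand-in aligner; the abstract does not say which alignment or statistical procedure the authors used, and the function names and toy phone labels are hypothetical.

```python
from collections import Counter
from difflib import SequenceMatcher


def substitution_counts(pairs):
    """Count (canonical, recognized) phone substitutions over sequence pairs."""
    counts = Counter()
    for canonical, recognized in pairs:
        matcher = SequenceMatcher(a=canonical, b=recognized, autojunk=False)
        for tag, i1, i2, j1, j2 in matcher.get_opcodes():
            # Only equal-length 'replace' spans are treated as one-to-one
            # substitutions; insertions and deletions are ignored here.
            if tag == "replace" and (i2 - i1) == (j2 - j1):
                counts.update(zip(canonical[i1:i2], recognized[j1:j2]))
    return counts


def mismatch_rates(pairs):
    """Fraction of occurrences of each canonical phone that were substituted."""
    totals = Counter(p for canonical, _ in pairs for p in canonical)
    substituted = Counter()
    for (canonical_phone, _), n in substitution_counts(pairs).items():
        substituted[canonical_phone] += n
    return {p: substituted[p] / totals[p] for p in totals}


# Toy data (hypothetical): canonical /we/ recognized as [e] in two utterances.
pairs = [
    (["k", "we", "n"], ["k", "e", "n"]),
    (["s", "we"], ["s", "e"]),
    (["a", "n"], ["a", "n"]),
]
counts = substitution_counts(pairs)
rates = mismatch_rates(pairs)
```

Phones with a high mismatch rate in one dialect's recordings but not the other's would then be candidates for dialect-specific variation, while patterns frequent in both would be dialect-common.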

Implementation of Korean Vowel 'ㅏ' Recognition based on Common Feature Extraction of Waveform Sequence (파형 시퀀스의 공통 특징 추출 기반 모음 'ㅏ' 인식 구현)

  • Roh, Wonbin;Lee, Jongwoo
    • KIISE Transactions on Computing Practices
    • /
    • v.20 no.11
    • /
    • pp.567-572
    • /
    • 2014
  • In recent years, computing and networking technologies have advanced, and communication equipment has become smaller and more mobile. In addition, the demand for easily operated speech recognition has increased. A phoneme is the smallest unit of sound, and it plays a significant role in speech recognition. However, precise recognition of phonemes faces many obstacles, since each phoneme has many variations in its pronunciation. This paper proposes a simple and efficient method for recognizing the Korean vowel 'ㅏ'. The proposed method is based on common features extracted from 'ㅏ' waveform sequences, and it is simpler than previous, more complex methods. The experimental results indicate that this method achieves more than 90 percent accuracy in recognizing 'ㅏ'.

Speaker Identification Based on Vowel Classification and Vector Quantization (모음 인식과 벡터 양자화를 이용한 화자 인식)

  • Lim, Chang-Heon;Lee, Hwang-Soo;Un, Chong-Kwan
    • The Journal of the Acoustical Society of Korea
    • /
    • v.8 no.4
    • /
    • pp.65-73
    • /
    • 1989
  • In this paper, we propose a text-independent speaker identification algorithm based on VQ (vector quantization) and vowel classification, and its performance is studied and compared with that of a conventional speaker identification algorithm using VQ. The proposed speaker identification algorithm is composed of three processes: vowel segmentation, vowel recognition, and average distortion calculation. The vowel segmentation is performed automatically using RMS energy, BTR (Back-to-Total cavity volume Ratio), and SFBR (Signed Front-to-Back maximum area Ratio) extracted from the input speech signal. If the input speech signal is noisy, particularly when the SNR is around 20 dB, the proposed speaker identification algorithm performs better than the reference speaker identification algorithm when the correct vowel segmentation is done. The same result is obtained when a noisy telephone speech signal is used as input.
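
The average distortion calculation named in this abstract can be sketched for the conventional VQ case: each speaker is represented by a codebook, and the test utterance is assigned to the speaker whose codebook yields the lowest mean nearest-codeword distance. This is a minimal illustration under assumptions, not the authors' algorithm. The Euclidean distance, the function names, and the toy two-dimensional feature vectors are all hypothetical, and the vowel segmentation and vowel recognition stages are not shown.

```python
import math


def average_distortion(frames, codebook):
    """Mean Euclidean distance from each frame to its nearest codeword."""
    total = 0.0
    for frame in frames:
        total += min(math.dist(frame, codeword) for codeword in codebook)
    return total / len(frames)


def identify_speaker(frames, codebooks):
    """Return the speaker whose codebook gives the lowest average distortion."""
    return min(
        codebooks,
        key=lambda speaker: average_distortion(frames, codebooks[speaker]),
    )


# Toy codebooks and test frames (hypothetical 2-D feature vectors).
codebooks = {
    "A": [(0.0, 0.0), (1.0, 1.0)],
    "B": [(5.0, 5.0), (6.0, 6.0)],
}
frames = [(0.1, 0.0), (0.9, 1.1)]
best = identify_speaker(frames, codebooks)
```

One way to bring vowel classification into this scheme would be to keep a separate codebook per speaker and vowel and to score each segment only against the codebooks for its recognized vowel; whether the paper does exactly this is not stated in the abstract.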

A Study on VCCV Segmentation in Unrestricted Word Recognition System (무제한 단어인식 시스템을 위한 VCCV분할에 관한 연구)

  • Youn Jeh-Seon;Chung Kwang-Woo;Hong Kwang-Seok
    • Proceedings of the Acoustical Society of Korea Conference
    • /
    • spring
    • /
    • pp.103-106
    • /
    • 2000
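  • To implement an unrestricted recognition system, several problems must all be solved: choosing appropriate recognition units, securing a training database, segmenting the recognition units, and designing the recognition algorithm. In this paper, we therefore propose CV (Consonant-Vowel), VC (Vowel-Consonant), and VCCV (Vowel-Consonant-Consonant-Vowel) units, segmented by detecting the stable region of vowels, as the basic recognition units of an unrestricted speech recognition system, together with segmentation parameters, and we aim to confirm their validity through segmentation experiments.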

A Study On The Text Recognition Using Artificial Intelligence Technique (인공지능 기법을 이용한 텍스트 인식에 관한 연구)

  • 이행세;최태영;김영길;김정우
    • Journal of the Korean Institute of Telematics and Electronics
    • /
    • v.26 no.11
    • /
    • pp.1782-1793
    • /
    • 1989
  • Stroke crossing number, a syntactic pattern recognition procedure, a top-down recognition structure, and a heuristic approach are studied for Korean text recognition. We propose new algorithms: 1) Korean vowel separation using a limited scanning method in Korean characters, 2) stroke extraction using a stroke width method, 3) stroke crossing number and its properties, 4) average, standard deviation, and mode of stroke crossing number, and 5) classification and recognition methods for a limited set of Chinese characters. These are studied with computer simulations and experiments.

A Study on Phoneme Likely Units to Improve the Performance of Context-dependent Acoustic Models in Speech Recognition (음성인식에서 문맥의존 음향모델의 성능향상을 위한 유사음소단위에 관한 연구)

  • 임영춘;오세진;김광동;노덕규;송민규;정현열
    • The Journal of the Acoustical Society of Korea
    • /
    • v.22 no.5
    • /
    • pp.388-402
    • /
    • 2003
  • In this paper, we carried out word, 4-continuous-digit, continuous speech, and task-independent word recognition experiments to verify the effectiveness of re-defined phoneme-likely units (PLUs) for phonetic decision tree based HM-Net (Hidden Markov Network) context-dependent (CD) acoustic modeling in Korean. In the 48 PLUs, the phonemes /ㅂ/, /ㄷ/, /ㄱ/ are separated by initial sound, medial vowel, and final consonant, and the consonants /ㄹ/, /ㅈ/, /ㅎ/ are also separated by initial sound and final consonant, according to their position in the syllable, word, and sentence, respectively. We therefore re-define 39 PLUs by unifying into one phoneme the separated initial sound, medial vowel, and final consonant of the 48 PLUs, in order to construct the CD acoustic models effectively. In word recognition experiments with context-independent (CI) acoustic models, the 48 PLUs give on average 7.06% higher recognition accuracy than the 39 PLUs. However, in speaker-independent word recognition experiments with the CD acoustic models, the 39 PLUs give on average 0.61% better recognition accuracy than the 48 PLUs. In the 4-continuous-digit recognition experiments with liaison phenomena, the 39 PLUs also give on average 6.55% higher recognition accuracy. In continuous speech recognition experiments, the 39 PLUs likewise give on average 15.08% better recognition accuracy than the 48 PLUs. Finally, although both the 48 and the 39 PLUs have lower recognition accuracy in the task-independent word recognition experiments, owing to unknown contextual factors, the 39 PLUs give on average 1.17% higher recognition accuracy than the 48 PLUs. Through these experiments, we verified the effectiveness of the re-defined 39 PLUs compared to the 48 PLUs for constructing the CD acoustic models.

Implementation of Real-time Vowel Recognition Mouse based on Smartphone (스마트폰 기반의 실시간 모음 인식 마우스 구현)

  • Jang, Taeung;Kim, Hyeonyong;Kim, Byeongman;Chung, Hae
    • KIISE Transactions on Computing Practices
    • /
    • v.21 no.8
    • /
    • pp.531-536
    • /
    • 2015
  • Speech recognition is an active research area in the human-computer interface (HCI) field. The objective of this study is to control digital devices by voice. The mouse is a widely used computer peripheral in graphical user interface (GUI) computing environments. In this paper, we propose a method of controlling the mouse with the real-time speech recognition function of a smartphone. The processing steps are: extracting the core voice signal after receiving a voice input of suitable length in real time; extracting features as mel-frequency cepstral coefficients (MFCC) and quantizing them with the learned codebook; and finally recognizing the corresponding vowel using a hidden Markov model (HMM). A virtual mouse is then operated by mapping each vowel to a mouse command. Finally, we demonstrate various mouse operations on a desktop PC display with the implemented smartphone application.
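
The last two stages described in this abstract, recognizing a vowel from quantized symbols with an HMM and mapping the vowel to a mouse command, can be sketched as follows. This is an illustrative sketch under assumptions rather than the authors' implementation: it takes the codebook indices as already computed, scores one discrete HMM per vowel with the standard forward algorithm, and uses hypothetical model parameters, vowel labels, and command names.

```python
def forward_likelihood(obs, start, trans, emit):
    """Probability of a discrete observation sequence under one HMM
    (forward algorithm, no scaling, so suitable for short sequences only)."""
    n_states = len(start)
    alpha = [start[s] * emit[s][obs[0]] for s in range(n_states)]
    for symbol in obs[1:]:
        alpha = [
            sum(alpha[p] * trans[p][s] for p in range(n_states)) * emit[s][symbol]
            for s in range(n_states)
        ]
    return sum(alpha)


def recognize_vowel(obs, models):
    """Return the vowel whose HMM assigns the highest likelihood to obs."""
    return max(models, key=lambda vowel: forward_likelihood(obs, *models[vowel]))


# Hypothetical 2-state models over a 2-symbol codebook: (start, trans, emit).
TRANS = [[0.5, 0.5], [0.0, 1.0]]
MODELS = {
    "a": ([1.0, 0.0], TRANS, [[0.9, 0.1], [0.8, 0.2]]),
    "i": ([1.0, 0.0], TRANS, [[0.1, 0.9], [0.2, 0.8]]),
}

# Hypothetical vowel-to-command mapping for the virtual mouse.
MOUSE_COMMANDS = {"a": "move_left", "i": "move_right"}

command = MOUSE_COMMANDS[recognize_vowel([0, 0, 0], MODELS)]
```

For longer observation sequences, a practical system would typically work with log probabilities or scaled forward variables to avoid numerical underflow.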

CROSS-LANGUAGE SPEECH PERCEPTION BY KOREAN AND POLISH.

  • Paradowska, Anna
    • Proceedings of the KSPS conference
    • /
    • 2000.07a
    • /
    • pp.178-178
    • /
    • 2000
  • This paper is concerned with adults' foreign language acquisition and investigates the relationship between the phonetic system of the mother tongue (L1) and the perception of the foreign language (L2), in this paper Polish and Korean. The questions used to define this relationship are 1) how Poles perceive Korean vowels, 2) how Koreans perceive Polish vowels, and 3) how Koreans perceive Korean vowels pronounced by Poles. In order to identify L2 vowels, the listeners try to fit them into the categories of their own language (L1). On the one hand, vowels that are the same in both languages, and those that are articulated where no other vowel is articulated, have the best rate of recognition. For example, /i/ in both languages is a front close vowel, and in both languages there are no other front close vowels. Therefore, the vowels /i/ (and /a/) have the best rate of recognition in all three experiments. On the other hand, vowels that are unfamiliar to the listeners do not seem to have the worst rate of recognition. The vowels that have the worst rate of recognition are those that are similar to, but not quite the same as, those of L1. This research proves that "equivalence classification prevents L2 learners from producing similar L2 phones, but not new L2 phones, authentically" (Flege, 1987). Polish speakers can pronounce unfamiliar L2 vowels "more authentically" than those similar to L1 vowels. However, the difference is not significant, and this subject requires further research (different data, more informants).
