• Title/Summary/Keyword: Korean consonants

Phoneme Segmentation based on Volatility and Bulk Indicators in Korean Speech Recognition (한국어 음성 인식에서 변동성과 벌크 지표에 기반한 음소 경계 검출)

  • Lee, Jae Won
    • KIISE Transactions on Computing Practices
    • /
    • v.21 no.10
    • /
    • pp.631-638
    • /
    • 2015
  • Today, the demand for speech recognition systems in mobile environments is increasing rapidly. This paper proposes a novel method for Korean phoneme segmentation that is applicable to phoneme-based Korean speech recognition systems. First, the input signal is divided into blocks of equal size. The proposed method is based on a volatility indicator calculated for each block of the input speech signal and on bulk indicators calculated for each bulk within a block, where a bulk is a set of adjacent samples sharing the same sign; these two quantities serve as the primitive indicators for phoneme segmentation. Vowels, voiced consonants, and voiceless consonants in the input signal are recognized sequentially, and the boundaries between phonemes are found using three dedicated recognition algorithms that combine the two types of primitive indicators. The experimental results show that the proposed method markedly reduces the error rate of an existing phoneme segmentation method.
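
As a rough illustration of the two primitive indicators described above, the sketch below computes a per-block volatility value and per-bulk magnitudes for a speech signal. The abstract does not give the exact formulas, so the definitions here (mean absolute sample difference, summed bulk magnitude) are plausible stand-ins, not the authors' method.

```python
import numpy as np

def volatility_indicator(block: np.ndarray) -> float:
    """Mean absolute sample-to-sample change in one block (assumed definition)."""
    return float(np.mean(np.abs(np.diff(block))))

def bulk_indicators(block: np.ndarray) -> list:
    """A 'bulk' is a run of adjacent samples sharing the same sign; each bulk
    is summarized here by its summed magnitude (assumed definition)."""
    boundaries = np.flatnonzero(np.diff(np.sign(block)) != 0) + 1
    return [float(np.sum(np.abs(b))) for b in np.split(block, boundaries)]

BLOCK = 256                          # equal-sized blocks, as in the abstract
signal = np.random.randn(16000)      # stand-in for 1 s of 16 kHz speech
for start in range(0, len(signal) - BLOCK + 1, BLOCK):
    block = signal[start:start + BLOCK]
    v, bulks = volatility_indicator(block), bulk_indicators(block)
    # a segmentation pass would combine v and the bulk statistics here
```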

A Method for the Display of Hangul in its Traditional Combined Form (한글문자 모아쓰기 Display의 한방안)

  • 안수길
    • Journal of the Korean Institute of Telematics and Electronics
    • /
    • v.12 no.1
    • /
    • pp.27-33
    • /
    • 1975
  • The required minimum size of the character diode matrix for Korean letters is estimated from the topological complexity of the letter structure. The OR combination of three letter boards (diode matrices) gives all possible Hangul whole letters in their proper traditional combined form with the minimum required discernibility. Two forms of the first consonant (a centre-located form for horizontal vowels and a leftward-displaced form for vertical and composed vowels) are switched by only one bit of the vowel code. The vowel pattern length is likewise modified by the last four bits of the code. A new 15-bit inner code is proposed which permits a considerably small decoding mechanism.
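
The bit-field layout below is a hypothetical reconstruction of how a 15-bit inner code might drive the 1-bit consonant-form switch and the 4-bit vowel-length modification described in the abstract; the paper's actual field assignments may differ.

```python
# Field widths and bit positions below are hypothetical; the paper's actual
# 15-bit inner-code layout is not reproduced in the abstract.
def decode(inner_code: int) -> dict:
    consonant = (inner_code >> 10) & 0x1F    # assumed 5-bit initial consonant
    vowel     = (inner_code >> 5)  & 0x1F    # assumed 5-bit vowel field
    final     =  inner_code        & 0x1F    # assumed 5-bit final consonant
    horizontal = (vowel >> 4) & 1            # the single bit selecting the form
    return {
        "consonant_form": "centre" if horizontal else "left",  # 1-bit switch
        "vowel_length_code": vowel & 0x0F,   # last four bits modify pattern length
        "consonant": consonant,
        "final": final,
    }

print(decode(0b10010_10011_00001))
```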

The Movements of Vocal Folds during Voice Onset Time of Korean Stops

  • Hong, Ki-Hwan;Kim, Hyun-Ki;Yang, Yoon-Soo;Kim, Bum-Kyu;Lee, Sang-Heon
    • Speech Sciences
    • /
    • v.9 no.1
    • /
    • pp.17-26
    • /
    • 2002
  • Voice onset time (VOT) is defined as the time interval from the oral release of a stop consonant to the onset of glottal pulsing in the following vowel. VOT is a temporal characteristic of stop consonants that reflects the complex timing of glottal articulation relative to supraglottal articulation. There have been many reports on efforts to clarify the acoustical and physiological properties that differentiate the three types of Korean stops, including acoustic, fiberscopic, aerodynamic, and electromyographic studies. In the acoustic and fiberscopic studies of stop consonants, the voice onset time and glottal width during the production of stops have been found to be longest and largest in the heavily aspirated type, followed by the slightly aspirated and unaspirated types. The thyroarytenoid and posterior cricoarytenoid muscles were physiologically inter-correlated in differentiating these types of stops. However, a review of the English literature shows that the fine movement of the mucosal edges of the vocal folds during the production of stops has not been well documented. In recent years, a new method for high-speed recording of laryngeal dynamics using a digital recording system has made it possible to observe these movements with fine time resolution. The movements of the vocal fold edges were documented during the period of stop production using a fiberscopic high-speed digital imaging system. By observing the glottal width and the visual vibratory movements of the vocal folds before voice onset, the heavily aspirated stop was found to be more prominent and dynamic than the slightly aspirated and unaspirated stops.
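
Since VOT is the interval from the oral release to the onset of glottal pulsing, it can be estimated programmatically once those two events are located. The sketch below is a minimal illustration using naive energy and autocorrelation heuristics with illustrative thresholds; it is not the measurement procedure used in the study.

```python
import numpy as np

def vot_ms(x: np.ndarray, sr: int, frame: int = 160, hop: int = 80) -> float:
    """Estimate VOT in milliseconds from a waveform containing one stop+vowel."""
    energy, periodicity = [], []
    for i in range(0, len(x) - frame, hop):
        w = x[i:i + frame]
        energy.append(float(np.sum(w ** 2)))
        ac = np.correlate(w, w, mode="full")[frame - 1:]   # lags 0..frame-1
        periodicity.append(float(np.max(ac[20:]) / (ac[0] + 1e-12)))
    energy = np.asarray(energy)
    release = int(np.argmax(energy > 0.1 * energy.max()))  # burst: first energy jump
    voiced = next(i for i in range(release, len(periodicity))
                  if periodicity[i] > 0.5)                 # onset of glottal pulsing
    return (voiced - release) * hop / sr * 1000.0
```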

Nonlinear Interaction between Consonant and Vowel Features in Korean Syllable Perception (한국어 단음절에서 자음과 모음 자질의 비선형적 지각)

  • Bae, Moon-Jung
    • Phonetics and Speech Sciences
    • /
    • v.1 no.4
    • /
    • pp.29-38
    • /
    • 2009
  • This study investigated the interaction between consonants and vowels in Korean syllable perception using a speeded classification task (Garner, 1978). Experiment 1 examined whether listeners analytically perceive the component phonemes in CV monosyllables when classification is based on a component phoneme (a consonant or a vowel), and observed a significant redundancy gain and a Garner interference effect. These results imply that the perception of the component phonemes in a CV syllable is not linear. Experiment 2 examined the relation between consonants and vowels at a subphonemic level, comparing classification times based on glottal features (aspirated and lax), place of articulation features (labial and coronal), and vowel features (front and back). Across all feature classifications there were significant but asymmetric interference effects: glottal feature-based classification showed the least interference, vowel feature-based classification showed moderate interference, and place of articulation feature-based classification showed the most. These results show that glottal features are relatively independent of vowels, whereas place features are more dependent on vowels in syllable perception. To examine the three-way interaction among glottal, place of articulation, and vowel features, Experiment 3 featured a modified Garner task. Its outcome indicated that glottal consonant features are independent of both place of articulation and vowel features, but place of articulation features are dependent on glottal and vowel features. These results were interpreted to show that speech perception is not abstract and discrete but nonlinear, and that the perception of features corresponds to the hierarchical organization of articulatory features suggested in nonlinear phonology (Clements, 1991; Browman and Goldstein, 1989).

The Effect of Stimulus-Response Compatibility on Hangul Transcription Typing Behavior (한글타자 행동에서 자극-반응 합치도 효과)

  • 조양석;황태웅
    • Korean Journal of Cognitive Science
    • /
    • v.5 no.2
    • /
    • pp.25-45
    • /
    • 1994
  • The present study investigated the effect of stimulus-response compatibility (S-R compatibility) on Hangul transcription typing. In this experiment, two conditions were manipulated: a low S-R compatibility condition, in which consonants were typed with the left hand and vowels with the right hand, and a high S-R compatibility condition, in which the hand assignments for consonants and vowels were reversed. Subjects were asked to type the letter presented on the screen as quickly and accurately as possible. It was found that compatibility interacted with vowel shape: in the high S-R compatibility condition, response times were shorter when letters with vertically-shaped vowels were typed than when letters with horizontally-shaped vowels were typed, whereas in the low S-R compatibility condition, response times were shorter for letters with horizontally-shaped vowels than for those with vertically-shaped vowels.

Analysis of Feature Extraction Methods for Distinguishing the Speech of Cleft Palate Patients (구개열 환자 발음 판별을 위한 특징 추출 방법 분석)

  • Kim, Sung Min;Kim, Wooil;Kwon, Tack-Kyun;Sung, Myung-Whun;Sung, Mee Young
    • Journal of KIISE
    • /
    • v.42 no.11
    • /
    • pp.1372-1379
    • /
    • 2015
  • This paper presents an analysis of feature extraction methods used for distinguishing the speech of patients with cleft palates from that of people with normal palates. This research is a basic study toward the development of a software system for automatic recognition and restoration of disordered speech, in pursuit of improving the welfare of speech-disabled persons. Monosyllabic voice data for the experiments were collected for three groups: normal speech, cleft palate speech, and simulated cleft palate speech. The data consist of 14 basic Korean consonants, 5 complex consonants, and 7 vowels. Feature extraction is performed using three well-known methods: LPC, MFCC, and PLP. The pattern recognition process is executed using a Gaussian mixture model (GMM) as the acoustic model. From our experiments, we conclude that the MFCC method is generally the most effective way to identify speech distortions. These results may contribute to the automatic detection and correction of the distorted speech of cleft palate patients, along with the development of a tool for identifying levels of speech distortion.
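
A minimal sketch of the kind of pipeline the abstract describes, using MFCC features scored by one Gaussian mixture model per class. librosa and scikit-learn are stand-ins for whatever toolchain the authors used, and the model sizes are illustrative.

```python
import numpy as np
import librosa
from sklearn.mixture import GaussianMixture

def mfcc_features(path: str) -> np.ndarray:
    """Load a recording and return a (frames x 13) MFCC matrix."""
    y, sr = librosa.load(path, sr=16000)
    return librosa.feature.mfcc(y=y, sr=sr, n_mfcc=13).T

# one GMM per class (normal vs. cleft palate speech); sizes are illustrative
models = {label: GaussianMixture(n_components=8, covariance_type="diag")
          for label in ("normal", "cleft")}

# random frames stand in for real training data here; in practice pool
# mfcc_features(...) over each class's recordings with np.vstack
for gmm in models.values():
    gmm.fit(np.random.randn(500, 13))

def classify(path: str) -> str:
    """Assign the class whose GMM gives the highest average log-likelihood."""
    feats = mfcc_features(path)
    return max(models, key=lambda lbl: models[lbl].score(feats))
```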

Early Linguistic Developments of Simultaneous Bilateral Cochlear Implantees (양이 동시 인공와우 사용자의 조기 언어발달)

  • Suh, Michelle J.;Lee, Hyun-Jin;Choi, Hyun Seung
    • Korean Journal of Otorhinolaryngology-Head and Neck Surgery
    • /
    • v.61 no.12
    • /
    • pp.650-657
    • /
    • 2018
  • Background and Objectives The present study aimed to compare receptive and expressive language development in children who have undergone simultaneous bilateral cochlear implantation (SCI) and those who have undergone bimodal stimulation (unilateral CI + hearing aid). Subjects and Method In a retrospective analysis of clinical data, 15 pediatric patients who had received SCI and nine patients who had received bimodal stimulation (BM group) were enrolled. CI was performed for all patients at 24 months of age. Category of Auditory Performance (CAP) scores, Infant-Toddler Meaningful Auditory Integration Scale (IT-MAIS) scores, and developmental quotients (DQ) for expressive and receptive language were compared between the groups at 12 months of follow-up. The Percentage of Consonants Correct (PCC) of children evaluated at 4 years old was also compared. Results At 12 months of follow-up, significantly greater improvements in CAP scores (Δ4.25±0.5) were noted in the SCI group compared to the BM group (Δ3.56±0.88, p=0.041). Significantly greater improvements in IT-MAIS scores were also noted in the SCI group (Δ36.17±4.09) than in the BM group (Δ30.17±2.91, p=0.004). The DQ of receptive language was higher in the SCI group than in the BM group (87.6±15.4% vs. 75.5±12.0%, p=0.023) at 12 months of follow-up. Moreover, early SCI was associated with better receptive language skills. The PCC index of children at 4 years old was higher in the SCI group than in the BM group (88.5±13.2% vs. 62±15.8%, p=0.014). Earlier SCI was associated with even greater improvements. Conclusion Bilateral SCI is associated with significant improvements in language development when compared with bimodal stimulation. Earlier SCI was associated with better outcomes.

Speech Transition Detection and Approximate-Synthesis Method for Speech Signal Compression and Recovery (음성신호 압축 및 복원을 위한 음성 천이구간 검출과 근사합성 방식)

  • Lee, Kwang-Seok;Kim, Bong-Gi;Kang, Seong-Soo;Kim, Hyun-Deok
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference
    • /
    • 2008.05a
    • /
    • pp.763-767
    • /
    • 2008
  • In a speech coding system that uses separate voiced and unvoiced excitation sources, speech quality is degraded when voiced and unvoiced consonants coexist within a single frame. We therefore propose a method for searching for and extracting a transition segment (TS) containing the unvoiced consonant, so that voiced and unvoiced consonants do not coexist in one frame. This research presents a new method of TS approximate synthesis using least mean squares and frequency band division. As a result, the method obtains high-quality approximate-synthesis waveforms within the TS using frequency information below 0.547 kHz and above 2.813 kHz. Notably, even where the error signal is largest, the approximate-synthesis waveform within the TS shows low distortion. The method can be applied to a new voiced/silence/TS speech coding scheme as well as to speech analysis and speech synthesis.
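
The band-division step can be pictured as follows: filter the transition segment into the two reported bands (below 0.547 kHz and above 2.813 kHz), recombine them, and fit a least-squares gain before measuring the maximum error. The filter order and fitting details are assumptions, not the paper's exact procedure.

```python
import numpy as np
from scipy.signal import butter, sosfilt

def approximate_ts(ts: np.ndarray, sr: int = 8000):
    """Approximate a transition segment from the two reported bands and
    return the approximation and the maximum absolute error."""
    low  = butter(4, 547,  btype="lowpass",  fs=sr, output="sos")
    high = butter(4, 2813, btype="highpass", fs=sr, output="sos")
    approx = sosfilt(low, ts) + sosfilt(high, ts)
    # least-squares gain fit of the approximation to the original segment
    g = float(np.dot(approx, ts) / (np.dot(approx, approx) + 1e-12))
    approx = g * approx
    err = float(np.max(np.abs(ts - approx)))  # maximum error signal
    return approx, err
```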

The Acoustic Changes of Voice after Uvulopalatopharyngoplasty (구개인두성형술 후 음성의 음향학적 변화)

  • Hong, K.H.;Kim, S.W.;Yoon, H.W.;Cho, Y.S.;Moon, S.H.;Lee, S.H.
    • Speech Sciences
    • /
    • v.8 no.2
    • /
    • pp.23-37
    • /
    • 2001
  • The primary sound produced by the vibration of the vocal folds reaches the velopharyngeal isthmus and is directed both nasally and orally. The proportion of each component is determined by the anatomical and functional status of the soft palate. Oral sounds are composed of oral vowels and consonants according to the status of the vocal tract, tongue, palate, and lips. Nasal sounds are composed of nasal consonants and nasal vowels and are further modified according to the status of the nasal airway, so anatomical abnormalities in the nasal cavity will influence nasal sound. The measurement of the nasal sounds of speech has relied on subjective scoring by listeners. Nasal sounds are described in terms of nasality and nasalization: nasality has generally been assessed perceptually when evaluating the effect of maxillofacial procedures for cleft palate, sleep apnea, snoring, and nasal disorders, whereas nasalization is considered an acoustic phenomenon. Snoring and sleep apnea are typical disorders caused by a redundant velopharynx. Sleep apnea is defined as a cessation of breathing for at least 10 seconds during sleep. Several medical and surgical methods for treating sleep apnea have been attempted. Uvulopalatopharyngoplasty (UPPP) involves removal of 1.0 to 3.0 cm of soft palate tissue along with redundant oropharyngeal mucosa and lateral tissue from the anterior and sometimes posterior faucial pillars. The procedure results in a shortened soft palate, and a possible risk following this surgery is velopharyngeal malfunction due to the shortened palate. Few researchers have systematically studied the effects of this surgery on speech production. Some changes in voice quality, such as resonance (nasality), articulation, and phonation, have been reported. In view of the conflicting reports discussed, some uncertainty remains about the speech status of patients following snoring and sleep apnea surgery. The study was conducted in two phases: 1) acoustic analysis of oral and nasal sounds, and 2) evaluation of nasality.
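
One common objective proxy for nasality is nasalance, the ratio of nasal to total acoustic energy from two simultaneously recorded channels. The abstract does not state that the authors used this exact measure, so the snippet below is purely illustrative.

```python
import numpy as np

def nasalance(nasal: np.ndarray, oral: np.ndarray) -> float:
    """Nasalance = nasal energy / (nasal + oral energy), computed from two
    simultaneously recorded channels (e.g., a dual-chamber microphone)."""
    n = float(np.sqrt(np.mean(nasal ** 2)))
    o = float(np.sqrt(np.mean(oral ** 2)))
    return n / (n + o + 1e-12)
```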

Comparisons of Recognition Rates for the Off-line Handwritten Hangul using Learning Codes based on Neural Network (신경망 학습 코드에 따른 오프라인 필기체 한글 인식률 비교)

  • Kim, Mi-Young;Cho, Yong-Beom
    • Journal of IKEEE
    • /
    • v.2 no.1 s.2
    • /
    • pp.150-159
    • /
    • 1998
  • This paper describes the recognition of off-line handwritten Hangul based on a neural network using a feature extraction method. Features of Hangul are extracted by a 5×5 window method, which is a modification of the 3×3 mask method. These features are coded into binary patterns so that they can be used efficiently as the neural network's inputs. A Hangul character is recognized by its consonant, vertical vowel, and horizontal vowel separately. To verify the recognition rate, three different coding methods were used for the neural networks: the fixed-code method, the learned-code I method, and the learned-code II method. The results showed that the learned-code II method was the best of the three, achieving a 100% recognition rate for vertical vowels, 100% for horizontal vowels, 98.33% for learned consonants, and 93.75% for new consonants.
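
The window-based feature coding can be pictured as follows: each 5×5 window of a binarized character image is reduced to a single bit, producing the binary input pattern for the network. The reduction rule and threshold here are hypothetical, since the paper's exact windowing details are not given in the abstract.

```python
import numpy as np

def window_features(img: np.ndarray, win: int = 5, thresh: int = 3) -> np.ndarray:
    """Reduce each win x win window of a binary character image to one bit:
    whether the window contains at least `thresh` stroke pixels."""
    h, w = img.shape
    bits = [int(img[r:r + win, c:c + win].sum() >= thresh)
            for r in range(0, h - win + 1, win)
            for c in range(0, w - win + 1, win)]
    return np.asarray(bits, dtype=np.uint8)   # binary input pattern

img = (np.random.rand(40, 40) > 0.7).astype(np.uint8)  # stand-in binary image
print(window_features(img))
```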
