• Title/Summary/Keyword: consonant system


Speech Signal Compression and Recovery Using Transition Detection and Approximate-Synthesis (천이구간 추출 및 근사합성에 의한 음성신호 압축과 복원)

  • Lee, Kwang-Seok;Lee, Byeong-Ro
    • Journal of the Korea Institute of Information and Communication Engineering
    • /
    • v.13 no.2
    • /
    • pp.413-418
    • /
    • 2009
  • In a speech coding system that uses separate voiced and unvoiced excitation sources, speech quality is distorted when voiced and unvoiced consonants coexist within a single frame. We therefore propose a method for searching for and extracting a Transition Segment (TS) containing the unvoiced consonant, so that voiced and unvoiced consonants do not coexist in one frame. This research presents a new method of TS approximate-synthesis using least mean squares (LMS) and frequency band division. As a result, the method obtains high-quality approximate-synthesis waveforms within the TS using frequency information below 0.547 kHz and above 2.813 kHz. Importantly, even the maximum error signal yields a low-distortion approximate-synthesis waveform within the TS. The method can be applied to a new Voiced/Silence/TS speech coder, as well as to speech analysis and speech synthesis.
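
The least-mean-squares idea behind the approximate synthesis can be illustrated with a minimal sketch, assuming a simple adaptive FIR predictor; the filter order, step size, and function name are illustrative and not taken from the paper:

```python
import math

def lms_approximate(signal, order=4, mu=0.05):
    """Approximate `signal` with an LMS-adapted linear predictor.

    Returns the synthesized (predicted) waveform and the total
    squared prediction error. The coefficient vector w is updated
    per sample as w <- w + mu * e * x.
    """
    w = [0.0] * order
    synth = [0.0] * len(signal)
    err_energy = 0.0
    for n in range(order, len(signal)):
        x = signal[n - order:n]                        # past samples
        y = sum(wi * xi for wi, xi in zip(w, x))       # predicted sample
        e = signal[n] - y                              # prediction error
        synth[n] = y
        err_energy += e * e
        w = [wi + mu * e * xi for wi, xi in zip(w, x)] # LMS update
    return synth, err_energy

# Toy example: a slowly varying sinusoid is easy to predict, so the
# residual error energy stays well below the signal energy.
sig = [math.sin(2 * math.pi * 0.02 * n) for n in range(2000)]
synth, err = lms_approximate(sig)
```

The same residual-energy comparison is the natural way to check how well an approximate-synthesis waveform tracks the original segment.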

The Comparison of Aerodynamic Measures in Korean Stop Consonants based on Phonation Types (한국어 파열음의 발성 유형에 따른 공기역학 측정치 비교)

  • Choi, Seong Hee;Choi, Chul-Hee
    • Phonetics and Speech Sciences
    • /
    • v.6 no.4
    • /
    • pp.195-203
    • /
    • 2014
  • The aim of this study was to investigate the effects of phonation type ([±aspirated], [±fortis]) on aerodynamic measures of Korean bilabial stops. Sixty-three healthy young adults (30 males, 33 females) performed the Voicing Efficiency (VOEF) task with the bilabial stop consonants /$p^h$/, /p/, and /p'/ using the Phonatory Aerodynamic System (PAS) Model 6600 (Kay PENTAX Corp., Lincoln Park, NJ). All VOEF measures except pitch range (RANP) were significantly influenced by phonation type (p<.01). For sound pressure, maximum SPL, mean SPL, and mean SPL during voicing were significantly greater for the fortis stop /p'/ than for the aspirated /$p^h$/ and lenis /p/ stops (p<.001). In contrast, mean pitch after the lenis stop was significantly lower than after the aspirated and fortis stops (p<.001). Peak expiratory airflow, target airflow, and expiratory volume (FVC) were significantly lowest for the fortis stop /p'/, which may be associated with its higher aerodynamic resistance, whereas peak air pressure and mean peak air pressure during closure were significantly lower for the lenis stop /p/. Additionally, aerodynamic efficiency (AEFF) was significantly higher for the fortis stop /p'/ than for the lenis stop /p/ and the aspirated stop /$p^h$/ (p<.001). Thus, sound pressure, airflow parameters, and aerodynamic resistance play crucial roles in distinguishing the fortis /p'/ from the lenis /p/ and aspirated stops, while pitch and subglottal air pressure parameters are important aerodynamic characteristics for distinguishing the lenis /p/ from the fortis /p'/ and aspirated /$p^h$/. Therefore, an accurately produced aspirated stop should be elicited when collecting airflow and intraoral pressure data from patients with voice disorders, in order to enhance the reliability, relevance, and validity of aerodynamic measures obtained with the PAS.

Korean first graders' word decoding skills, phonological awareness, rapid automatized naming, and letter knowledge with/without developmental dyslexia (초등 1학년 발달성 난독 아동의 낱말 해독, 음운인식, 빠른 이름대기, 자소 지식)

  • Yang, Yuna;Pae, Soyeong
    • Phonetics and Speech Sciences
    • /
    • v.10 no.2
    • /
    • pp.51-60
    • /
    • 2018
  • This study compares the word decoding skills, phonological awareness (PA), rapid automatized naming (RAN) skills, and letter knowledge of first graders with developmental dyslexia (DD) and typically developing (TD) first graders. Eighteen children with DD and eighteen TD children, matched on nonverbal intelligence and discourse ability, participated. The word decoding task of the Korean language-based reading assessment (Pae et al., 2015) was administered. Phoneme-grapheme correspondent words were analyzed according to whether the word has meaning, whether the syllable has a final consonant, and the position of the grapheme in the syllable. Letter knowledge was assessed by asking the names and sounds of 12 consonants and 6 vowels. The children's PA was tested with word, syllable, body-coda, and phoneme blending tasks. Object and letter RAN was measured in seconds. Difficulty decoding non-words was more noticeable in the DD group than in the TD group. The TD children read graphemes in syllable-initial and syllable-final position with 99% accuracy, whereas children with DD read them with 80% and 82% accuracy, respectively. In addition, the DD group had more difficulty decoding words with two final consonant letters (patchim): they read only 57% of such words correctly, while the TD group read 91% correctly. There were significant group differences in body-coda PA, phoneme-level PA, letter RAN, object RAN, and letter-sound knowledge. This study confirms the existence of Korean developmental dyslexics and the urgent need to include a Korean-specific phonics approach in the education system.

A Study on Processing of Speech Recognition Korean Words (한글 단어의 음성 인식 처리에 관한 연구)

  • Nam, Kihun
    • The Journal of the Convergence on Culture Technology
    • /
    • v.5 no.4
    • /
    • pp.407-412
    • /
    • 2019
  • In this paper, we propose a technique for speech recognition of Korean words. Speech recognition is a technology that converts acoustic signals from sensors such as microphones into words or sentences. Most foreign languages present less difficulty for speech recognition; Korean, by contrast, is composed of syllables with vowels and final consonants, so it is inappropriate to use the letters obtained from the speech recognition system directly. Improving the conventional recognition structure can therefore yield correct word recognition. To solve this problem, a new algorithm was added to the existing speech recognition structure to increase the recognition rate. The word is first preprocessed and the result is tokenized. After combining the results of a Levenshtein distance algorithm and a hashing algorithm, normalized words are output through a consonant comparison algorithm. The final word is compared against a standardized table and output if it exists there; if it does not exist, it is registered in the table. The experimental environment was implemented as a smartphone application. The proposed structure improves the recognition rate by 2% for standard language and by 7% for dialect.
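
Two of the generic components named in this pipeline, Levenshtein distance and consonant-level comparison of Hangul, can be sketched directly; this is a hedged illustration under standard Unicode arithmetic, not the paper's implementation, and the example words are arbitrary:

```python
def levenshtein(a, b):
    """Classic dynamic-programming edit distance between two strings."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                # deletion
                           cur[j - 1] + 1,             # insertion
                           prev[j - 1] + (ca != cb)))  # substitution
        prev = cur
    return prev[-1]

# Initial consonants (choseong) in Unicode order for the Hangul Syllables block.
CHOSEONG = "ㄱㄲㄴㄷㄸㄹㅁㅂㅃㅅㅆㅇㅈㅉㅊㅋㅌㅍㅎ"

def initial_consonants(word):
    """Map each precomposed Hangul syllable to its initial consonant."""
    out = []
    for ch in word:
        code = ord(ch) - 0xAC00
        if 0 <= code < 11172:                  # inside the Hangul Syllables block
            out.append(CHOSEONG[code // 588])  # 588 = 21 medials * 28 finals
        else:
            out.append(ch)                     # pass non-Hangul through unchanged
    return "".join(out)

# Two recognition variants that differ only in a vowel still match
# at the initial-consonant level:
assert levenshtein("사과", "사가") == 1
assert initial_consonants("사과") == initial_consonants("사가") == "ㅅㄱ"
```

A consonant-level match like this can flag word pairs where a recognizer substituted a vowel, which a character-level edit distance alone would treat as an ordinary edit.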

Interpretation of Estoppel Doctrine in the Letter of Credit Transaction : Comparison between UCP 500 and 95 UCC (신용장거래(信用狀去來)에서의 금반언법리(禁反言法理)에 관한 해석(解釋) - UCP 500 제13조, 제14조와 95 UCC 제5-108조의 비교를 중심으로 -)

  • Kim, Young-Hoon
    • THE INTERNATIONAL COMMERCE & LAW REVIEW
    • /
    • v.12
    • /
    • pp.429-460
    • /
    • 1999
  • The letter of credit is quintessentially international. In the absence of an international legal system, a private system based on banking practices has evolved, commanding the adherence of the international letter of credit community and providing the foundation of the reputation of this instrument. To maintain this international system, it is vital that international standard banking practice not be subject to local interpretations that misconstrue or distort it. The UCP is a formulation of international standard banking practice. It is neither positive law nor a "contract term" in any traditional sense, and its interpretation must be consonant with its character as a living repository of international understanding in this field. As a result, the interpretation and application of specific articles of the UCP must be consistent with its evolving character and history and with the principles upon which sound letter of credit practice is predicated. This study focuses in particular on Articles 13 and 14 of UCP500. Article 13(b) of UCP500 stipulates that banks will have a reasonable time, not to exceed seven days, to examine documents and determine whether they comply facially with the terms of the credit. The seven-day provision is not designed as a safe harbor, because the rule requires the issuer to act within a reasonable time. However, given the deletion of the preclusion rule from the document-examination article in UCP500, seven days may evolve into something of a safe harbor, especially for banks that engage in strategic behavior. True, under UCP500 banks are supposed to examine documents within a reasonable time, but UCP500 attaches no consequences to a bank's violation of that duty; the consequences appear only in the next provision. Courts might read the preclusion more broadly than the literal reading mentioned here, or might fashion a common-law preclusion rule that does not require a showing of detriment. Absent that kind of development, the change in the preclusion rule could have adverse effects on the beneficiary. The penalty of strict estoppel, or strict preclusion, under UCP500 and the 95 UCC differs from classic estoppel. The classic estoppel rule requires a beneficiary to show three elements: 1. conduct on the part of the issuer that leads the beneficiary to believe that nonconforming documents do conform; 2. reasonable reliance by the beneficiary; and 3. detriment from that reliance. The strict preclusion rule, by contrast, does not require detrimental reliance. This strict estoppel rule is quite strict, and some see it as a fitting pro-beneficiary rule to counterbalance the usually pro-issuer rule of strict compliance.


Comparative Analysis on Pronunciation Contents in Korean Integrated Textbooks (한국어 통합 교재에 나타난 발음 내용의 비교 분석)

  • Park, Eunha
    • The Journal of the Korea Contents Association
    • /
    • v.18 no.4
    • /
    • pp.268-278
    • /
    • 2018
  • The purpose of this study is to compare and analyze phonetic items such as the phonemic system, phonological rules, and pronunciation descriptions and notations presented in integrated Korean textbooks. Based on the analysis, we point out problems related to pronunciation education and suggest directions for improvement. First, the presentation order of consonants and vowels in the phonological-system sections differed across textbooks. We recommend preparing a standard for consonant and vowel presentation order, one that takes into consideration the specific purpose of the textbook, the learning strategies and goals, and the feasibility of teaching and learning. Second, as with the phonemic systems, the presentation order of phonological rules differed across textbooks. To create a standard order, we must standardize both which phonological rules are presented and the order in which they appear. Furthermore, when describing phonological rules, the content should be expressed in common, essential terms, avoiding jargon as much as possible. Third, in other matters of pronunciation, there were problems such as inadequate pronunciation examples and a lack of exercises. We therefore propose providing sentences or dialogues as pronunciation examples, and linking them to various activities and other language functions for pronunciation practice.

Comparison of Feature Performance in Off-line Hanwritten Korean Alphabet Recognition (오프라인 필기체 한글 자소 인식에 있어서 특징성능의 비교)

  • Ko, Tae-Seog;Kim, Jong-Ryeol;Chung, Kyu-Sik
    • Korean Journal of Cognitive Science
    • /
    • v.7 no.1
    • /
    • pp.57-74
    • /
    • 1996
  • This paper presents a comparison of the recognition performance of features used in recent handwritten Korean character recognition. The research aims to provide a basis for feature selection, in order to improve not only the recognition rate but also the efficiency of the recognition system. For the comparison of feature performance, we analyzed the characteristics of the features and classified them into three types: global features (image transformations), statistical features, and local/topological features. For each type, we selected four or five features that seem well suited to representing the characteristics of the Korean alphabet, and performed recognition experiments on the first consonant, the horizontal vowel, and the vertical vowel of a Korean character, respectively. The classifier used in our experiments is a multi-layer perceptron with one hidden layer, trained with the backpropagation algorithm. The training and test data were taken from 30 sets of PE92. Experimental results show that 1) local/topological features outperform the other two feature types in terms of recognition rate, and 2) the mesh and projection features among the statistical features, the Walsh and DCT features among the global features, and the gradient and concavity features among the local/topological features outperform the others within each type, respectively.
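
Among the statistical features compared here, the projection feature is simple enough to sketch; the toy glyph below is illustrative only, and a real system would compute this over normalized character images:

```python
def projection_features(img):
    """Horizontal and vertical projection profiles of a binary image:
    the count of foreground pixels in each row and in each column,
    concatenated into one feature vector."""
    rows = [sum(r) for r in img]          # horizontal projection
    cols = [sum(c) for c in zip(*img)]    # vertical projection
    return rows + cols

# Toy 4x4 glyph: a single vertical stroke in column 1.
glyph = [
    [0, 1, 0, 0],
    [0, 1, 0, 0],
    [0, 1, 0, 0],
    [0, 1, 0, 0],
]
feat = projection_features(glyph)
```

The vertical profile peaks at the stroke column, which is why projections discriminate well between vertically and horizontally oriented vowel strokes.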


Developing a New Algorithm for Conversational Agent to Detect Recognition Error and Neologism Meaning: Utilizing Korean Syllable-based Word Similarity (대화형 에이전트 인식오류 및 신조어 탐지를 위한 알고리즘 개발: 한글 음절 분리 기반의 단어 유사도 활용)

  • Jung-Won Lee;Il Im
    • Journal of Intelligence and Information Systems
    • /
    • v.29 no.3
    • /
    • pp.267-286
    • /
    • 2023
  • Conversational agents such as AI speakers rely on voice conversation for human-computer interaction, and voice recognition errors often occur in conversational situations. Recognition errors in user utterance records fall into two types. The first is misrecognition, where the agent fails to recognize the user's speech entirely. The second is misinterpretation, where the speech is recognized and a service is provided, but the interpretation differs from the user's intention. Of these, misinterpretation errors require separate detection because they are recorded as successful service interactions. In this study, various text separation methods were applied to detect misinterpretation. For each separation method, the similarity of consecutive utterance pairs was computed using word-embedding and document-embedding techniques, which convert words and documents into vectors. This approach goes beyond simple word-based similarity calculation and explores a new method for detecting misinterpretation errors. Real user utterance records were used to train and develop a detection model based on patterns of misinterpretation-error causes. The results revealed that the most significant gains came from initial consonant extraction for detecting misinterpretation errors caused by unregistered neologisms; comparison with the other separation methods revealed different error types. This study has two main implications. First, for misinterpretation errors that are difficult to detect, the study proposed diverse text separation methods and found a novel method that improved performance remarkably. Second, if applied to conversational agents or voice recognition services that require neologism detection, patterns of errors arising at the voice recognition stage can be specified. The study also proposed and verified that, even for interactions not categorized as errors, services can be provided in line with the user's intended results.
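
The syllable-separation idea underlying these methods can be made concrete with the standard Unicode arithmetic for decomposing precomposed Hangul syllables into jamo; this is a generic sketch, not the paper's exact separation pipeline:

```python
# Jamo tables in Unicode order for the precomposed Hangul Syllables block.
CHOSEONG = "ㄱㄲㄴㄷㄸㄹㅁㅂㅃㅅㅆㅇㅈㅉㅊㅋㅌㅍㅎ"
JUNGSEONG = "ㅏㅐㅑㅒㅓㅔㅕㅖㅗㅘㅙㅚㅛㅜㅝㅞㅟㅠㅡㅢㅣ"
JONGSEONG = [""] + list("ㄱㄲㄳㄴㄵㄶㄷㄹㄺㄻㄼㄽㄾㄿㅀㅁㅂㅄㅅㅆㅇㅈㅊㅋㅌㅍㅎ")

def decompose(word):
    """Split each precomposed Hangul syllable into its initial, medial,
    and (optional) final jamo; non-Hangul characters pass through."""
    jamo = []
    for ch in word:
        code = ord(ch) - 0xAC00
        if 0 <= code < 11172:                       # Hangul Syllables block
            jamo.append(CHOSEONG[code // 588])      # initial consonant
            jamo.append(JUNGSEONG[(code % 588) // 28])  # medial vowel
            final = JONGSEONG[code % 28]            # final consonant, may be empty
            if final:
                jamo.append(final)
        else:
            jamo.append(ch)
    return jamo
```

Jamo sequences decomposed this way can then be fed to an edit-distance or embedding-based similarity measure, giving finer-grained matches than whole-syllable comparison.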

Korean Sentence Generation Using Phoneme-Level LSTM Language Model (한국어 음소 단위 LSTM 언어모델을 이용한 문장 생성)

  • Ahn, SungMahn;Chung, Yeojin;Lee, Jaejoon;Yang, Jiheon
    • Journal of Intelligence and Information Systems
    • /
    • v.23 no.2
    • /
    • pp.71-88
    • /
    • 2017
  • Language models were originally developed for speech recognition and language processing. Using a set of example sentences, a language model predicts the next word or character from sequential input data. N-gram models have been widely used, but they cannot model correlations between input units efficiently, since they are probabilistic models based on the frequency of each unit in the training set. Recently, with the development of deep learning algorithms, recurrent neural network (RNN) and long short-term memory (LSTM) models have been widely used as neural language models (Ahn, 2016; Kim et al., 2016; Lee et al., 2016). These models can reflect dependencies between the objects entered sequentially into the model (Gers and Schmidhuber, 2001; Mikolov et al., 2010; Sundermeyer et al., 2012). To train a neural language model, texts need to be decomposed into words or morphemes. However, since a training set of sentences generally includes a huge number of words or morphemes, the dictionary is very large, which increases model complexity. In addition, word-level or morpheme-level models can generate only vocabulary contained in the training set. Furthermore, for highly morphological languages such as Turkish, Hungarian, Russian, Finnish, or Korean, morpheme analyzers are more likely to introduce errors in the decomposition process (Lankinen et al., 2016). This paper therefore proposes a phoneme-level language model for Korean based on LSTM models. A phoneme, such as a vowel or a consonant, is the smallest unit composing Korean text. We construct language models using three or four LSTM layers. Each model was trained using the stochastic gradient algorithm as well as more advanced optimization algorithms such as Adagrad, RMSprop, Adadelta, Adam, Adamax, and Nadam. A simulation study was conducted on Old Testament texts using the deep learning package Keras with the Theano backend. After preprocessing, the dataset comprised 74 unique characters, including vowels, consonants, and punctuation marks. We then constructed input vectors of 20 consecutive characters, with the following 21st character as output. In total, 1,023,411 input-output pairs were included in the dataset, divided into training, validation, and test sets in a 70:15:15 proportion. All simulations were run on a system equipped with an Intel Xeon CPU (16 cores) and an NVIDIA GeForce GTX 1080 GPU. We compared the loss evaluated on the validation set, the perplexity evaluated on the test set, and the training time of each model. All optimization algorithms except the stochastic gradient algorithm showed similar validation loss and perplexity, clearly superior to those of the stochastic gradient algorithm, which also took the longest to train for both the 3- and 4-layer LSTM models. On average, the 4-layer model took 69% longer to train than the 3-layer model, yet its validation loss and perplexity were not significantly improved and under some conditions became even worse. On the other hand, when comparing the automatically generated sentences, the 4-layer model tended to generate sentences closer to natural language than the 3-layer model. Although the completeness of the generated sentences differed slightly between models, sentence generation performance was quite satisfactory in all simulation conditions: the models generated only legitimate Korean letters, and the use of postpositions and the conjugation of verbs were almost grammatically perfect. The results of this study are expected to be widely used for Korean language processing and speech recognition, which are foundations of artificial intelligence systems.
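
The windowing step described above (20-character inputs, each predicting the 21st character) can be sketched in a few lines; the sample text and function name are illustrative, not taken from the paper's corpus:

```python
def make_windows(text, window=20):
    """Build (input, target) pairs for a character-level language model:
    each input is `window` consecutive characters and the target is the
    character that immediately follows."""
    return [(text[i:i + window], text[i + window])
            for i in range(len(text) - window)]

sample = "in the beginning god created the heaven and the earth"
pairs = make_windows(sample, window=20)
vocab = sorted(set(sample))   # character inventory, analogous to the paper's 74 symbols
first_x, first_y = pairs[0]
```

Each pair would then be one-hot encoded over `vocab` before being fed to the stacked LSTM layers.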