Search | Korea Science

Development of FSN-based Large Vocabulary Continuous Speech Recognition System (FSN 기반의 대어휘 연속음성인식 시스템 개발)

Park, Jeon-Gue;Lee, Yun-Keun
- Proceedings of the KSPS conference / 2007.05a / pp.327-329 / 2007
This paper presents an FSN-based LVCSR system and its application to a speech TV program guide. Unlike the most popular systems, which are based on statistical language models, we used an FSN grammar built on a graph theory-based FSN optimization algorithm and knowledge-based advanced word boundary modeling. For memory and latency efficiency, we implemented dynamic pruning scheduling based on the histogram of active words and their likelihood distribution. We achieved a 10.7% word accuracy improvement with a 57.3% speedup.

Parafoveal Semantic Preview Effect in Reading of Chinese-Korean Bilinguals (글 읽기에서 나타난 중심와주변 의미 미리보기 효과 : 중국어-한국어 이중언어자 대상으로)

Wang, Shang;Choo, Hyeree;Koh, Sungryoung
- Korean Journal of Cognitive Science / v.34 no.4 / pp.315-347 / 2023
This study investigated the semantic preview effect in parafoveal processing, in which a word presented in advance in the parafoveal area ahead of the fixation point benefits word processing in the fovea. Using the boundary technique in eye-tracking experiments, 25 Chinese-Korean bilinguals, whose native language is Chinese, were presented with 96 sentences that contained a mix of Chinese and Korean, where Korean words were associated with Chinese characters semantically. The study aimed to determine whether a semantic preview effect could be extracted in reading. The experimental sentences were divided into four conditions: the same Korean native word condition (e.g., "나라" meaning "country"), the same Korean word with semantic equivalent in Chinese condition (e.g., "국가" meaning "country"), the same Chinese condition with semantic equivalent in Korean (e.g., "国家" meaning "country"), and the unrelated Chinese condition to the target word (e.g., "围裙" meaning "apron"). The results showed a preview effect in both the Korean word and Chinese word conditions, with a larger preview effect in the Chinese word condition than in the Korean word condition.
https://doi.org/10.19066/cogsci.2023.34.4.004

Korean native speakers' perceptive aspects on Korean wh & yes-no questions produced by Chinese Korean learners (중국인학습자들의 한국어 의문사의문문과 부정사의문문에 대한 한국어원어민 화자의 지각양상)

Yune, YoungSook
- Phonetics and Speech Sciences / v.6 no.4 / pp.37-45 / 2014
Korean wh-questions and yes-no questions have the same morphological structure. In speech, however, the two types of questions are distinguished by prosodic differences. In this study, we examined whether Korean native speakers can distinguish wh-questions and yes-no questions produced by Chinese learners of Korean based on the prosodic information contained in the sentences. For this purpose, we performed a perception analysis, and 15 Korean native speakers participated in the perception test. The results show that the two types of interrogative sentences produced by Chinese learners of Korean were not distinguished by constant pitch contours. These results reveal that Chinese learners of Korean cannot match prosodic meaning and prosodic form. The most salient prosodic feature used perceptually by native speakers to discriminate the two types of interrogative sentences is the pitch difference between the F0 peak of the wh-word and the boundary tone.
https://doi.org/10.13064/KSSS.2014.6.4.037

Gradient Reduction of $C_1$ in /pk/ Sequences

Son, Min-Jung
- Speech Sciences / v.15 no.4 / pp.43-60 / 2008
Instrumental studies (e.g., aerodynamic, EPG, and EMMA) have shown that the first of two stops in sequence can sometimes be articulatorily reduced in time and space, either gradiently or categorically. The current EMMA study aims to examine possible factors, both linguistic (e.g., speech rate, word boundary, and prosodic boundary) and paralinguistic (e.g., natural context and repetition), that induce gradient reduction of $C_1$ in /pk/ cluster sequences. EMMA data were collected from five Seoul-Korean speakers. The results show that gradient reduction of lip aperture seldom occurs, being quite restricted both in speaker frequency and in token frequency. The results also suggest that place assimilation is not a lexical process, implying that speakers have not fully developed this process to be phonologized at the abstract level.

ToBI Based Prosodic Representation of the Kyungnam Dialect of Korean

Cho, Yong-Hyung
- Speech Sciences / v.2 / pp.159-172 / 1997
This paper proposes a prosodic representation system for the Kyungnam dialect of Korean, based on the ToBI system. In this system, diverse intonation patterns are transcribed on four parallel tiers: a tone tier, a break index tier, an orthographic tier, and a miscellaneous tier. The tone tier employs pitch accents, phrase accents, and boundary tones marked with diacritics in order to represent various pitch events. The break index tier uses five break indices, numbered from 0 to 4, to represent degrees of connectedness in speech by associating each inter-word position with a break index. Each break index represents a boundary of some kind of constituent. This system can contribute not only to a more detailed theory connecting prosody, syntax, and intonation, but also to current text-to-speech synthesis approaches, speech recognition, and other quantitative computational modeling.

Word Boundary Detection of Voice Signal Using Recurrent Fuzzy Associative Memory (순환 퍼지연상기억장치를 이용한 음성경계 추출)

마창수;김계영;최형일
- Proceedings of the Korean Information Science Society Conference / 2003.04c / pp.235-237 / 2003
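This paper describes speech boundary detection, which locates the target of speech recognition as a preprocessing step for speech recognition. As feature vectors for speech boundary detection, we use RMS, which carries temporal information, and MFBE, which carries frequency information. As the algorithm, we propose a new form of recurrent fuzzy associative memory, in which recurrent nodes are added to a fuzzy associative memory that generates rules through learning, so that the temporal information of speech can be applied.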
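The RMS feature mentioned in the abstract above can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `frame_rms` and the `frame_len` and `hop` parameters are hypothetical, and the abstract does not state the framing parameters that were actually used.

```python
import math
from typing import List, Sequence


def frame_rms(samples: Sequence[float], frame_len: int, hop: int) -> List[float]:
    """Return the root-mean-square energy of each full frame.

    frame_len and hop are illustrative parameters (assumptions),
    not values taken from the paper.
    """
    if frame_len <= 0 or hop <= 0:
        raise ValueError("frame_len and hop must be positive")
    values = []
    # Slide a fixed-length window over the signal; drop a trailing partial frame.
    for start in range(0, len(samples) - frame_len + 1, hop):
        frame = samples[start:start + frame_len]
        mean_square = sum(x * x for x in frame) / frame_len
        values.append(math.sqrt(mean_square))
    return values
```

A boundary detector would typically look for frames where this energy rises above or falls below some level, but how the paper combines RMS with MFBE inside the recurrent fuzzy associative memory is not described in the abstract.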

Focus and Prosodic Structure

Oh, Mi-Ra
- Speech Sciences / v.8 no.1 / pp.21-31 / 2001
The effects of focus on prosodic phrasing, F0, and duration in Korean are investigated, paying attention not only to the target of focus but also to the constituents that are outside the domain of focus. We find that the constituents preceding and following the focused word tend to be dephrased. Contrary to Jun's (1993) claim, dephrasing does not always extend up to the Intonation Phrase boundary. Dephrasing caused by focus determines the F0 and durational differences between focused and neutral sentences. Syntactic constituency is also shown to play a role in prosodic phrasing.

Korean Head-Tail Tokenization and Part-of-Speech Tagging by using Deep Learning (딥러닝을 이용한 한국어 Head-Tail 토큰화 기법과 품사 태깅)

Kim, Jungmin;Kang, Seungshik;Kim, Hyeokman
- IEMEK Journal of Embedded Systems and Applications / v.17 no.4 / pp.199-208 / 2022
Korean is an agglutinative language, in which one or more morphemes are combined to form a single word. A part-of-speech tagging method separates each morpheme from a word and attaches a part-of-speech tag. In this study, we propose a new Korean part-of-speech tagging method based on the Head-Tail tokenization technique, which divides a word into a lexical morpheme part and a grammatical morpheme part without decomposing compound words. In this method, the Head-Tail is divided at the syllable boundary without restoring irregular deformation or abbreviated syllables. A Korean part-of-speech tagger was implemented using Head-Tail tokenization and a deep learning technique. To address the problem that segmented tags produce a large number of complex tags and lower the tagging accuracy, we reduced the number of tags to a complex tag composed of large classification tags, and as a result, we improved the tagging accuracy. The performance of the Head-Tail part-of-speech tagger was evaluated using BERT, syllable bigram, and subword bigram embeddings, and both the syllable bigram and subword bigram embeddings improved performance compared to general BERT. Part-of-speech tagging was performed by integrating the Head-Tail tokenization model and the simplified part-of-speech tagging model, achieving 98.99% word unit accuracy and 99.08% token unit accuracy. The experiment also showed that the performance of part-of-speech tagging improved when the maximum token length was limited to twice the number of words.
https://doi.org/10.14372/IEMEK.2022.17.4.199

The Rule of Korean Pitch Variation for a Natural Synthetic Female Voice (자연스러운 여성 합성음을 위한 한국어의 피치 변화 법칙)

Kim, Chung-Won;Park, Dae-Duck;Kim, Boh-Hyun;Kwon, Cheol-Hong
- The Journal of the Acoustical Society of Korea / v.15 no.6 / pp.26-32 / 1996
In this paper we propose a rule of pitch variation for a natural synthetic female voice. The intonation phrase, which is the basic unit the rule is applied to, mostly consists of a syllable or syllables. The pitch values of the first, second, and final syllables make up the pitch contour of the intonation phrase. Those of the first and second syllables are determined by the initial consonants of the respective syllables, and that of the final syllable by the type of the function word. There are two kinds of boundaries between intonation phrases: a boundary with a pause and a boundary without a pause. The pitch contour of the intonation phrase, together with the boundary phenomena, determines the pitch pattern of a sentence.

Automatic Word Spacing of the Korean Sentences by Using End-to-End Deep Neural Network (종단 간 심층 신경망을 이용한 한국어 문장 자동 띄어쓰기)

Lee, Hyun Young;Kang, Seung Shik
- KIPS Transactions on Software and Data Engineering / v.8 no.11 / pp.441-448 / 2019
Previous research on automatic spacing of Korean sentences has corrected spacing errors by using n-gram based statistical techniques or a morpheme analyzer to insert blanks at word boundaries. In this paper, we propose end-to-end automatic word spacing using a deep neural network. The automatic word spacing problem can be defined as a tag classification problem at the level of the syllable rather than the word. For contextual representation between syllables, a Bi-LSTM encodes the dependency relationship between syllables into a fixed-length vector in a continuous vector space using forward and backward LSTM cells. To perform automatic word spacing of Korean sentences, the fixed-length contextual vector produced by the Bi-LSTM is classified into an auto-spacing tag (B or I), and a blank is then inserted in front of each B tag. For tag classification, we compose three types of classification neural networks: a feedforward neural network, a neural network language model, and a linear-chain CRF. To compare our models, we measure the performance of automatic word spacing for each of the three classification networks. Among them, the linear-chain CRF used as the classification network shows better performance than the other models. We used the KCC150 corpus as training and testing data.
https://doi.org/10.3745/KTSDE.2019.8.11.441
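The B/I tagging scheme described in the abstract above can be sketched as follows. This shows only the tag encoding and decoding step, not the Bi-LSTM or CRF classifier; the function names `tags_from_spaced` and `apply_spacing` are hypothetical, and treating each character as one syllable is an assumption that holds for precomposed Hangul text.

```python
from typing import List


def tags_from_spaced(text: str) -> List[str]:
    """Derive one B/I tag per syllable from correctly spaced text.

    B marks a syllable that begins a word, I marks any other syllable.
    """
    tags = []
    at_word_start = True
    for ch in text:
        if ch == " ":
            at_word_start = True
            continue
        tags.append("B" if at_word_start else "I")
        at_word_start = False
    return tags


def apply_spacing(syllables: str, tags: List[str]) -> str:
    """Insert a blank in front of every B-tagged syllable except the first."""
    if len(syllables) != len(tags):
        raise ValueError("need exactly one tag per syllable")
    out = []
    for i, (ch, tag) in enumerate(zip(syllables, tags)):
        if tag == "B" and i > 0:
            out.append(" ")
        out.append(ch)
    return "".join(out)
```

In a full system the tags passed to `apply_spacing` would come from the classifier's predictions on unspaced input, while `tags_from_spaced` would produce training labels from a correctly spaced corpus.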
