• Title/Summary/Keyword: Korean language level

Integration of WFST Language Model in Pre-trained Korean E2E ASR Model

  • Junseok Oh;Eunsoo Cho;Ji-Hwan Kim
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • v.18 no.6
    • /
    • pp.1692-1705
    • /
    • 2024
  • In this paper, we present a method that integrates a Grammar Transducer as an external language model to enhance the accuracy of the pre-trained Korean End-to-end (E2E) Automatic Speech Recognition (ASR) model. The E2E ASR model utilizes the Connectionist Temporal Classification (CTC) loss function to derive hypothesis sentences from input audio. However, this method reveals a limitation inherent in the CTC approach, as it fails to capture language information from transcript data directly. To overcome this limitation, we propose a fusion approach that combines a clause-level n-gram language model, transformed into a Weighted Finite-State Transducer (WFST), with the E2E ASR model. This approach enhances the model's accuracy and allows for domain adaptation using just additional text data, avoiding the need for further intensive training of the extensive pre-trained ASR model. This is particularly advantageous for Korean, characterized as a low-resource language, which confronts a significant challenge due to limited resources of speech data and available ASR models. Initially, we validate the efficacy of training the n-gram model at the clause-level by contrasting its inference accuracy with that of the E2E ASR model when merged with language models trained on smaller lexical units. We then demonstrate that our approach achieves enhanced domain adaptation accuracy compared to Shallow Fusion, a previously devised method for merging an external language model with an E2E ASR model without necessitating additional training.
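The abstract above compares against Shallow Fusion, which combines the ASR score and an external language model score log-linearly at inference time. The following is a minimal sketch of that scoring rule applied to n-best rescoring. It is not the WFST method of the paper: the add-one-smoothed bigram model, the whitespace token units, the function names, and the `lm_weight` parameter are all illustrative assumptions.

```python
import math
from collections import Counter


def train_bigram(sentences):
    """Count unigrams and bigrams over whitespace-separated units (assumed units)."""
    unigrams, bigrams = Counter(), Counter()
    for sent in sentences:
        tokens = ["<s>"] + sent.split() + ["</s>"]
        unigrams.update(tokens[:-1])                 # history counts
        bigrams.update(zip(tokens[:-1], tokens[1:]))
    vocab_size = len(set(unigrams) | {"</s>"})
    return unigrams, bigrams, vocab_size


def lm_logprob(sentence, model):
    """Log-probability of a sentence under the add-one-smoothed bigram model."""
    unigrams, bigrams, vocab_size = model
    tokens = ["<s>"] + sentence.split() + ["</s>"]
    total = 0.0
    for prev, cur in zip(tokens[:-1], tokens[1:]):
        total += math.log((bigrams[(prev, cur)] + 1) / (unigrams[prev] + vocab_size))
    return total


def rescore(hypotheses, model, lm_weight=0.5):
    """Pick the hypothesis maximizing asr_logprob + lm_weight * lm_logprob.

    hypotheses: list of (text, asr_logprob) pairs, e.g. an n-best list from
    an ASR decoder (hypothetical input format).
    """
    best = max(hypotheses, key=lambda h: h[1] + lm_weight * lm_logprob(h[0], model))
    return best[0]


# Domain text only: no audio and no retraining of the ASR model is involved.
model = train_bigram(["turn on the light", "turn off the light"])
nbest = [("turn on the light", -4.0), ("turn on the lite", -3.9)]
print(rescore(nbest, model, lm_weight=1.0))
```

With `lm_weight=0.0` the acoustic score alone decides, so the weight controls how strongly the in-domain text influences the result.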

Vocabulary Education for Korean Beginner Level Using PWIM (PWIM 활용 한국어 초급 어휘교육)

  • Cheng, Yeun sook;Lee, Byung woon
    • Journal of Korean language education
    • /
    • v.29 no.3
    • /
    • pp.325-344
    • /
    • 2018
  • The purpose of this study is to summarize the Picture Word Inductive Model (PWIM), a learner-centered vocabulary teaching-learning model, and to suggest ways to implement it in Korean language education. Pictures used in Korean language education help visualize the specific shape, color, and texture of the target vocabulary, thus helping beginner learners connect sound to meaning. Visual material stimulates the learner's intrinsic schema; it not only becomes a 'bridge' connecting the mother tongue and Korean, but also reduces the difficulty of learning a foreign language that arises from the ambiguity between meaning and sound in Korean, as in all languages. PWIM shares the use of visual materials with existing learning methods. However, in earlier teacher-centered methods, learners only imitated the teacher, because the teacher showed fragmentary photographs detached from everyday life and taught the word. PWIM is a learner-centered method that stimulates learners to find vocabulary on their own by presenting visual information that reflects context. This paper argues that PWIM is more suitable for beginner learners who are learning concrete vocabulary in areas such as personal identity (mainly objects), residence and environment, daily life, shopping, health, climate, and traffic. The study also aimed to develop teaching procedures and a way of using PWIM suited to Korean language learners. The researchers rearranged the procedures from previous research into three steps: brainstorming and word organization, generalization of the semantic and morphological rules of the extracted words, and application of the words. All three steps can be carried out in a single session, or they can be divided and taught at different times. It is expected that teachers and learners using the PWIM teaching-learning method, with its realistic visual materials, will be able to create an effective class together.

eFlowC: A Packet Processing Language for Network Management (eFlowC : 네트워크 관리를 위한 패킷 처리 언어)

  • Ko, Bang-Won;Yoo, Jae-Woo
    • Journal of the Korea Society of Computer and Information
    • /
    • v.19 no.1
    • /
    • pp.65-76
    • /
    • 2014
  • In this paper, we propose eFlowC, a high-level programming language for packet processing, and its supporting development environment. Based on the C language, which is already familiar and easy to use for program developers, eFlowC maintains syntax and semantics similar to those of C. Features that are unnecessary for packet processing have been removed from C, and eFlowC focuses on handling packet data, databases, string and byte information checking, and event processing. We describe the design of the high-level language, the application of existing language and compiler technology, and the language functions and compilation process required for packet processing. In order to use DPIC devices such as X11, we designed a virtual machine, eFVM, that takes scalability and portability into account. We evaluated the utility of the proposed language by running a variety of real application programs in our programming environment, which includes a compiler, simulator, and debugger for eFVM. As little research has been devoted to defining the formats, meanings, and functions of a packet processing language, this research is significant and is expected to serve as a basis for packet processing languages.

The Effect of Dictation and Dramatization on Children's Story Construction and Decontextualized Language (유아의 이야기 짓기와 극화 활동의 연계가 유아의 이야기 구조 및 탈상황적 언어 발달에 미치는 영향)

  • Lee, Moom-jung
    • Korean Journal of Child Studies
    • /
    • v.22 no.1
    • /
    • pp.241-249
    • /
    • 2001
  • This study examined the effect of story dictation and dramatization on children's story construction and decontextualized language. For 12 weeks, the 22 five-year-old children in the experimental group participated in story dictation and dramatization activities, while another 22 same-age children participated only in story dictation. The instruments were the children's Decontextualized Language Test (Foley, 1992) and children's Story Analysis (Knipping, 1987), revised to fit Korean grammar. Story dictation combined with dramatization facilitated high-level story construction by children: it raised levels of story coherence and narrative form. It also enhanced children's decontextualized language, raising their use of decontextualized language on a picture description task.

The Loom-LAG for syntax analysis: Adding a language-independent level to LAG

  • Schulze, Markus
    • Proceedings of the Korean Society for Language and Information Conference
    • /
    • 2002.02a
    • /
    • pp.411-420
    • /
    • 2002
  • The left-associative grammar model (LAG) has been applied successfully to the morphologic and syntactic analysis of various European and Asian languages. The algebraic definition of the LAG is very well suited to natural language processing, as it inherently obeys de Saussure's second law (de Saussure, 1913, p. 103) on the linear nature of language, which phrase-structure grammar (PSG) and categorial grammar (CG) do not. This paper describes the so-called Loom-LAGs (LLAGs), a specialization of LAGs for the analysis of natural language. Whereas the only means of language-independent abstraction in ordinary LAG is the principle of possible continuations, LLAGs introduce a set of more detailed language-independent generalizations that form the so-called loom of a Loom-LAG. Every LLAG uses the very same loom and adds the language-specific information in the form of a declarative description of the language, much like an ancient mechanised Jacquard loom would take a program card providing the specific pattern for the cloth to be woven. The linguistic information is formulated declaratively in so-called syntax plans that describe the sequential structure of clauses and phrases. This approach introduces the explicit notion of phrases and sentence structure to LAG without violating de Saussure's second law and without leaving the ground of the original algebraic definition of LAG. LLAGs can in fact be shown to be just a notational variant of LAG, but one that is much better suited to the manual development of syntax grammars for the robust analysis of free texts.

Syntax directed Compiler for Subset of PASCAL

  • 이태경
    • Communications of the Korean Institute of Information Scientists and Engineers
    • /
    • v.4 no.2
    • /
    • pp.65-73
    • /
    • 1986
  • The PM language is a compiler-writing language that performs syntax-directed translation of a high-level language into an intermediate language in matrix form. The PM assembler translates the PM language into recursive subroutines that test input strings, output intermediate terms, or call other subroutines. A compiler for a large subset of PASCAL was written in the PM language.

The Relationship between Mothers' Attachment Levels, Types of Verbal Control, and Infants' Language Development (어머니 애착수준 및 언어통제유형과 영아의 언어발달 간의 관계)

  • Nam, Hyo Jung;Jahng, Kyung Eun
    • Korean Journal of Child Studies
    • /
    • v.36 no.4
    • /
    • pp.143-161
    • /
    • 2015
  • This study aims to examine the relationship between mothers' attachment levels, types of verbal control, and infants' language development. The participants comprised 224 infants aged 24-35 months and their mothers (224) at 25 long day care centers located in Goyang-si and Gimpo-si in Gyeonggi-do, Incheon, and Seoul. The major findings of this study were as follows. First, there were significant differences in mothers' attachment levels, types of verbal control, and infants' language development depending on the mothers' employment status. Second, to assess the relative influences of the two variables that were significantly associated with infants' language development, a hierarchical regression analysis was conducted, controlling for the sociodemographic variables of mothers and infants, including infants' age and mothers' employment status. The results revealed that imperative-oriented verbal controls, person-oriented verbal controls, and contact seeking all influenced infants' overall language development.

A study on the design of control unit for playback-type industrial robot (기억재생식 산업용로봇트의 제어부 설계에 관한 연구)

  • 송상섭;김승필;변증남
    • 전기의세계
    • /
    • v.29 no.7
    • /
    • pp.460-470
    • /
    • 1980
  • The design of a control unit for a playback-type industrial robot is studied. Implemented for a cylindrical-coordinate industrial robot with 5 degrees of freedom, the control unit constructed for the study consists of (i) a Z-80 µP-based µ-computer control system, (ii) a Teach-Box for work commands, and (iii) various software for generating signals for the servo driving unit and operating the robot in playback mode. The software is developed using the high-level Basic language and low-level Z-80 assembly language, for ease of programming and speed of program execution, respectively. To show the effectiveness of the design, an example is included.

FSM Synthesis from High-Level Descriptions (상위 수준 기술로부터 순차 회로의 자동 생성)

  • 황선영;유진수
    • Journal of the Korean Institute of Telematics and Electronics
    • /
    • v.27 no.12
    • /
    • pp.1906-1915
    • /
    • 1990
  • A synthesis system is developed that generates sequential circuits from the high-level hardware description language CHDL, the modelling language for the Thor functional/behavioral simulator. In this paper, we describe the semantic analysis process and the state minimization and state assignment algorithms. The proposed assignment algorithm generates optimal state vectors using a constraint matrix and a similarity graph. Experimental results for the MCNC benchmarks, a set of standard test circuits, show that the system implementing the proposed algorithms can be a viable tool for designing large finite state machines.
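The abstract above names state minimization as one step of FSM synthesis but does not specify the algorithm. As an illustration only, the sketch below uses the classic textbook partition-refinement approach for a completely specified Moore machine, which is not necessarily the method of the paper. The dictionary-based machine encoding and the function name are assumptions made for this example.

```python
def minimize_states(states, inputs, next_state, output):
    """Group equivalent states of a completely specified Moore machine.

    next_state[(s, a)] gives the successor of state s on input a;
    output[s] gives the output of state s. Returns a sorted list of
    frozensets, one per equivalence class.
    """
    # Start with states grouped by their output value.
    ids = {}
    block_of = {s: ids.setdefault(output[s], len(ids)) for s in states}
    num_blocks = len(ids)

    while True:
        # A state's signature is its own block plus the blocks of its successors.
        ids = {}
        refined = {}
        for s in states:
            sig = (block_of[s],) + tuple(block_of[next_state[(s, a)]] for a in inputs)
            refined[s] = ids.setdefault(sig, len(ids))
        if len(ids) == num_blocks:   # no block was split: partition is stable
            break
        block_of, num_blocks = refined, len(ids)

    groups = {}
    for s in states:
        groups.setdefault(block_of[s], set()).add(s)
    return sorted((frozenset(g) for g in groups.values()), key=sorted)


# Hypothetical five-state machine: A/B and C/D behave identically.
states = ["A", "B", "C", "D", "E"]
inputs = [0, 1]
output = {"A": 0, "B": 0, "C": 1, "D": 1, "E": 0}
next_state = {
    ("A", 0): "B", ("A", 1): "C",
    ("B", 0): "A", ("B", 1): "D",
    ("C", 0): "C", ("C", 1): "A",
    ("D", 0): "D", ("D", 1): "B",
    ("E", 0): "C", ("E", 1): "C",
}
print(minimize_states(states, inputs, next_state, output))
```

Each returned class can be merged into a single state before state assignment, which reduces the number of state bits to encode.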

Comparison of Cognitive Loads between Koreans and Foreigners in the Reading Process

  • Im, Jung Nam;Min, Seung Nam;Cho, Sung Moon
    • Journal of the Ergonomics Society of Korea
    • /
    • v.35 no.4
    • /
    • pp.293-305
    • /
    • 2016
  • Objective: This study aims to measure cognitive load levels by analyzing the EEG of Koreans and foreigners as they carefully read Korean texts selected by level in terms of grammar and vocabulary, and to compare those cognitive load levels quantitatively. The results can serve as basic data for a more scientific approach to developing Korean texts or books, and to building evaluation methods, for foreigners who encounter such texts for learning or assignments. Background: As of 2014, 84,801 foreign students were studying in Korea, and the number increases annually. Most of them are from Asia, and they come to Korea to enter a university or graduate school. Because these students aim to study at Korean universities, they receive Korean language education from the time they begin preparing to study in Korea. To enter a university in Korea, they must attain level 4 or higher in the Test of Proficiency in Korean (TOPIK) or complete a designated educational program at a university-affiliated language institution. In such programs, learners receive Korean language education based on texts, except in the speaking domain, and text comprehension can determine their academic achievement after they enter their desired schools (Jeon, 2004). However, many foreigners who finish a short-term language course and then begin university study cannot keep up with university classes that require expertise, given only the vocabulary and grammar levels learned during the language course. Therefore, reading education centered on strategies for understanding university textbooks, which are top-level reading texts for foreigners, is necessary (Kim and Shin, 2015). 
This study conducted an experiment on the premise that quantitative data on readers, the main actors in reading education, and on teaching materials must be secured to support the need for reading education for learners preparing for university study and to approach educational design scientifically. Specifically, the study assessed reading difficulty by measuring the cognitive load shown during the reading of each text, dividing the difficulty of the teaching material (book) into eight levels and the readers into Koreans and foreigners. Method: To identify the cognitive load that arises when Koreans and foreigners carefully read Korean texts, this study recruited 16 participants (eight Koreans and eight foreigners). The foreigners were limited to students taking an intermediate-level Korean course at university-affiliated language institutions in the Seoul Metropolitan Area. To identify cognitive load as participants read texts selected by level from the Korean books (difficulty: eight levels) published by King Sejong Institute (Sejonghakdang.org), EEG sensors were attached over the frontal lobe (Fz) and occipital lobe (Oz). After the experiment, a questionnaire survey measured subjective evaluation and identified comprehension and perceived difficulty of grammar and words. To examine the effects of schema that may affect text comprehension, the study controlled the Korean texts and measured EEG and subjective satisfaction. Results: To identify the brain's cognitive load, the beta band was extracted. Interactions between reader group (Koreans or foreigners) and text difficulty were found (Fz: p = 0.48; Oz: p = 0.00). The cognitive loads of the Koreans, whose mother tongue is Korean, were lower when reading Korean texts than those of the foreigners, and the foreigners' cognitive loads rose gradually with the difficulty of the texts. 
From text four onward, which is of intermediate difficulty, remarkable differences between the Koreans and the foreigners began to appear compared with the beginner-level texts. In the subjective evaluation, an interaction between reader group and text difficulty was found (p = 0.00), and satisfaction decreased as text difficulty increased. Conclusion: When readers had background knowledge, that is, when a schema had been formed, comprehension of and satisfaction with the texts were higher, even when the texts contained vocabulary and grammar above the readers' level. For texts whose grammar was rated as highly difficult in the subjective evaluation, the foreigners' cognitive loads were also high, indicating that cognitive load rises in proportion to difficulty. This means that grammar acts as a stress factor in foreigners' reading comprehension. Application: This study quantitatively evaluated the cognitive loads of Koreans and foreigners through EEG, by reader group and text difficulty, as they read Korean texts. The results can be used in producing Korean teaching materials and in selecting content and topics for Korean language education for foreigners. If the research scope is expanded to the reading process using an eye-tracker, reading education programs and evaluation methods for foreigners can be developed on the basis of quantitative values.