Title/Summary/Keyword: Word learning system

202 search results

Key-word Error Correction System using Syllable Restoration Algorithm (음절 복원 알고리즘을 이용한 핵심어 오류 보정 시스템)

  • Ahn, Chan-Shik;Oh, Sang-Yeob
    • Journal of the Korea Society of Computer and Information, v.15 no.10, pp.165-172, 2010
  • There are two methods of error correction in vocabulary recognition systems: one based on error-pattern matching and the other based on vocabulary meaning patterns. Both fail to resolve the semantics of keywords during error correction. To improve on this, this paper proposes a keyword error correction system that uses a syllable restoration algorithm. The system corrects keyword errors through semantic parsing of the meanings of recognized phonemes. The syllable restoration algorithm restores each word to its form before phoneme fluctuation was applied, which makes keyword parsing definite and reduces the number of unrecognized words. The error correction rate is determined using phoneme likelihood and confidence for the system parse, and error correction is performed on vocabulary whose errors are confirmed during recognition. In a performance comparison, the proposed method improved the recognition rate by 2.3% over the methods based on error pattern learning and error pattern matching and on vocabulary meaning patterns.

Design and Implementation of Early Childhood Learning Assistant System using Block Coding Technique (블록 코딩기법을 이용한 유아 학습 보조 시스템의 설계 및 구현)

  • Park, Sun-Yi;Park, Hee-Sook
    • Journal of the Korea Institute of Information and Communication Engineering, v.26 no.1, pp.41-48, 2022
  • As the COVID-19 situation continues, young children are unable to attend early childhood education institutions and are spending more time with their parents at home. Parents face a situation in which they must spend a great deal of time at home teaching their children Korean words or leading play activities, which imposes considerable psychological burden and stress on them. To relieve this burden, this study proposes the design and implementation of an early childhood learning assistant system that supports Korean word education and play activities using the artificial intelligence blocks of a block coding technique. The proposed system can not only reduce the burden on parents for their children's learning, but can also be actively used in the field of early childhood education, where many learning effects can be expected.

Corpus-Based Ontology Learning for Semantic Analysis (의미 분석을 위한 말뭉치 기반의 온톨로지 학습)

  • 강신재
    • Journal of Korea Society of Industrial Information Systems, v.9 no.1, pp.17-23, 2004
  • This paper proposes determining word senses in Korean language processing through corpus-based ontology learning. Our approach is a hybrid method. First, we apply previously secured dictionary information to select the correct senses of some ambiguous words with high precision, and then use the ontology to disambiguate the remaining ambiguous words. The mutual information between concepts in the ontology is calculated before the ontology is used as knowledge for disambiguating word senses. If mutual information is regarded as a weight between ontology concepts, the ontology can be treated as a graph with weighted edges, and we can then locate the least-weighted path from one concept to another. In our practical machine translation system, this word sense disambiguation method achieved a 9% improvement over methods that do not use an ontology for Korean translation.
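The abstract treats the MI-weighted ontology as a graph and searches for the least-weighted path between two concepts. A minimal sketch of that search using Dijkstra's algorithm, assuming a hypothetical concept graph (the concept names and weights below are illustrative, not from the paper; one plausible weighting is the inverse of mutual information):

```python
import heapq

def least_weight_path(graph, src, dst):
    """Dijkstra over a concept graph whose edge weights are derived
    from mutual information (lower weight = stronger association)."""
    dist = {src: 0.0}
    prev = {}
    heap = [(0.0, src)]
    while heap:
        d, node = heapq.heappop(heap)
        if node == dst:                      # reconstruct path back to src
            path = [dst]
            while path[-1] != src:
                path.append(prev[path[-1]])
            return d, path[::-1]
        if d > dist.get(node, float("inf")):
            continue
        for nbr, w in graph.get(node, {}).items():
            nd = d + w
            if nd < dist.get(nbr, float("inf")):
                dist[nbr] = nd
                prev[nbr] = node
                heapq.heappush(heap, (nd, nbr))
    return float("inf"), []

# Hypothetical toy ontology fragment; weights could be, e.g., 1/MI.
concepts = {
    "bank":    {"finance": 0.5, "river": 2.0},
    "finance": {"money": 0.4},
    "river":   {"water": 0.3},
    "money":   {},
    "water":   {},
}
print(least_weight_path(concepts, "bank", "money"))
# → (0.9, ['bank', 'finance', 'money'])
```

The returned weight sums the edges along the cheapest route, so the sense whose concept lies on the lighter path wins.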


A Single-Player Car Driving Game-based English Vocabulary Learning System (1인용 자동차 주행 게임 기반의 영어 단어 학습 시스템)

  • Kim, Sangchul;Park, Hyogeun
    • Journal of Korea Game Society, v.15 no.2, pp.95-104, 2015
  • Many games for English vocabulary learning, such as hangman, crossword puzzles, and matching, have been developed in board-game or computer-game form. Most of these computer games adopt strategy-style game play, so there is a limit to the fun, the essence of games, that they can offer learners who do not like this style. In this paper, a system for memorizing new English words is proposed that is based on a single-player car racing game targeting youths and adults. In the game, the core of our system, a learner drives a car and earns game points by colliding with English word texts that appear on the track like game items. The learner keeps viewing the English words exposed on the track while driving, and thus memorizes those words according to the learning principle that viewing is memorizing. In our experiment, the effect of memorizing English words with our car racing game was good, and the degree of satisfaction with the system as an English vocabulary learning tool was reasonably high. In addition, whereas previous word games are suited to reinforcing the memory of known English words, our game can also be used to memorize new words.

A review of Chinese named entity recognition

  • Cheng, Jieren;Liu, Jingxin;Xu, Xinbin;Xia, Dongwan;Liu, Le;Sheng, Victor S.
    • KSII Transactions on Internet and Information Systems (TIIS), v.15 no.6, pp.2012-2030, 2021
  • Named Entity Recognition (NER) is used to identify entity nouns in a corpus, such as Location, Person, and Organization. NER is also an important basis for research in various natural language fields. Processing Chinese NER poses some unique difficulties; for example, there is no obvious segmentation boundary between the characters in a Chinese sentence, so the Chinese NER task is often combined with Chinese word segmentation. In response to these problems, we summarize the recognition methods for Chinese NER. In this review, we first introduce the sequence labeling system and evaluation metrics of NER. Then, we divide Chinese NER methods into rule-based methods, statistics-based machine learning methods, and deep learning-based methods. Subsequently, we analyze in detail the model frameworks based on deep learning and the typical Chinese NER methods. Finally, we put forward the current challenges and future research directions of Chinese NER technology.
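The sequence labeling system such a review introduces is typically the character-level BIO scheme, since Chinese sentences have no word boundaries. A minimal sketch of decoding entities from such a labeling (the sentence, tags, and entity types below are illustrative, not taken from the review):

```python
def extract_entities(tokens, tags):
    """Collect (entity_text, type) spans from a BIO-tagged sequence,
    the labeling scheme commonly used for NER."""
    entities, span, etype = [], [], None
    for tok, tag in zip(tokens, tags):
        if tag.startswith("B-"):             # a new entity begins
            if span:
                entities.append(("".join(span), etype))
            span, etype = [tok], tag[2:]
        elif tag.startswith("I-") and span and tag[2:] == etype:
            span.append(tok)                 # continue the current entity
        else:                                # "O" or an inconsistent tag
            if span:
                entities.append(("".join(span), etype))
            span, etype = [], None
    if span:
        entities.append(("".join(span), etype))
    return entities

# Chinese NER labels each character, since there are no word boundaries.
chars = list("张三在北京工作")
tags  = ["B-PER", "I-PER", "O", "B-LOC", "I-LOC", "O", "O"]
print(extract_entities(chars, tags))
# → [('张三', 'PER'), ('北京', 'LOC')]
```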

Neural Theorem Prover with Word Embedding for Efficient Automatic Annotation (효율적인 자동 주석을 위한 단어 임베딩 인공 신경 정리 증명계 구축)

  • Yang, Wonsuk;Park, Hancheol;Park, Jong C.
    • Journal of KIISE, v.44 no.4, pp.399-410, 2017
  • We present a system that automatically annotates unverified Web sentences with information from credible sources. The system applies neural theorem proving to the task of annotating cancer-related Wikipedia data (1,486 propositions) with Korean National Cancer Center data (19,304 propositions). By switching the recursive module in a neural theorem prover to a word embedding module, we overcome the fundamental problem of excessive learning time. In an identical environment, the original neural theorem prover was estimated to require 233.9 days of learning time; in contrast, the revised neural theorem prover took only 102.1 minutes. We demonstrate that a neural theorem prover that encodes a proposition as a tensor subsumes a classic theorem prover for exact matching and enables end-to-end differentiable logic for analogous words.
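The key change the abstract describes is relaxing an exact-match test into word-embedding similarity, so analogous words can unify. The sketch below illustrates only that idea with a hypothetical soft-match predicate and toy embeddings; it is not the paper's differentiable prover, whose matching is learned end-to-end:

```python
import math

def cosine(u, v):
    """Cosine similarity between two embedding vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

def soft_match(sym_a, sym_b, emb, threshold=0.8):
    """A classic prover unifies two symbols only on exact match;
    replacing that test with embedding similarity lets analogous
    words (e.g. 'tumor' ~ 'cancer') unify as well."""
    if sym_a == sym_b:                 # exact match still succeeds
        return True
    if sym_a in emb and sym_b in emb:  # otherwise fall back to similarity
        return cosine(emb[sym_a], emb[sym_b]) >= threshold
    return False

# Hypothetical 2-D toy embeddings, just to show the interface.
emb = {"tumor": [0.9, 0.1], "cancer": [0.85, 0.2], "car": [0.0, 1.0]}
print(soft_match("tumor", "cancer", emb))  # → True
print(soft_match("tumor", "car", emb))     # → False
```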

Sentiment Analysis of Korean Reviews Using CNN: Focusing on Morpheme Embedding (CNN을 적용한 한국어 상품평 감성분석: 형태소 임베딩을 중심으로)

  • Park, Hyun-jung;Song, Min-chae;Shin, Kyung-shik
    • Journal of Intelligence and Information Systems, v.24 no.2, pp.59-83, 2018
  • With the increasing importance of sentiment analysis for grasping the needs of customers and the public, various types of deep learning models have been actively applied to English texts. In sentiment analysis of English texts by deep learning, the natural language sentences in the training and test datasets are usually converted into sequences of word vectors before being fed into the models. In this case, word vectors generally refer to vector representations of the words obtained by splitting a sentence on space characters. There are several ways to derive word vectors, one of which is Word2Vec, used to produce the 300-dimensional Google word vectors from about 100 billion words of Google News data. These have been widely used in studies of sentiment analysis of reviews from various fields such as restaurants, movies, laptops, and cameras. Unlike in English, the morpheme plays an essential role in sentiment analysis and sentence structure analysis in Korean, a typical agglutinative language with developed postpositions and endings. A morpheme can be defined as the smallest meaningful unit of a language, and a word consists of one or more morphemes. For example, for the word '예쁘고', the morphemes are '예쁘' (adjective) and '고' (connective ending). Reflecting the significance of Korean morphemes, it seems reasonable to adopt the morpheme as the basic unit in Korean sentiment analysis. Therefore, in this study, we use 'morpheme vectors' as input to a deep learning model rather than the 'word vectors' mainly used for English text. A morpheme vector is a vector representation of a morpheme and can be derived by applying an existing word vector derivation mechanism to sentences divided into their constituent morphemes. This raises several questions. What is the desirable range of POS (Part-Of-Speech) tags when deriving morpheme vectors for improving the classification accuracy of a deep learning model?
Is it appropriate to apply a typical word vector model, which relies primarily on the form of words, to Korean with its high homonym ratio? Will text preprocessing such as correcting spelling or spacing errors affect the classification accuracy, especially when drawing morpheme vectors from Korean product reviews with many grammatical mistakes and variations? We seek empirical answers to these fundamental issues, which may be encountered first when applying various deep learning models to Korean texts. As a starting point, we summarize these issues in three central research questions. First, which is more effective as the initial input of a deep learning model: morpheme vectors from grammatically correct texts of a domain other than the analysis target, or morpheme vectors from considerably ungrammatical texts of the same domain? Second, what is an appropriate morpheme vector derivation method for Korean with regard to the range of POS tags, homonyms, text preprocessing, and minimum frequency? Third, can we reach a satisfactory level of classification accuracy when applying deep learning to Korean sentiment analysis? As an approach to these research questions, we generate various types of morpheme vectors reflecting them and then compare the classification accuracy through a non-static CNN (Convolutional Neural Network) model taking the morpheme vectors as input. As training and test datasets, 17,260 cosmetics product reviews from Naver Shopping are used. To derive the morpheme vectors, we use data both from the same domain as the target and from another domain: about 2 million Naver Shopping cosmetics product reviews and 520,000 Naver News articles, arguably corresponding to Google's News data. The six primary sets of morpheme vectors constructed in this study differ in terms of the following three criteria.
First, they come from two types of data source: Naver News, of high grammatical correctness, and Naver Shopping's cosmetics product reviews, of low grammatical correctness. Second, they are distinguished by the degree of data preprocessing, namely only splitting sentences, or additional spelling and spacing corrections after sentence separation. Third, they vary in the form of input fed into the word vector model: either the morphemes themselves, or the morphemes with their POS tags attached. The morpheme vectors further vary depending on the range of POS tags considered, the minimum frequency of morphemes included, and the random initialization range. All morpheme vectors are derived through a CBOW (Continuous Bag-Of-Words) model with a context window of 5 and a vector dimension of 300. The results suggest that using same-domain text even with lower grammatical correctness, performing spelling and spacing corrections as well as sentence splitting, and incorporating morphemes of all POS tags including the incomprehensible category lead to better classification accuracy. POS tag attachment, devised for the high proportion of homonyms in Korean, and the minimum frequency standard for a morpheme to be included appear to have no definite influence on the classification accuracy.
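Two of the design choices above, feeding a CBOW-style model morpheme-level input and optionally attaching POS tags to keep homonyms distinct, can be sketched as follows. The morpheme split and POS tags below are illustrative (the study used a context window of 5 and 300-dimensional vectors; a small window is used here for readability):

```python
def cbow_pairs(morphemes, window=5):
    """Enumerate (context, target) training pairs as a CBOW model
    with a given context window would see them."""
    pairs = []
    for i, target in enumerate(morphemes):
        ctx = morphemes[max(0, i - window):i] + morphemes[i + 1:i + 1 + window]
        pairs.append((ctx, target))
    return pairs

def attach_pos(morphemes, tags):
    """One input variant from the study: morphemes with POS tags
    attached, so homonyms with different tags stay distinct."""
    return [f"{m}/{t}" for m, t in zip(morphemes, tags)]

# '예쁘고 좋아요' split into morphemes, with illustrative POS tags.
morphs = ["예쁘", "고", "좋", "아요"]
tags   = ["VA", "EC", "VA", "EF"]
print(attach_pos(morphs, tags))
# → ['예쁘/VA', '고/EC', '좋/VA', '아요/EF']
print(cbow_pairs(morphs, window=2)[1])
# → (['예쁘', '좋', '아요'], '고')
```

Each (context, target) pair trains the model to predict the target morpheme from its neighbors, which is what yields the morpheme vectors compared in the study.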

A Design of Web-Based System for Mathematical Word Problem Representation Ability Improvement (수학 문장제 표상능력 향상을 위한 웹 기반 시스템의 설계)

  • Park, Jung-Sik;Kho, Dae-Ghon
    • Journal of The Korean Association of Information Education, v.5 no.2, pp.185-196, 2001
  • Elementary school students find mathematical word problems more difficult than numerical formulas. The reason is not their ability at mathematical calculation but their ability at problem representation. Improving the ability to represent mathematical word problems demands an exact understanding of a problem's requirements. Multimedia data and communication are needed for this; because the Web advances the use of multimedia materials and promotes mutual communication, it provides a well-suited environment for word problem representation learning. Accordingly, this thesis designs a web-based system to improve the ability to represent mathematical word problems and applies it experimentally to sixth graders.


Construction of Korean Knowledge Base Based on Machine Learning from Wikipedia (위키백과로부터 기계학습 기반 한국어 지식베이스 구축)

  • Jeong, Seok-won;Choi, Maengsik;Kim, Harksoo
    • Journal of KIISE, v.42 no.8, pp.1065-1070, 2015
  • The performance of many natural language processing applications depends on a knowledge base as a major resource. WordNet, YAGO, Cyc, and BabelNet have been extensively used as knowledge bases for English. In this paper, we propose a method to construct a YAGO-style knowledge base for Korean (hereafter, K-YAGO) automatically from Wikipedia and YAGO. The proposed system constructs an initial K-YAGO simply by matching YAGO to info-boxes in Wikipedia. The initial K-YAGO is then expanded through a machine learning technique. Experiments with the initial K-YAGO show that the proposed system has a precision of 0.9642. In experiments with the expanded part of K-YAGO, an accuracy of 0.9468 was achieved, with a macro-average F1-measure of 0.7596.
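The initial construction step the abstract describes, matching YAGO to Wikipedia info-boxes, might look like the following sketch. The relation mapping, attribute keys, and page below are hypothetical, invented only to show the matching shape:

```python
def build_initial_kb(yago_relations, infoboxes):
    """Seed a K-YAGO-style knowledge base by matching known relation
    names against Wikipedia info-box attribute keys, as described for
    the initial construction step."""
    kb = []
    for page, attrs in infoboxes.items():
        for key, value in attrs.items():
            relation = yago_relations.get(key)
            if relation:                      # attribute maps to a YAGO relation
                kb.append((page, relation, value))
    return kb

# Hypothetical mapping and info-box rows, for illustration only.
yago_relations = {"출생지": "wasBornIn", "직업": "hasOccupation"}
infoboxes = {
    "김구": {"출생지": "해주", "직업": "독립운동가", "별명": "백범"},
}
print(build_initial_kb(yago_relations, infoboxes))
# → [('김구', 'wasBornIn', '해주'), ('김구', 'hasOccupation', '독립운동가')]
```

Attributes with no known mapping (here '별명') are skipped; in the paper those uncovered cases are what the subsequent machine learning expansion targets.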

The evaluation of Word Processors by Learning Model (학습모형을 이용한 워드프로세서의 평가방법 개발)

  • 손일문;홍상우;이상철
    • Journal of Korean Society of Industrial and Systems Engineering, v.20 no.41, pp.203-212, 1997
  • The interface of computer software has to promote human-computer interaction. This quality of an interface should be evaluated with regard to the user's information processing. The usability of an interface is one of the main components of its quality, and it is directly concerned with learnability, especially when users first begin to use a software package. In this paper, word processors, widely used in office automation environments, are studied with respect to the menu structure of the interface. A cognitive menu structure is suggested based on users' conceptual network of the main functions of a word processor. Two word processors are selected to compare against the cognitive menu structure and to evaluate their learnabilities with a learning model.
