• Title/Abstract/Keywords: 맞춤법 (spelling)

Search Result 95, Processing Time 0.027 seconds

Automatic Evaluation of Elementary School English Writing Based on Recurrent Neural Network Language Model (순환 신경망 기반 언어 모델을 활용한 초등 영어 글쓰기 자동 평가)

  • Park, Youngki
    • Journal of The Korean Association of Information Education / v.21 no.2 / pp.161-169 / 2017
  • We often use spellcheckers to correct syntactic errors in our documents. However, these programs are not enough for elementary school students, because in many cases their sentences still do not read smoothly after the syntactic errors are corrected. In this paper, we introduce an automated method for evaluating the smoothness of two synonymous sentences. The method uses a recurrent neural network to address the problem of long-term dependencies and exploits subwords to cope with the rare word problem. We trained the recurrent neural network language model on a monolingual corpus of about two million English sentences. In our experiments, the trained model successfully selected the smoother sentence in all nine types of test sets. We expect that our approach will help with elementary school writing once it is implemented as an application for smart devices.
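
The idea of choosing the smoother of two synonymous sentences by language-model score can be sketched as follows. This is only an illustration under stated assumptions: it swaps in an add-one-smoothed bigram model for the recurrent neural network, works on whole words rather than subwords, and the names `train_bigram`, `avg_log_prob`, and `smoother` are hypothetical, not taken from the paper.

```python
import math
from collections import Counter


def train_bigram(sentences):
    """Count unigram contexts and bigrams over whitespace-tokenized sentences."""
    unigrams, bigrams = Counter(), Counter()
    vocab = {"<s>", "</s>"}
    for sentence in sentences:
        words = sentence.lower().split()
        vocab.update(words)
        tokens = ["<s>"] + words + ["</s>"]
        unigrams.update(tokens[:-1])            # every token that acts as a context
        bigrams.update(zip(tokens, tokens[1:]))
    return unigrams, bigrams, len(vocab)


def avg_log_prob(sentence, model):
    """Average add-one-smoothed bigram log-probability per transition."""
    unigrams, bigrams, vocab_size = model
    tokens = ["<s>"] + sentence.lower().split() + ["</s>"]
    pairs = list(zip(tokens, tokens[1:]))
    total = sum(
        math.log((bigrams[pair] + 1) / (unigrams[pair[0]] + vocab_size))
        for pair in pairs
    )
    return total / len(pairs)


def smoother(sentence_a, sentence_b, model):
    """Return whichever sentence the model scores as more probable."""
    if avg_log_prob(sentence_a, model) >= avg_log_prob(sentence_b, model):
        return sentence_a
    return sentence_b


# Toy corpus (illustrative only; the paper trained on about two million sentences).
corpus = [
    "i like to play soccer",
    "i like to read books",
    "she likes to play tennis",
]
model = train_bigram(corpus)
choice = smoother("i like to play tennis", "i like play to tennis", model)
```

Averaging over transitions keeps the comparison from simply favoring the shorter sentence; a neural model would replace `avg_log_prob` while the comparison step stays the same.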

Ultrasensitive laser interferometer for precision measurement of small vibration displacement (고감도 레이저 간섭계를 이용한 미소 진동 진폭의 정밀측정)

  • 서상준
    • Transactions of the Korean Society of Mechanical Engineers / v.12 no.3 / pp.440-449 / 1988
  • Small vibration displacements can be measured by optical interferometers based on the Michelson method. The standard Michelson interferometer works well when the mirror displacements are large relative to the optical wavelength, but it does not work for displacements smaller than approximately a quarter of the optical wavelength. Several multiple reflection laser interferometers, which are simple modifications of the standard Michelson interferometer, have been developed to lower the minimum detectable limit. Among these, a relatively simple multiple reflection system is used here to measure small vibration displacements. The system is constructed with a right angle prism and a convex lens, which makes it possible to measure the vibration displacement of a small area on the vibrating structure. The fringe interpolation method and the curve fitting method are used to determine the small vibration displacements accurately from the measured interference fringe patterns. A computer simulation is also used to check the accuracy of these methods. According to the simulation results, the curve fitting method is more accurate than the fringe interpolation method. The optically measured results are in good agreement with those of the standard accelerometer with high accuracy, and peak vibration displacements as small as 9.01 nm can be measured using the multiple reflection system and the curve fitting method.

Research on Methods for Processing Nonstandard Korean Words on Social Network Services (소셜네트워크서비스에 활용할 비표준어 한글 처리 방법 연구)

  • Lee, Jong-Hwa;Le, Hoanh Su;Lee, Hyun-Kyu
    • Journal of Korea Society of Industrial Information Systems / v.21 no.3 / pp.35-46 / 2016
  • Social network services (SNS), which help users build relationship networks and freely share particular interests or activities by posting comments, photos, videos, and so on in online communities such as blogs, have been widely adopted and developed as a social phenomenon. Several studies have explored patterns and valuable information in social network data via text mining techniques such as opinion mining and semantic analysis. To improve the efficiency of text mining, keyword-based approaches have been applied, but most researchers have pointed out the limitations imposed by the rules of Korean orthography. This research aims to construct a database of non-standard Korean words that are difficult to handle in data mining, such as abbreviations, slang, strange expressions, and emoticons, in order to overcome the limitations of keyword-based text mining techniques. Based on a study of subjective opinions about specific topics on blogs, this research extracted non-standard words that were found useful in the text mining process.

Automatic Product Feature Extraction for Efficient Analysis of Product Reviews Using Term Statistics (효율적인 상품평 분석을 위한 어휘 통계 정보 기반 평가 항목 추출 시스템)

  • Lee, Woo-Chul;Lee, Hyun-Ah;Lee, Kong-Joo
    • The KIPS Transactions:PartB / v.16B no.6 / pp.497-502 / 2009
  • In this paper, we introduce an automatic product feature extraction system that improves the efficiency of product review analysis. Our system consists of two parts: a review collection and correction part and a product feature extraction part. The former collects reviews from internet shopping malls and revises spoken-style or ungrammatical sentences. In the latter, product features, meaning items that can serve as evaluation criteria such as 'size' and 'style' for a skirt, are automatically extracted using term statistics from reviews and web documents. We choose nouns in reviews as candidate product features and calculate the degree of association between candidate nouns and products by combining an inner association degree and an outer association degree. The inner association degree is calculated from noun frequency in reviews, and the outer association degree is calculated from the co-occurrence frequency of a candidate noun and a product name in web documents. In our evaluation, the extraction method showed an average recall of 90%, which is better than the results of previous approaches.
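
The combination of inner and outer association degrees described in this abstract might look roughly like the sketch below. The abstract does not give the normalization or the combining formula, so the relative-frequency normalization, the weighted sum, the weight `alpha`, and all function names are assumptions made for illustration; the counts in the example are invented.

```python
from collections import Counter


def inner_association(review_nouns):
    """Relative frequency of each candidate noun within the reviews (assumed normalization)."""
    counts = Counter(review_nouns)
    total = sum(counts.values())
    return {noun: count / total for noun, count in counts.items()}


def outer_association(cooccurrence_hits, noun_hits):
    """Share of web documents containing the noun that also mention the product name."""
    return {
        noun: cooccurrence_hits.get(noun, 0) / hits
        for noun, hits in noun_hits.items()
        if hits > 0
    }


def rank_features(review_nouns, cooccurrence_hits, noun_hits, alpha=0.5):
    """Rank candidate nouns by a weighted sum of both degrees (hypothetical combination)."""
    inner = inner_association(review_nouns)
    outer = outer_association(cooccurrence_hits, noun_hits)
    scores = {
        noun: alpha * inner[noun] + (1 - alpha) * outer.get(noun, 0.0)
        for noun in inner
    }
    return sorted(scores, key=scores.get, reverse=True)


# Invented counts for a 'skirt' product, for illustration only.
review_nouns = ["size", "size", "style", "delivery"]
cooccurrence_hits = {"size": 80, "style": 60, "delivery": 5}
noun_hits = {"size": 100, "style": 100, "delivery": 100}
ranking = rank_features(review_nouns, cooccurrence_hits, noun_hits)
```

Here 'delivery' appears in the reviews as often as 'style' but rarely co-occurs with the product name on the web, so the outer degree pushes it down the ranking.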

Design and Implementation of Vocal Sound Variation Rules for Korean Language (한국어 음운 변동 처리 규칙의 설계 및 구현)

  • Lee, Gye-Young
    • The Transactions of the Korea Information Processing Society / v.5 no.3 / pp.851-861 / 1998
  • The Korean language is characterized by rich vocal sound variation. To increase the probability of vocal sound recognition and to provide natural vocal sound synthesis, systematic and thorough research into the characteristics of the Korean language, including its vocal sound changing rules, is required. This paper addresses an effective way of vocal sound recognition and synthesis by providing the design and implementation of Korean vocal sound variation rules. The regulation we followed in designing the rules is the Phonetic Standard (Section 30. Chapter 7) of the Korean Orthographic Standards. We first factored out rules for each regulation, then grouped them into 27 groups by final consonant. The Phonological Change Processing System suggested in the paper processes vocal sound variation quickly through a single application of the rules. The rules also cover the processing of information attached to words or to the stems of inflected words. We believe that the Phonological Change Processing System will facilitate vocal sound recognition and synthesis by the sentence. This system may also serve as a reference example for similar research areas.

A Study on the Calculation of Resistance of the Ship to be Towed and Towline Tension (선박의 예인저항 및 예인삭의 장력 계산에 관한 연구)

  • Nam, Taek-Kun;Jung, Chang-Hyun;Jeong, Jung-Sik
    • Journal of Navigation and Port Research / v.36 no.8 / pp.607-612 / 2012
  • In this paper, methods for calculating the resistance of a ship to be towed and the towline tension are discussed. When a vessel falls into a dead ship condition, an appropriate towing force has to be estimated in order to move the vessel from the accident site to a safe area. In this research, the resistance of the towed ship and of the tow hawser were considered in estimating the total towline tension. A polynomial interpolation method is also applied to estimate the additional hydrodynamic resistance of the towline. Finally, a UI program that calculates the resistance and the total towline tension is developed. The developed program, based on the research results, is effective and convenient to use.

Analysis of Mistakes Made in Using Loan Words in Domestic Hairstyling-related Academic Papers (국내 헤어 논문 외래어 오류 실태 분석)

  • Lee, Young-a;Lee, Jae-sook
    • Journal of Digital Convergence / v.17 no.1 / pp.449-456 / 2019
  • This study attempted to improve the quality of hairstyling-related studies and provide basic data for future studies on hairstyling terms through analysis of cosmetology-related loan words used in hairstyling theses among recent cosmetology papers. For data collection to derive valid conclusions, the signatures of a total of 1,980 academic papers collected after typing in the keyword 'Hair' at the Research Information Sharing Service (http://www.riss.kr) were analyzed. The results show that researchers in hairstyling seem not to pay close attention to the correct use of foreign loan words. Therefore, the study results would be very helpful to the development of future cosmetology studies. The correct notation and use of foreign loanwords should be further encouraged.

A Study on Phenomenon 'Play of Words' in Modern Russian Advertising Language (현대 러시아 광고언어에 있어서의 '언어유희' 현상에 대한 연구)

  • Kim, Sung Wan
    • Cross-Cultural Studies / v.42 / pp.241-260 / 2016
  • The purpose of this article is to represent the types of advertising in the modern Russian language as 'Play of Words' (игра слов). The causal reason for this phenomenon is studied from the result of certain characteristics of advertising. The definition and characteristics of the language of the advertisement are analyzed in achieving the goal, as these factors reveal how language is used to maximize the effectiveness of the advertising. Academic research is needed in the collaborative fields of linguistics, psychology, economics, sociology, marketing, literature, art, and music. Modern advertisement is mixed with semiotic objects that consist of display, sound, and texts. While this study is not complete, the acknowledgement of the phenomenon 'Play of Words' between the creators of advertising and the consumer is undeniable. On one hand, advertising is recognized by linguists as the main factor that destroys the literary language. It represents the distortion of a standard language norm, as opposed to formal linguistic means used in advertising. In this research, we pay attention to the frequent use of foreign language borrowings and incorrect representation of foreign words, slang and jargon, that occur in misspelled usage of literary norms. The features that are revealed in this article are helpful to understand the purpose of advertising.

Sentiment Analysis of Korean Reviews Using CNN: Focusing on Morpheme Embedding (CNN을 적용한 한국어 상품평 감성분석: 형태소 임베딩을 중심으로)

  • Park, Hyun-jung;Song, Min-chae;Shin, Kyung-shik
    • Journal of Intelligence and Information Systems / v.24 no.2 / pp.59-83 / 2018
  • With the increasing importance of sentiment analysis for grasping the needs of customers and the public, various types of deep learning models have been actively applied to English texts. In the sentiment analysis of English texts by deep learning, natural language sentences in the training and test datasets are usually converted into sequences of word vectors before being entered into the deep learning models. Here, word vectors generally refer to vector representations of words obtained by splitting a sentence on space characters. There are several ways to derive word vectors, one of which is Word2Vec, used to produce the 300-dimensional Google word vectors from about 100 billion words of Google News data. These vectors have been widely used in studies of sentiment analysis of reviews from various fields such as restaurants, movies, laptops, and cameras. Unlike in English, morphemes play an essential role in sentiment analysis and sentence structure analysis in Korean, which is a typical agglutinative language with developed postpositions and endings. A morpheme can be defined as the smallest meaningful unit of a language, and a word consists of one or more morphemes. For example, for the word '예쁘고', the morphemes are '예쁘(= adjective)' and '고(=connective ending)'. Reflecting the significance of Korean morphemes, it seems reasonable to adopt the morpheme as the basic unit in Korean sentiment analysis. Therefore, in this study, we use 'morpheme vectors' as the input to a deep learning model rather than the 'word vectors' mainly used for English text. A morpheme vector is a vector representation of a morpheme and can be derived by applying an existing word vector derivation mechanism to sentences divided into their constituent morphemes. This raises several questions. What is the desirable range of POS (Part-Of-Speech) tags when deriving morpheme vectors to improve the classification accuracy of a deep learning model? Is it proper to apply a typical word vector model, which relies primarily on the form of words, to Korean, which has a high homonym ratio? Will text preprocessing such as correcting spelling or spacing errors affect the classification accuracy, especially when deriving morpheme vectors from Korean product reviews with many grammatical mistakes and variations? We seek empirical answers to these fundamental issues, which are likely to be encountered first when applying deep learning models to Korean texts. As a starting point, we summarized these issues as three central research questions. First, which is more effective as the initial input of a deep learning model: morpheme vectors from grammatically correct texts of a domain other than the analysis target, or morpheme vectors from considerably ungrammatical texts of the same domain? Second, what is an appropriate morpheme vector derivation method for Korean with regard to the range of POS tags, homonyms, text preprocessing, and minimum frequency? Third, can we reach a satisfactory level of classification accuracy when applying deep learning to Korean sentiment analysis? To approach these research questions, we generate various types of morpheme vectors reflecting the questions and then compare the classification accuracy of a non-static CNN (Convolutional Neural Network) model that takes in the morpheme vectors. As training and test datasets, 17,260 cosmetics product reviews from Naver Shopping are used. To derive morpheme vectors, we use data from the same domain as the target and data from another domain: about 2 million cosmetics product reviews from Naver Shopping and 520,000 Naver News data, arguably corresponding to Google's News data. The six primary sets of morpheme vectors constructed in this study differ in terms of the following three criteria. First, they come from two types of data source: Naver News, of high grammatical correctness, and Naver Shopping's cosmetics product reviews, of low grammatical correctness. Second, they are distinguished by the degree of data preprocessing, namely, only splitting sentences, or additionally correcting spelling and spacing after sentence separation. Third, they vary in the form of the input fed into a word vector model: whether the morphemes are entered by themselves or with their POS tags attached. The morpheme vectors further vary depending on the range of POS tags considered, the minimum frequency of the morphemes included, and the random initialization range. All morpheme vectors are derived through the CBOW (Continuous Bag-Of-Words) model with a context window of 5 and a vector dimension of 300. It seems that utilizing text from the same domain even with a lower degree of grammatical correctness, performing spelling and spacing corrections as well as sentence splitting, and incorporating morphemes of any POS tags, including the incomprehensible category, lead to better classification accuracy. The POS tag attachment, which is devised for the high proportion of homonyms in Korean, and the minimum frequency standard for the morphemes to be included seem not to have any definite influence on the classification accuracy.

Detecting Spelling Errors by Comparison of Words within a Document (문서내 단어간 비교를 통한 철자오류 검출)

  • Kim, Dong-Joo
    • Journal of the Korea Society of Computer and Information / v.16 no.12 / pp.83-92 / 2011
  • Unlike in ordinary publications, typographical errors caused by the author's mistyping occur frequently in documents prepared with word processors. In preparing such online documents, the most common orthographical errors are spelling errors that result from striking a key adjacent to the intended one on the keyboard. Typical spelling checkers detect and correct these errors by using a morphological analyzer. In other words, the morphological analysis module of a speller checks the well-formedness of input words, and all words rejected by the analyzer are regarded as misspelled. However, if the morphological analyzer accepts a mistyped word, the checker treats it as correctly spelled. In this paper, I propose a simple method capable of detecting and correcting errors that previous methods cannot detect. The proposed method is based on the observation that typographical errors are generally not repeated and so tend to have very low frequency. If words generated by deletion, exchange, and transposition operations on each phoneme of a low-frequency word are in the list of high-frequency words, some of them are considered to be the correctly spelled words. Some heuristic rules are also presented to reduce the number of candidates. The proposed method is able to detect not syntactic errors but some semantic errors, and is useful for scoring candidates.
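
The within-document comparison described in this abstract can be sketched roughly as below. Several points are assumptions rather than the paper's method: the sketch edits Latin letters instead of Korean phonemes, reads "exchange" as substituting one character for another, uses made-up frequency thresholds `low` and `high`, and omits the heuristic rules for pruning candidates. The function names are hypothetical.

```python
from collections import Counter

ALPHABET = "abcdefghijklmnopqrstuvwxyz"


def edit_candidates(word):
    """Strings one deletion, substitution, or adjacent transposition away from word."""
    splits = [(word[:i], word[i:]) for i in range(len(word) + 1)]
    deletions = {a + b[1:] for a, b in splits if b}
    substitutions = {a + c + b[1:] for a, b in splits if b for c in ALPHABET}
    transpositions = {a + b[1] + b[0] + b[2:] for a, b in splits if len(b) > 1}
    return (deletions | substitutions | transpositions) - {word}


def detect_typos(tokens, low=1, high=2):
    """Map each low-frequency word to the high-frequency words one edit away from it."""
    counts = Counter(tokens)
    frequent = {word for word, n in counts.items() if n >= high}
    suspects = {}
    for word, n in counts.items():
        if n <= low:
            # Most frequent correction first; ties keep alphabetical order.
            matches = sorted(edit_candidates(word) & frequent)
            matches.sort(key=lambda candidate: -counts[candidate])
            if matches:
                suspects[word] = matches
    return suspects


text = "the model reads the text and the modle scores the text while the model runs"
suspects = detect_typos(text.split())
```

In this toy document 'modle' occurs once and is one transposition away from the repeated word 'model', so it is flagged even though a dictionary-free check has no other way to know it is wrong.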